caddyserver / certmagic

Automatic HTTPS for any Go program: fully-managed TLS certificate issuance and renewal
https://pkg.go.dev/github.com/caddyserver/certmagic?tab=doc
Apache License 2.0
5k stars 289 forks source link

Retain the error stack if `checkIfCertShouldBeObtained` returns an error #256

Closed ankon closed 11 months ago

ankon commented 11 months ago

This allows an outside caller of GetCertificate to use errors.As to check for their own response and react accordingly.

In our case: The ondemand decision func checks a blocklist and might return something like BlockedDomainError{message:"This is blocked"}. This isn't an "error", it's just something that so happens to look like an error because that's what the decision func wants us to return.

In the GetCertificate implementation we call certmagic, and check the error it returns. If the error is "expected", then we don't really care about it and might just count it. If it is something interesting, we might want to tell a human about it. The way that works is by having the GetCertificate function in a NewRelic "Transaction", and when we receive an error from certmagic we check whether the error is of one of the "normal" types, and if so call txn.NoticeExpectedError(err). If it is interesting, we call NoticeError(err).

This doesn't work for the errors from the decision func, because those lose their causal trace. The PR fixes that by using %w instead of %v to transmit the error outwards.

ankon commented 11 months ago

Example code illustrating what we do:

  1. Control-flow error type BlockedHost:
    
    type BlockedHost struct {
    name string
    }

func (bh *BlockedHost) Error() string { return fmt.Sprintf("%s is blocked", bh.name) }

2. Return that from the decision func, and check in GetCertificate:
```golang
    certmagicConfig.OnDemand = &certmagic.OnDemandConfig{
        DecisionContextFunc: func(ctx context.Context, name string) error {
            // ...              
            if (shouldBlockHost(name)) {
                return &BlockedHost{name}
            }
            // ...
        },
        // ...
    }

    tlsConfig.GetCertificate = func(clientHelloInfo *tls.ClientHelloInfo) (*tls.Certificate, error) {
        txn := newrelicApp.StartTransaction("TLS/GetCertificate")
        defer txn.End()
        // Add attributes related to the request
        txn.AddAttribute("hostname", clientHelloInfo.ServerName)

        // Get a certificate
        ctx := newrelic.NewContext(clientHelloInfo.Context(), txn)
        cert, err := certmagicConfig.GetCertificateWithContext(ctx, clientHelloInfo)
        if err != nil {
            var bhe *BlockedHost
            if errors.As(err, &bhe) {
                // Fine, happens once in while.
                txn.NoticeExpectedError(err)
            } else {
                // Something else!
                txn.NoticeError(err)
            }
        }
        // ...
    }
ankon commented 11 months ago

Will do it quickly.

FWIW: Instead of wrapping we could also just let the error get passed through completely with modification, which seems to be the case in most other places?

mholt commented 11 months ago

FWIW: Instead of wrapping we could also just let the error get passed through completely with modification, which seems to be the case in most other places?

Do you mean without modification? i.e. return err? That's definitely possible, the only downside is that we lose the natural "stack trace" that the error string gives us, like where in the code the errors are coming from.

Personally I'm happy with this as-is now, but am happy to approve another way if you'd prefer.

ankon commented 11 months ago

Do you mean without modification?

Indeed, I meant that. But, for now this is great, too -- there's always room to iterate further here if needed :)