snowdrop / godaddy-webhook

Cert Manager Godaddy Webhook performing ACME challenge using DNS record
Apache License 2.0

Observed a panic: runtime error: index out of range [1] with length 1 #38

Closed (mozai closed this issue 9 months ago)

mozai commented 10 months ago

Using: godaddy-webhook 0.3.0 and jetstack/cert-manager:1.13.2

When I use the ClusterIssuer defined in README.md, changing *.example.com to *.mozai.com and groupName: acme.mycompany.com to groupName: acme.mozai.com, cert-manager complains:

"cert-manager/orders: Failed to determine the list of Challenge resources needed for the Order" err="no configured challenge solvers can be used for this challenge" resource_name="grafana-tls-1-3645741348" resource_namespace="monitoring" resource_kind="Order" resource_version="v1"

Thinking the selector block in the ClusterIssuer's only solver might be what's not matching, I removed it. Now I get a constant stream of Go panic stack traces, starting with this error:

runtime.go:77] Observed a panic: runtime error: index out of range [1] with length 1
goroutine 2548 [running]:
k8s.io/apiserver/pkg/endpoints/handlers/finisher.finishRequest.func1.1()
    /go/pkg/mod/k8s.io/apiserver@v0.27.4/pkg/endpoints/handlers/finisher/finisher.go:105 +0xa5

What is causing "no configured challenge solvers can be used for this challenge"? Surely selector: { dnsNames: ['*.mozai.com'] } does match grafana.mozai.com, so is there somewhere else I should look? And should the software be crashing that hard and that frequently if the selector is absent?

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-godaddy-prod
spec:
  acme:
    email: hostmaster.mozai.com  # dont post email addresses to webpages
    privateKeySecretRef:
      name: letsencrypt-godaddy-prod
    server: https://acme-v02.api.letsencrypt.org/directory
    solvers:
    - dns01:
        webhook:
          config:
            apiKeySecretRef:
              name: godaddy-api-key
              key: token
            production: true
            ttl: 600
          groupName: acme.mozai.com
          solverName: godaddy
      selector: {dnsNames: ['*.mozai.com']}

mozai commented 10 months ago

According to the cert-manager docs about ClusterIssuers, the selector configuration for a solver should be optional, so I don't expect the software to freak out if it's absent.
The selector (a CertificateDNSNameSelector) can have up to three keys: matchLabels, dnsNames, and dnsZones. The docs don't mention that wildcards like *.example.com can be used in dnsNames, so I'll try using selector: {dnsZones: ['mozai.com']} instead.

... yes, cert-manager is now reporting

I1205 01:21:10.018369 1 dns.go:88] "cert-manager/challenges/Present: presenting DNS01 challenge for domain" resource_name="grafana-tls-1-3645741348-3450699660" resource_namespace="monitoring" resource_kind="Challenge" resource_version="v1" dnsName="grafana.mozai.com" type="DNS-01" resource_name="grafana-tls-1-3645741348-3450699660" resource_namespace="monitoring" resource_kind="Challenge" resource_version="v1" domain="grafana.mozai.com"
E1205 01:21:10.027946 1 controller.go:167] "cert-manager/challenges: re-queuing item due to error processing" err="the server is currently unable to handle the request (post godaddy.acme.mozai.com)" key="monitoring/grafana-tls-1-3645741348-3450699660"

But the godaddy-webhook pod is still crashing and dumping stack traces starting with

E1205 01:23:30.073065 1 runtime.go:77] Observed a panic: runtime error: index out of range [1] with length 1

cmoulliard commented 10 months ago

Here is an example of a Certificate + Issuer that I'm using to successfully renew the wildcard *.apps.qshift under my domain name, i.e. *.apps.qshift.snowdrop.dev:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod-qshift-snowdrop-dev
  labels:
    app: qshift-ca-cert
  namespace: snowdrop-site
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email:
    privateKeySecretRef:
      name: letsencrypt-prod-snowdrop-dev
    solvers:
      - dns01:
          webhook:
            config:
              apiKeySecretRef:
                name: godaddy-api-key
                key: token
              production: true
              ttl: 600
            groupName: acme.mycompany.com
            solverName: godaddy
        selector:
          dnsNames:
            - '*.apps.qshift.snowdrop.dev'
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: qshift-snowdrop-dev
  labels:
    app: qshift-ca-cert
  namespace: snowdrop-site
spec:
  renewBefore: 2136h
  duration: 2190h
  privateKey:
    size: 2048
    algorithm: RSA
  issuerRef:
    kind: Issuer
    name: letsencrypt-prod-qshift-snowdrop-dev
  secretName: qshift-snowdrop-dev-tls
  dnsNames:
    - '*.apps.qshift.snowdrop.dev'

mozai commented 10 months ago

In your example, dnsNames in the Issuer and the Certificate are an exact match, so that Issuer will only work with that one Certificate and not with any other. In my bug report I said that "grafana.mozai.com" doesn't get selected for dnsNames: ['*.mozai.com'].

It seems odd to me to require (manually) making one Issuer for each Certificate, so I expected the wildcard in the selector to match multiple Certificates. I assumed incorrectly. But even when I deviate from the instructions in README.md to write a selector that does match, godaddy-webhook still crashes repeatedly with that "index out of range [1] with length 1".
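
For readers following the selector discussion, here is a minimal Go sketch of the behaviour described above. It assumes that dnsNames entries are compared as exact strings and that dnsZones entries match the zone itself plus any subdomain. The function names matchesDNSNames and matchesDNSZones are hypothetical, the stripping of a leading "*." is a simplification, and none of this is cert-manager's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDNSNames (hypothetical helper) reports whether name appears verbatim
// in the selector's dnsNames list. Entries are compared as plain strings, so
// "*.mozai.com" only equals a requested name that is literally "*.mozai.com".
func matchesDNSNames(dnsNames []string, name string) bool {
	for _, n := range dnsNames {
		if n == name {
			return true
		}
	}
	return false
}

// matchesDNSZones (hypothetical helper) reports whether name is one of the
// listed zones or a subdomain of one. A leading "*." is stripped first; that
// is an assumption made for this illustration.
func matchesDNSZones(dnsZones []string, name string) bool {
	name = strings.TrimPrefix(name, "*.")
	for _, z := range dnsZones {
		if name == z || strings.HasSuffix(name, "."+z) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesDNSNames([]string{"*.mozai.com"}, "grafana.mozai.com")) // false
	fmt.Println(matchesDNSNames([]string{"*.mozai.com"}, "*.mozai.com"))       // true
	fmt.Println(matchesDNSZones([]string{"mozai.com"}, "grafana.mozai.com"))   // true
}
```

Under these assumptions, one Issuer with a dnsZones selector can serve many Certificates in the same zone, while a dnsNames selector has to list each requested name.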

mozai commented 10 months ago

issuer-certificate-pod.txt godaddy-webhook.log

cmoulliard commented 10 months ago

Which version of k8s do you use to run cert-manager and godaddy-webhook?

mozai commented 10 months ago

current stable in Google Cloud Platform: 1.27 {Major:"1", Minor:"27", GitVersion:"v1.27.3-gke.100", GitCommit:"6466b51b762a5c49ae3fb6c2c7233ffe1c96e48c", GitTreeState:"clean", BuildDate:"2023-06-23T09:27:28Z", GoVersion:"go1.20.5 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}

cmoulliard commented 10 months ago

We should then do different things:

mozai commented 10 months ago

(I'm on day three of the flu, but I have a deadline, so please excuse if I'm rambling.)

If I'm reading the stack trace correctly (and it should be recent-to-oldest like a callstack), then the calls are:

Present() does call extractApiTokenFromSecret(), and checking the pointer addresses for *ChallengeRequest in both calls they're the same, so I will conclude the panic happens inside extractApiTokenFromSecret()

There's three places where an array offset operation takes place:

... mother of God.

$ kubectl get secret -n cert-manager godaddy-api-key -o yaml
data:
  token: PEdPREFERFlfQVBJX0tFWV9IRVJFPg==
kind: Secret
$ base64 -d <<<'PEdPREFERFlfQVBJX0tFWV9IRVJFPg==';echo
<GODADDY_API_KEY_HERE>

OKAY! So it's crashing because there's no ":" in the .data.token part of the k8s Secret. Installed a proper "key:secret", no more crashes, and certs are getting validated.

This is now a bug report about poor handling of bad input data.
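
The panic message is consistent with code that splits the token on ":" and then reads element [1] without checking the length of the result. As a sketch of the more defensive handling this report asks for, the Go snippet below validates the "key:secret" shape and returns an error instead of panicking. parseToken is a hypothetical helper, not the webhook's actual function, and the error wording is made up for illustration.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// parseToken (hypothetical helper) splits a token of the form "key:secret".
// It returns an error when the separator or either half is missing, rather
// than indexing into a split result that may be too short.
func parseToken(token string) (string, string, error) {
	key, secret, found := strings.Cut(strings.TrimSpace(token), ":")
	if !found || key == "" || secret == "" {
		return "", "", errors.New(`token must have the form "key:secret"`)
	}
	return key, secret, nil
}

func main() {
	// The .data.token value from the Secret shown above (the README placeholder).
	raw, err := base64.StdEncoding.DecodeString("PEdPREFERFlfQVBJX0tFWV9IRVJFPg==")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if _, _, err := parseToken(string(raw)); err != nil {
		fmt.Println("rejected placeholder token:", err)
	}

	// A well-formed token; both values are made-up examples.
	key, secret, err := parseToken("examplekey:examplesecret")
	fmt.Println(key, secret, err)
}
```

With a check like this, a malformed Secret would surface as a single readable error on the Challenge instead of a stream of stack traces.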

cmoulliard commented 10 months ago

> OKAY! So it's crashing because there's no ":" in the .data.token part of the k8s Secret. Installed a proper "key:secret", no more crashes, and certs are getting validated.

So you fixed your problem?

Siradjedd commented 9 months ago

Should it be token: api_key:secret or just token: api_key?

cmoulliard commented 9 months ago

> Should it be token: api_key:secret or just token: api_key?

As documented at https://github.com/snowdrop/godaddy-webhook?tab=readme-ov-file#secret, it should be api_key:api_secret.

Siradjedd commented 9 months ago

When I used api_key:api_secret I got this error:

godaddy-webhook: Observed a panic: runtime error: index out of range [1] with length 1

cert-manager: E0108 19:44:15.302561 1 controller.go:167] "cert-manager/challenges: re-queuing item due to error processing" err="the server is currently unable to handle the request (post godaddy.acme.mycompany.com)" key="default/wildcard-adeiz-com-tls-1-1087293611-828888654"

Server Version: v1.26.7 (updated)