mozai closed this 9 months ago
According to the cert-manager docs about ClusterIssuers, the selector configuration for a solver should be optional, so I don't expect the software to freak out if it's absent.
The selector, CertificateDNSNameSelector, can have up to three keys: matchLabels, dnsNames, and dnsZones, and the docs don't mention that wildcards like *.example.com can be used in dnsNames. I'll try using dnsZones: ['mozai.com'] instead.
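The difference between the two selector keys, as observed in this thread, can be sketched in a few lines of Go. This is a simplified illustration and not cert-manager's actual code: `matchesDNSNames` and `matchesDNSZones` are hypothetical names, and the sketch assumes dnsNames entries are compared as literal strings while dnsZones entries match the zone and its subdomains.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDNSNames reports whether name appears verbatim in the selector's
// dnsNames list. A wildcard entry is treated as a literal string, so
// "*.mozai.com" only matches a certificate name that is itself "*.mozai.com".
func matchesDNSNames(dnsNames []string, name string) bool {
	for _, n := range dnsNames {
		if n == name {
			return true
		}
	}
	return false
}

// matchesDNSZones reports whether name equals one of the zones or is a
// subdomain of one of them.
func matchesDNSZones(dnsZones []string, name string) bool {
	name = strings.TrimPrefix(name, "*.")
	for _, z := range dnsZones {
		if name == z || strings.HasSuffix(name, "."+z) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesDNSNames([]string{"*.mozai.com"}, "grafana.mozai.com"))
	fmt.Println(matchesDNSZones([]string{"mozai.com"}, "grafana.mozai.com"))
}
```

Under these assumptions the first call reports no match and the second reports a match, which is consistent with dnsZones being the key to use when one solver should cover several hostnames.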
... yes, cert-manager is now reporting
I1205 01:21:10.018369 1 dns.go:88] "cert-manager/challenges/Present: presenting DNS01 challenge for domain" resource_name="grafana-tls-1-3645741348-3450699660" resource_namespace="monitoring" resource_kind="Challenge" resource_version="v1" dnsName="grafana.mozai.com" type="DNS-01" resource_name="grafana-tls-1-3645741348-3450699660" resource_namespace="monitoring" resource_kind="Challenge" resource_version="v1" domain="grafana.mozai.com"
E1205 01:21:10.027946 1 controller.go:167] "cert-manager/challenges: re-queuing item due to error processing" err="the server is currently unable to handle the request (post godaddy.acme.mozai.com)" key="monitoring/grafana-tls-1-3645741348-3450699660"
But the godaddy-webhook pod is still crashing and dumping stack traces starting with
E1205 01:23:30.073065 1 runtime.go:77] Observed a panic: runtime error: index out of range [1] with length 1
Here is an example of a Certificate + Issuer that I'm using to renew the following *.apps.qshift + domain name => *.apps.qshift.snowdrop.dev successfully:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod-qshift-snowdrop-dev
  labels:
    app: qshift-ca-cert
  namespace: snowdrop-site
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email:
    privateKeySecretRef:
      name: letsencrypt-prod-snowdrop-dev
    solvers:
    - dns01:
        webhook:
          config:
            apiKeySecretRef:
              name: godaddy-api-key
              key: token
            production: true
            ttl: 600
          groupName: acme.mycompany.com
          solverName: godaddy
      selector:
        dnsNames:
        - '*.apps.qshift.snowdrop.dev'
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: qshift-snowdrop-dev
  labels:
    app: qshift-ca-cert
  namespace: snowdrop-site
spec:
  renewBefore: 2136h
  duration: 2190h
  privateKey:
    size: 2048
    algorithm: RSA
  issuerRef:
    kind: Issuer
    name: letsencrypt-prod-qshift-snowdrop-dev
  secretName: qshift-snowdrop-dev-tls
  dnsNames:
  - '*.apps.qshift.snowdrop.dev'
In your example, dnsNames in Issuer and Certificate are an exact match, so that Issuer will only work with one Certificate; it won't work with any other Certificate. In my bug report I said "grafana.mozai.com" doesn't get selected for dnsNames: ['*.mozai.com']
It seems odd to me to require (manually) making one Issuer for each Certificate, so I expected the wildcard in the selector to match multiple Certificates. I assumed incorrectly. But even when I deviate from the instructions in Readme.md to write a selector that does match... godaddy-webhook still crashes repeatedly with that "index out of range [1] with length 1"
Which version of k8s do you use to run cert-manager and godaddy-webhook?
current stable in Google Cloud Platform: 1.27
{Major:"1", Minor:"27", GitVersion:"v1.27.3-gke.100", GitCommit:"6466b51b762a5c49ae3fb6c2c7233ffe1c96e48c", GitTreeState:"clean", BuildDate:"2023-06-23T09:27:28Z", GoVersion:"go1.20.5 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}
We should then do different things:
index out of range [1] with length 1
as the log doesn't really help
(I'm on day three of the flu, but I have a deadline, so please excuse me if I'm rambling.)
If I'm reading the stack trace correctly (and it should be recent-to-oldest like a callstack), then the calls are:
Present() does call extractApiTokenFromSecret(), and checking the pointer addresses for *ChallengeRequest in both calls they're the same, so I will conclude the panic happens inside extractApiTokenFromSecret()
There's three places where an array offset operation takes place:
secBytes, ok := sec.Data[cfg.APIKeySecretRef.Key]
sec here is a kubernetes.Clientset.CoreV1 object, probably a map (aka associative array) not an array
token := strings.Split(string(secBytes), ":")
cfg.AuthAPIKey = token[0]
cfg.AuthAPISecret = token[1]
secBytes comes from sec.Data[cfg.APIKeySecretRef.Key] and sec comes from `kubernetes.Clientset.CoreV1.Secrets().Get(...,cfg.APIKeySecretRef.LocalObjectReference.Name,...)` so this is probably the k8s Secret that holds the Godaddy API key+secret.
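The panic message itself can be reproduced outside the webhook. The following is a minimal sketch, assuming the token is split on ":" and indexed exactly as in the lines quoted above; `secondPart` is a hypothetical helper written for this illustration and is not part of godaddy-webhook.

```go
package main

import (
	"fmt"
	"strings"
)

// secondPart mimics the token[1] access quoted above. It recovers from the
// resulting panic and returns it as an error so the message can be inspected.
// Hypothetical helper, for illustration only.
func secondPart(s string) (part string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	// strings.Split returns a one-element slice when the separator is absent.
	token := strings.Split(s, ":")
	return token[1], nil
}

func main() {
	fmt.Println(secondPart("key:secret"))
	fmt.Println(secondPart("no-colon-here"))
}
```

When the input has no ":", `strings.Split` yields a slice of length 1, so indexing element 1 is what raises "index out of range [1] with length 1".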
... mother of God.
$ kubectl get secret -n cert-manager godaddy-api-key -o yaml
data:
token: PEdPREFERFlfQVBJX0tFWV9IRVJFPg==
kind: Secret
$ base64 -d <<<'PEdPREFERFlfQVBJX0tFWV9IRVJFPg==';echo
<GODADDY_API_KEY_HERE>
OKAY! So it's crashing because there's no ":" in the .data.token part of the k8s Secret. Installed a proper "key:secret", no more crashes, and certs are getting validated.
This is now a bug report about poor handling of bad input data.
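One possible way to handle this input more gracefully is to validate the token before indexing into it. The sketch below is not the project's code: `parseAPIToken` and `errBadToken` are hypothetical names, and trimming surrounding whitespace is an extra assumption on my part.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errBadToken is returned when the Secret value is not "key:secret".
// Hypothetical name, for illustration only.
var errBadToken = errors.New(`token must have the form "key:secret"`)

// parseAPIToken splits the raw Secret value at the first ":" and returns an
// error instead of panicking when the separator or either half is missing.
func parseAPIToken(raw []byte) (string, string, error) {
	// Assumption: surrounding whitespace (e.g. a trailing newline) is not
	// meaningful and can be trimmed.
	key, secret, found := strings.Cut(strings.TrimSpace(string(raw)), ":")
	if !found || key == "" || secret == "" {
		return "", "", errBadToken
	}
	return key, secret, nil
}

func main() {
	for _, in := range []string{"key:secret", "<GODADDY_API_KEY_HERE>"} {
		key, secret, err := parseAPIToken([]byte(in))
		fmt.Println(key, secret, err)
	}
}
```

With a check like this, a placeholder value left in the Secret would surface as an ordinary error that can be logged, rather than as a stream of panics.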
So you fixed your problem ?
Should it be token: api_key:secret or just token: api_key?
As documented here - https://github.com/snowdrop/godaddy-webhook?tab=readme-ov-file#secret it should be api_key:api_secret
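If the value is written under the Secret's data field rather than stringData, Kubernetes expects it to be base64-encoded. A small sketch of producing that value from the two halves follows; `encodeToken` is a hypothetical helper and the credentials in `main` are placeholders.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeToken joins the key and secret as "key:secret" and base64-encodes the
// result, which is the form a value takes under a Secret's data field.
// Hypothetical helper, for illustration only.
func encodeToken(apiKey, apiSecret string) string {
	return base64.StdEncoding.EncodeToString([]byte(apiKey + ":" + apiSecret))
}

func main() {
	// Placeholder credentials, not real values.
	fmt.Println(encodeToken("api_key", "api_secret"))
}
```

Decoding the stored value, as done earlier in the thread with base64 -d, is a quick way to confirm that it really contains a ":".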
when i used api_key:api_secret i had this error:
daddywebhook: Observed a panic: runtime error: index out of range [1] with length 1
cert-manager: E0108 19:44:15.302561 1 controller.go:167] "cert-manager/challenges: re-queuing item due to error processing" err="the server is currently unable to handle the request (post godaddy.acme.mycompany.com)" key="default/wildcard-adeiz-com-tls-1-1087293611-828888654"
Server Version: v1.26.7
(updated)
Using: godaddy-webhook 0.3.0 and jetstack/cert-manager:1.13.2
When I use the ClusterIssuer defined in README.md, changing *.example.com for *.mozai.com and groupName: acme.mycompany.com for groupName: acme.mozai.com, cert-manager complains:
Thinking the selector block in the ClusterIssuer's only solver might be what's not matching, I removed it, and now I get a constant stream of golang panic stacktraces, starting with this error:
What is causing "no solvers can be used for this challenge"? Surely selector: { dnsNames: ['*.mozai.com'] } does match grafana.mozai.com, so is there somewhere else I should look? And should the software be crashing that hard and that frequently if the selector is absent?