snowdrop / godaddy-webhook

Cert Manager Godaddy Webhook performing ACME challenge using DNS record
Apache License 2.0

Error presenting challenge: the server could not find the requested resource #36

Closed morzan1001 closed 10 months ago

morzan1001 commented 10 months ago

Hello everyone,

I have a problem requesting a certificate for one of my domains. I have made the following configuration:

Certificate:

---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: <NAME>
  namespace: cert-manager
spec:
  secretName: <NAME>
  renewBefore: 240h
  issuerRef:
    name: letsencrypt-production
    kind: ClusterIssuer
  commonName: "<NAME>"
  dnsNames:
  - "<NAME>"
  - "<NAME>"

Issuer:

---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <EMAIL>
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
      - selector:
          dnsZones:
            - "<NAME>"
        dns01:
          webhook:
            config:
              apiKeySecretRef:
                name: godaddy-api-key
                key: token
              production: true
              ttl: 600
            groupName: acme.<NAME>
            solverName: godaddy

SecretKey:

---
apiVersion: v1
kind: Secret
metadata:
  name: godaddy-api-key
  namespace: cert-manager
type: Opaque
stringData:
  token: <TOKEN>

At first I got an error saying that a resource could not be created for the group I specified. The exact error looked like this:

Error presenting challenge: godaddy.acme.<NAME> is forbidden: User "system:serviceaccount:cert-manager:cert-manager" cannot create resource "godaddy" in API group "acme.<NAME>" at the cluster scope 

I then created a ClusterRole and a ClusterRoleBinding to grant the missing permissions. They look like this:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dns-challenge-missing-role
rules:
- apiGroups: ["acme.<NAME>"]
  resources: ["godaddy"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dns-challenge-missing-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dns-challenge-missing-role
subjects:
- kind: ServiceAccount
  name: cert-manager
  namespace: cert-manager

Now I get the following error and can't get any further:

Error presenting challenge: the server could not find the requested resource (post godaddy.acme.<NAME>) 

Maybe someone here has an idea what I have done wrong. I am grateful for any tips :)
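
The "could not find the requested resource" message can mean that the API server has no working aggregated API behind the solver's group, so it may be worth checking the APIService before adding more RBAC rules. The sketch below assumes the API version v1alpha1 (an assumption here, not stated in the configuration above) and uses hypothetical helper names. Kubernetes names APIService objects "<version>.<group>".

```shell
# Build the APIService object name for an aggregated API group.
# $1 = API group, $2 = version (assumed default: v1alpha1)
apiservice_name() {
  printf '%s.%s\n' "${2:-v1alpha1}" "$1"
}

# Print the message of the APIService's "Available" condition.
# Needs kubectl access to the cluster, so it is only defined here, not called.
# $1 = API group configured as groupName in the ClusterIssuer
check_webhook_apiservice() {
  kubectl get apiservice "$(apiservice_name "$1")" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'
}
```

For example, `check_webhook_apiservice acme.example.com` (with the real group name substituted) would show why the aggregated API is unavailable, if it is.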

cmoulliard commented 10 months ago

Did you see more information in the Order resource (https://cert-manager.io/docs/troubleshooting/acme/)? Which versions are you using of:

morzan1001 commented 10 months ago

I am using Kubernetes v1.27.7+k3s2 and cert-manager v1.13.2.

I took a closer look at the order and challenges and found the following output. I didn't see anything exciting here at first, except of course what was already in the first error message, but maybe I missed something?

kubectl -n cert-manager get issuer
NAME                       READY   AGE
godaddy-webhook-selfsign   True    31h
godaddy-webhook-ca         True    31h
kubectl get clusterissuer
NAME                     READY   AGE
letsencrypt-production   True    31h
kubectl describe clusterissuer letsencrypt-production
Name:         letsencrypt-production
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  cert-manager.io/v1
Kind:         ClusterIssuer
Metadata:
  Creation Timestamp:  2023-12-02T12:57:17Z
  Generation:          3
  Resource Version:    9711456
  UID:                <UUID>
Spec:
  Acme:
    Email:  <EMAIL>
    Private Key Secret Ref:
      Name:  letsencrypt-production
    Server:  https://acme-v02.api.letsencrypt.org/directory
    Solvers:
      dns01:
        Webhook:
          Config:
            API Key Secret Ref:
              Key:       token
              Name:      godaddy-api-key
            Production:  true
            Ttl:         600
          Group Name:    acme.mycompany.com
          Solver Name:   godaddy
      Selector:
        Dns Zones:
          <NAME>
Status:
  Acme:
    Last Private Key Hash:  <HASH>
    Last Registered Email:  <EMAIL>
    Uri:                    https://acme-v02.api.letsencrypt.org/acme/acct/1444635146
  Conditions:
    Last Transition Time:  2023-12-02T12:57:19Z
    Message:               The ACME account was registered with the ACME server
    Observed Generation:   3
    Reason:                ACMEAccountRegistered
    Status:                True
    Type:                  Ready
Events:                    <none>
kubectl -n cert-manager describe order <NAME>
Name:         <NAME>
Namespace:    cert-manager
Labels:       <none>
Annotations:  cert-manager.io/certificate-name: <NAME>
              cert-manager.io/certificate-revision: 1
              cert-manager.io/private-key-secret-name: <NAME>
API Version:  acme.cert-manager.io/v1
Kind:         Order
Metadata:
  Creation Timestamp:  2023-12-02T12:57:36Z
  Generation:          1
  Owner References:
    API Version:           cert-manager.io/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  CertificateRequest
    Name:                  <NAME>
    UID:                   <UID>
  Resource Version:        9700802
  UID:                     <UID>
Spec:
  Common Name:  <NAME>
  Dns Names:
    <NAME>
    <NAME>
  Issuer Ref:
    Kind:   ClusterIssuer
    Name:   letsencrypt-production
  Request:  <REQUESTSTRING>
Status:
  Authorizations:
    Challenges:
      Token:        <TOKEN>
      Type:         dns-01
      URL:          https://acme-v02.api.letsencrypt.org/acme/chall-v3/289667201276/3Ofw7Q
    Identifier:     <NAME>
    Initial State:  pending
    URL:            https://acme-v02.api.letsencrypt.org/acme/authz-v3/289667201276
    Wildcard:       true
    Challenges:
      Token:        <TOKEN>
      Type:         http-01
      URL:          https://acme-v02.api.letsencrypt.org/acme/chall-v3/289667201286/RimpnQ
      Token:        <TOKEN>
      Type:         dns-01
      URL:          https://acme-v02.api.letsencrypt.org/acme/chall-v3/289667201286/qrKBbA
      Token:        <TOKEN>
      Type:         tls-alpn-01
      URL:          https://acme-v02.api.letsencrypt.org/acme/chall-v3/289667201286/sf9IGA
    Identifier:     <NAME>
    Initial State:  pending
    URL:            https://acme-v02.api.letsencrypt.org/acme/authz-v3/289667201286
    Wildcard:       false
  Finalize URL:     https://acme-v02.api.letsencrypt.org/acme/finalize/1444635146/226351673966
  State:            pending
  URL:              https://acme-v02.api.letsencrypt.org/acme/order/1444635146/226351673966
Events:             <none>
kubectl -n cert-manager describe challenge <NAME>
Name:         <NAME>
Namespace:    cert-manager
Labels:       <none>
Annotations:  <none>
API Version:  acme.cert-manager.io/v1
Kind:         Challenge
Metadata:
  Creation Timestamp:  2023-12-02T13:31:42Z
  Finalizers:
    finalizer.acme.cert-manager.io
  Generation:  1
  Owner References:
    API Version:           acme.cert-manager.io/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Order
    Name:                  <NAME>
    UID:                   <UID>
  Resource Version:        9711490
  UID:                     <UID>
Spec:
  Authorization URL:  https://acme-v02.api.letsencrypt.org/acme/authz-v3/289667201276
  Dns Name:           <NAME>
  Issuer Ref:
    Kind:  ClusterIssuer
    Name:  letsencrypt-production
  Key:     <KEY>
  Solver:
    dns01:
      Webhook:
        Config:
          API Key Secret Ref:
            Key:       token
            Name:      godaddy-api-key
          Production:  true
          Ttl:         600
        Group Name:    acme.mycompany.com
        Solver Name:   godaddy
    Selector:
      Dns Zones:
        <NAME>
  Token:     <TOKEN>
  Type:      DNS-01
  URL:       https://acme-v02.api.letsencrypt.org/acme/chall-v3/289667201276/3Ofw7Q
  Wildcard:  true
Status:
  Presented:   false
  Processing:  true
  Reason:      the server is currently unable to handle the request (post godaddy.acme.mycompany.com)
  State:       pending
Events:
  Type     Reason        Age                 From                     Message
  ----     ------        ----                ----                     -------
  Warning  PresentError  14m (x70 over 30h)  cert-manager-challenges  Error presenting challenge: the server is currently unable to handle the request (post godaddy.acme.mycompany.com)
cmoulliard commented 10 months ago

kubectl -n cert-manager describe challenge

Can you please paste the YAML status reported by the following command?

kubectl -n cert-manager get challenge <NAME> -oyaml
morzan1001 commented 10 months ago
status:
  presented: false
  processing: true
  reason: the server is currently unable to handle the request (post godaddy.acme.mycompany.com)
  state: pending

I think this is the same output as in the last output I provided.

cmoulliard commented 10 months ago

Do you see a more interesting message within the log of the Godaddy webhook pod or cert manager pod?

morzan1001 commented 10 months ago

Yes, I think the API itself is not started correctly. The API service has the following status:

status:
  conditions:
    - lastTransitionTime: '2023-12-02T12:50:20Z'
      message: >-
        endpoints for service/godaddy-webhook in "cert-manager" have no
        addresses with port name "https"
      reason: MissingEndpoints
      status: 'False'
      type: Available

The pod itself does not seem to start correctly (?) In any case, I cannot view any logs from the pod:

kubectl -n cert-manager get pods
NAME                                     READY   STATUS             RESTARTS         AGE
cert-manager-cainjector-c86f8699-2f2sv   1/1     Running            0                44h
cert-manager-7d9dc8684c-m5kjh            1/1     Running            0                44h
cert-manager-7d9dc8684c-zcwkw            1/1     Running            0                44h
cert-manager-7d9dc8684c-p9f7q            1/1     Running            0                44h
cert-manager-webhook-f8f64cb85-hsjkb     1/1     Running            0                44h
godaddy-webhook-576c45df5f-cwcrl         0/1     CrashLoopBackOff   528 (103s ago)   44h
cmoulliard commented 10 months ago

So the webhook's pod cannot start, as apparently there is no service, or the service is badly configured, to expose the HTTPS port 443. Here is what you should see:

k get pods -A
NAMESPACE              NAME                                        READY   STATUS    RESTARTS       AGE
cert-manager           cert-manager-8484c66d67-b7fhj               1/1     Running   1 (42d ago)    76d
cert-manager           cert-manager-cainjector-54c9d9b775-9zhrm    1/1     Running   1 (42d ago)    87d
cert-manager           cert-manager-webhook-7f7469bdb7-m5b5z       1/1     Running   1 (42d ago)    87d
cert-manager           godaddy-webhook-c6b5f74fd-5w7q6             1/1     Running   1 (42d ago)    87d

k get endpoints -n cert-manager
NAME                   ENDPOINTS             AGE
cert-manager           10.244.61.149:9402    87d
cert-manager-webhook   10.244.61.150:10250   87d
godaddy-webhook        10.244.61.147:443     87d

k get secrets -n cert-manager
NAME                          TYPE                DATA   AGE
cert-manager-webhook-ca       Opaque              3      87d
godaddy-webhook-ca            kubernetes.io/tls   3      87d
godaddy-webhook-webhook-tls   kubernetes.io/tls   3      87d

k get svc -n cert-manager
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
cert-manager           ClusterIP   10.103.153.41   <none>        9402/TCP   87d
cert-manager-webhook   ClusterIP   10.111.63.241   <none>        443/TCP    87d
godaddy-webhook        ClusterIP   10.107.66.226   <none>        443/TCP    87d

Remark: I also suspect that your cert-manager-webhook service is not exposed correctly either, as it also uses the HTTPS port 443.

morzan1001 commented 10 months ago

You are right, it looks like the godaddy-webhook is not accessible via 443. But how do I solve this now?

kubectl get endpoints -n cert-manager
NAME                   ENDPOINTS                                         AGE
cert-manager           10.42.5.25:9402,10.42.8.18:9402,10.42.9.14:9402   45h
cert-manager-webhook   10.42.3.19:10250                                  45h
godaddy-webhook                                                          45h
kubectl get svc -n cert-manager
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
cert-manager-webhook   ClusterIP   10.43.226.12    <none>        443/TCP    45h
cert-manager           ClusterIP   10.43.71.104    <none>        9402/TCP   45h
godaddy-webhook        ClusterIP   10.43.205.193   <none>        443/TCP    45h
cmoulliard commented 10 months ago

Which error do you see (in its log or in the pod describe output) for your godaddy-webhook-576c45df5f-cwcrl pod?

morzan1001 commented 10 months ago

The log for the pod only says:

exec /usr/local/bin/webhook: exec format error

And with describe I get the following:

Name:             godaddy-webhook-576c45df5f-cwcrl
Namespace:        cert-manager
Priority:         0
Service Account:  godaddy-webhook
Node:             cluster-agent-6/<IP>
Start Time:       Sat, 02 Dec 2023 13:50:21 +0100
Labels:           app.kubernetes.io/instance=godaddy-webhook
                  app.kubernetes.io/name=godaddy-webhook
                  pod-template-hash=<HASH>
Annotations:      <none>
Status:           Running
IP:               10.42.10.24
IPs:
  IP:           10.42.10.24
Controlled By:  ReplicaSet/godaddy-webhook-576c45df5f
Containers:
  godaddy-webhook:
    Container ID:  containerd://b7b2dde9107c1cb99021df50a719681c8d6c5c7f6faa58dda7c5a8fd419c7712
    Image:         quay.io/snowdrop/cert-manager-webhook-godaddy:0.2.0
    Image ID:      quay.io/snowdrop/cert-manager-webhook-godaddy@sha256:bea93fd77c4c5507bdcbf3a60e9ec424b532d00846669f07dec53bbe8fd31b1e
    Port:          443/TCP
    Host Port:     0/TCP
    Args:
      --tls-cert-file=/tls/tls.crt
      --tls-private-key-file=/tls/tls.key
      --secure-port=443
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 04 Dec 2023 11:05:56 +0100
      Finished:     Mon, 04 Dec 2023 11:05:56 +0100
    Ready:          False
    Restart Count:  535
    Liveness:       http-get https://:https/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get https://:https/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      GROUP_NAME:         acme.mycompany.com
      LOGGING_LEVEL:      info
      LOGGING_FORMAT:     color
      LOGGING_TIMESTAMP:  false
    Mounts:
      /tls from certs (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2bc8f (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  godaddy-webhook-webhook-tls
    Optional:    false
  kube-api-access-2bc8f:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                      From     Message
  ----     ------   ----                     ----     -------
  Normal   Pulled   14m (x533 over 45h)      kubelet  Container image "quay.io/snowdrop/cert-manager-webhook-godaddy:0.2.0" already present on machine
  Warning  BackOff  4m20s (x13045 over 45h)  kubelet  Back-off restarting failed container godaddy-webhook in pod godaddy-webhook-576c45df5f-cwcrl_cert-manager(df545dbb-cbe4-4380-bb79-19695d3ce4e5)
cmoulliard commented 10 months ago

exec /usr/local/bin/webhook: exec format error

Such an error is typical of a mismatch between the architecture the Go application was compiled for and the platform where it runs - https://stackoverflow.com/questions/50785784/standard-init-linux-go178-exec-user-process-caused-exec-format-error-kuberne. What is the architecture of your k8s cluster (x86, ARM, etc.)?
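
One way to answer the architecture question without logging in to each node is to ask the API server what every kubelet reports. This is a rough sketch with hypothetical helper names; the mapping covers only a few common `uname -m` values.

```shell
# Map a machine name as printed by `uname -m` to the architecture name
# used in container image platforms (linux/<arch>).
oci_arch() {
  case "$1" in
    x86_64|amd64)        echo amd64 ;;
    aarch64|arm64)       echo arm64 ;;
    armv7l|armv7|armv6l) echo arm ;;
    *)                   echo unknown ;;
  esac
}

# List each node with the architecture reported by its kubelet.
# Needs cluster access, so it is only defined here, not called.
list_node_archs() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,ARCH:.status.nodeInfo.architecture'
}
```

If the node architecture is not among the platforms the image was built for, the container fails with "exec format error" as soon as its entrypoint is executed.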

morzan1001 commented 10 months ago

The cluster is running on Raspberry PIs -> Architecture is ARM

cmoulliard commented 10 months ago

The cluster is running on Raspberry PIs -> Architecture is ARM

So we know why the godaddy webhook fails to start. I will check if I can change the existing Dockerfile to release it for the ARM platform ;-)

Remark: This post should help us: https://stackoverflow.com/questions/71372947/is-it-possible-to-manually-build-multi-arch-docker-image-without-docker-buildx
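
For comparison, a common alternative to the manual approach discussed in that post is `docker buildx`, which builds and pushes one image for several platforms in a single step. The sketch below is not necessarily what this project ended up doing; the helper names are hypothetical, and it assumes a buildx builder that supports the requested platforms is already set up.

```shell
# Join architecture names into a value for buildx --platform,
# e.g. "amd64 arm64" becomes "linux/amd64,linux/arm64".
platform_list() {
  out=""
  for arch in "$@"; do
    out="${out:+$out,}linux/$arch"
  done
  printf '%s\n' "$out"
}

# Build and push a multi-arch image from the Dockerfile in the current directory.
# $1 = image reference (placeholder), remaining arguments = architectures.
# Needs Docker with buildx and registry credentials, so it is not called here.
build_multiarch() {
  image="$1"; shift
  docker buildx build --platform "$(platform_list "$@")" -t "$image" --push .
}
```

A call such as `build_multiarch registry.example.com/webhook:test amd64 arm64` would then publish a manifest list covering both architectures.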

morzan1001 commented 10 months ago

Thank you very much :+1: 🥇

cmoulliard commented 10 months ago

Raspberry PIs

Something else: which Pi version do you use, and how much RAM did you set aside to run your k8s cluster?

morzan1001 commented 10 months ago

I use Raspberry Pi 4 with 8 GB RAM, but looking at the workload, Pis with 4 GB RAM would probably be enough. So far everything is pretty stable and the installation was easy with K3s.

cmoulliard commented 10 months ago

I created a PR to build a multi-arch image - https://github.com/snowdrop/godaddy-webhook/pull/37 @morzan1001

cmoulliard commented 10 months ago

Can you please test using this tagged image: quay.io/snowdrop/cert-manager-webhook-godaddy:pr-37-4b7cbaa@sha256:4bcdaa0f184e904e4c9431bc4b9e70b587e9bc49919ab6f8a2c6211f5d46a703 ? @morzan1001
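
To try an image pinned by digest like this one, the image of the webhook Deployment can be swapped in place. This is a minimal sketch, assuming the Deployment and its container are both named godaddy-webhook (inferred from the pod description earlier in the thread, not confirmed); the helper names are hypothetical.

```shell
# Extract the digest part ("sha256:...") from an image reference that is
# pinned by digest; fails if the reference carries no digest.
image_digest() {
  case "$1" in
    *@*) printf '%s\n' "${1##*@}" ;;
    *)   return 1 ;;
  esac
}

# Point the webhook Deployment at a new image.
# $1 = image reference
# Needs cluster access, so it is only defined here, not called.
set_webhook_image() {
  kubectl -n cert-manager set image deployment/godaddy-webhook \
    godaddy-webhook="$1"
}
```

After the new pod starts, the Image ID shown by `kubectl describe pod` should end with the digest that `image_digest` prints for the reference.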

morzan1001 commented 10 months ago

Things are getting better :D

But it doesn't work yet. The container now starts and can also register an API, but it is still unhealthy. I can now see the following logs:

kubectl -n cert-manager logs godaddy-webhook-64cfdcd4f4-d4p4z
I1204 15:28:40.852556       1 handler.go:232] Adding GroupVersion acme.mycompany.com v1alpha1 to ResourceManager
I1204 15:28:40.864801       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1204 15:28:40.864847       1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I1204 15:28:40.864976       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1204 15:28:40.865016       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1204 15:28:40.865073       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1204 15:28:40.865092       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1204 15:28:40.866590       1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::/tls/tls.crt::/tls/tls.key"
I1204 15:28:40.867887       1 secure_serving.go:210] Serving securely on [::]:443
I1204 15:28:40.867985       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I1204 15:28:40.965192       1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
I1204 15:28:40.965243       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1204 15:28:40.965190       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
kubectl -n cert-manager describe pods
[...]
Events:
  Type     Reason     Age    From               Message
  ----     ------     ----   ----               -------
  Normal   Scheduled  8m6s   default-scheduler  Successfully assigned cert-manager/godaddy-webhook-64cfdcd4f4-d4p4z to cluster-agent-6
  Normal   Pulling    8m4s   kubelet            Pulling image "quay.io/snowdrop/cert-manager-webhook-godaddy:0.3.0"
  Normal   Pulled     7m57s  kubelet            Successfully pulled image "quay.io/snowdrop/cert-manager-webhook-godaddy:0.3.0" in 7.446531316s (7.446565483s including waiting)
  Normal   Created    7m57s  kubelet            Created container godaddy-webhook
  Normal   Started    7m56s  kubelet            Started container godaddy-webhook
  Warning  Unhealthy  7m55s  kubelet            Readiness probe failed: Get "https://10.42.10.28:443/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  7m54s  kubelet            Readiness probe failed: Get "https://10.42.10.28:443/healthz": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
kubectl get endpoints -n cert-manager      
NAME                   ENDPOINTS                                         AGE
cert-manager           10.42.5.25:9402,10.42.8.18:9402,10.42.9.14:9402   2d3h
cert-manager-webhook   10.42.3.19:10250                                  2d3h
godaddy-webhook        10.42.10.28:443                                   16m
kubectl -n cert-manager get challenge <NAME> -oyaml
[...]
status:
  presented: false
  processing: true
  reason: the server is currently unable to handle the request (post godaddy.acme.mycompany.com)
  state: pending
morzan1001 commented 10 months ago

OK, I don't think it's an architecture problem anymore. I just looked at my DNS records and an _acme-challenge TXT record was created.

morzan1001 commented 10 months ago

It's working for me now; it just needed a bit more patience :)
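
Since the fix here was mostly waiting, a small polling loop can replace checking by hand. The sketch below uses hypothetical helper names and reads the `presented` field of the Challenge status shown earlier in the thread; the attempt count and delay are arbitrary.

```shell
# Run a command until it succeeds, at most $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Succeeds once the Challenge reports status.presented = true.
# $1 = challenge name; needs cluster access, so it is not called here.
challenge_presented() {
  [ "$(kubectl -n cert-manager get challenge "$1" \
        -o jsonpath='{.status.presented}')" = "true" ]
}
```

For example, `retry 30 10 challenge_presented <NAME>` would poll every 10 seconds for up to about five minutes.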

cmoulliard commented 10 months ago

Can we then close this ticket as resolved? @morzan1001

morzan1001 commented 10 months ago

Yes, I thought you wanted to close the issue together with the pull request. Feel free to close the issue :+1:

cmoulliard commented 10 months ago

Fixed with release 0.3.0. See PR #37.