aureq / cert-manager-webhook-ovh

OVH Webhook for Cert Manager
https://aureq.github.io/cert-manager-webhook-ovh/
Apache License 2.0

❓️Migrating from `baarde/cert-manager-webhook-ovh` 0.3.0 to `aureq/cert-manager-webhook-ovh` 0.4.2 #29

Closed xakraz closed 1 year ago

xakraz commented 1 year ago

What happened?

Overview

We have just migrated our deployment from baarde/cert-manager-webhook-ovh 0.3.0 to aureq/cert-manager-webhook-ovh 0.4.2.

Our setup had been working fine with baarde/cert-manager-webhook-ovh 0.3.0 😄

Today, several certificates should have been renewed. However, that is not the case and I wonder how we can debug/troubleshoot the situation.

Details

Certificate Status

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  annotations:
 ...
  name: CERT_NAME
  namespace: *******
...
spec:
  ...
status:
  conditions:
  - lastTransitionTime: "2022-09-02T11:23:50Z"
    message: Certificate is up to date and has not expired
    observedGeneration: 1
    reason: Ready
    status: "True"
    type: Ready
  - lastTransitionTime: "2023-09-26T13:21:19Z"
    message: Renewing certificate as renewal was scheduled at 2023-09-26 13:21:19
      +0000 UTC
    observedGeneration: 1
    reason: Renewing
    status: "True"
    type: Issuing
  nextPrivateKeySecretName: *****-tls-26xdd
  notAfter: "2023-10-26T13:21:19Z"
  notBefore: "2023-07-28T13:21:20Z"
  renewalTime: "2023-09-26T13:21:19Z"
  revision: 6

Certificate request

apiVersion: cert-manager.io/v1
kind: CertificateRequest
metadata:
...
status:
  conditions:
  - lastTransitionTime: "2023-09-26T13:21:19Z"
    message: Certificate request has been approved by cert-manager.io
    reason: cert-manager.io
    status: "True"
    type: Approved
  - lastTransitionTime: "2023-09-26T13:21:19Z"
    message: 'Waiting on certificate issuance from order ****/****-tls-vjjvt-1840340888:
      "pending"'
    reason: Pending
    status: "False"
    type: Ready

Order

apiVersion: acme.cert-manager.io/v1
kind: Order
...
status:
  authorizations:
  - challenges:
    - token: *********************
      type: dns-01
      url: https://acme-v02.api.letsencrypt.org/acme/chall-v3/268160725656/fcUzsQ
    identifier: DOMAIN_NAME
    initialState: pending
    url: https://acme-v02.api.letsencrypt.org/acme/authz-v3/268160725656
    wildcard: true
  - challenges:
    - token: *********************
      type: http-01
      url: https://acme-v02.api.letsencrypt.org/acme/chall-v3/268160725666/Uau2uA
    - token: *********************
      type: dns-01
      url: https://acme-v02.api.letsencrypt.org/acme/chall-v3/268160725666/F1ZFNQ
    - token: *********************
      type: tls-alpn-01
      url: https://acme-v02.api.letsencrypt.org/acme/chall-v3/268160725666/w1QZrw
    identifier: DOMAIN_NAME
    initialState: pending
    url: https://acme-v02.api.letsencrypt.org/acme/authz-v3/268160725666
    wildcard: false
  finalizeURL: https://acme-v02.api.letsencrypt.org/acme/finalize/712492067/211011220836
  state: pending
  url: https://acme-v02.api.letsencrypt.org/acme/order/712492067/211011220836

Logs

The cert-manager-webhook-ovh pod emits a lot of log lines like these:

cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh I0926 21:03:47.827327       1 main.go:120] cert-manager "msg"="Starting challenge request..." 
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh I0926 21:03:47.827369       1 main.go:126] cert-manager "msg"="Resource namespace: cert-manager" 
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh I0926 21:03:47.827378       1 main.go:96] cert-manager "msg"="Validating provider config..." 
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh W0926 21:03:55.789270       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: failed to list *v1beta3.FlowSchema: the server could not find the requested resource
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh E0926 21:03:55.789329       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: Failed to watch *v1beta3.FlowSchema: failed to list *v1beta3.FlowSchema: the server could not find the requested resource
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh W0926 21:04:04.196499       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: failed to list *v1beta3.PriorityLevelConfiguration: the server could not find the requested resource
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh E0926 21:04:04.196556       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: Failed to watch *v1beta3.PriorityLevelConfiguration: failed to list *v1beta3.PriorityLevelConfiguration: the server could not find the requested resource
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh I0926 21:04:29.017214       1 main.go:120] cert-manager "msg"="Starting challenge request..." 
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh I0926 21:04:29.017279       1 main.go:126] cert-manager "msg"="Resource namespace: cert-manager" 
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh I0926 21:04:29.017286       1 main.go:96] cert-manager "msg"="Validating provider config..." 
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh W0926 21:04:50.221616       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: failed to list *v1beta3.FlowSchema: the server could not find the requested resource
cert-manager-webhook-ovh-b5554d756-khlj5 cert-manager-webhook-ovh E0926 21:04:50.221826       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.0/tools/cache/reflector.go:169: Failed to watch *v1beta3.FlowSchema: failed to list *v1beta3.FlowSchema: the server could not find the requested resource

Expected Behavior

The certificate should be renewed successfully.

Steps to reproduce

Deployment is done through kustomize, using its Helm chart inflator (the helmCharts field):

# kustomization.yaml
....

helmCharts:
- name: cert-manager-webhook-ovh
  repo: https://aureq.github.io/cert-manager-webhook-ovh/
  version: 0.4.2
  releaseName: cert-manager-webhook-ovh
  valuesFile: config/values.yaml
  namespace: cert-manager-webhook-ovh

...
# values.yaml
groupName: DOMAIN_NAME

certManager:
  namespace: cert-manager
  serviceAccountName: cert-manager

service:
  type: ClusterIP
  port: 443

resources:
  limits:
   cpu: 100m
   memory: 128Mi
  requests:
   cpu: 100m
   memory: 128Mi

nodeSelector: {}
tolerations: []
affinity: {}

We have the RBAC setup required by baarde/cert-manager-webhook-ovh 0.3.0, plus a custom ClusterIssuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-dns-ovh
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: OBFUSCATED_EMAIL
    privateKeySecretRef:
      name: letsencrypt-prod-dns-ovh
    solvers:
    - dns01:
        webhook:
          groupName: DOMAIN_NAME
          solverName: ovh
          config:
            endpoint: ovh-eu
            applicationKey: *******
            applicationSecretRef:
              key: applicationSecret
              name: ovh-credentials
            consumerKey: **********

Versions in use

Cert-manager-webhook-ovh

0.4.2

Kubernetes version

Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.13-gke.200", GitCommit:"f24d4fe14dcec505c2b37cc1e5b7024d971f6360", GitTreeState:"clean", BuildDate:"2023-08-25T09:26:18Z", GoVersion:"go1.20.7 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}

Additional context

No response


aureq commented 1 year ago

Hi @xakraz

Thanks for the detailed report and for providing the values.yaml.

You may not have noticed, but values.yaml has changed significantly since then, so the settings written for 0.3.0 cannot be reused as-is with 0.4.2.

My recommendations are:

I would also recommend reading the CHANGELOG.md for more details on the changes.

Side note: values.yaml will change again quite a bit until I reach a more stable 1.0.0.

xakraz commented 1 year ago

Hi @aureq

Thank you for your quick reply 🙏

I found out and solved my issues yesterday evening. I will post a more detailed comment about it later today.

Many thanks again for the time you spend on that project.

xakraz commented 1 year ago

Updates regarding the issue

1 - Logs issue

✅ The error messages reported in the logs were fixed by upgrading cert-manager-webhook-ovh to 0.5.0.

2 - Certificate renewal issue

While troubleshooting the Certificate renewal chain, I found that the Challenge resource reported errors: it complained that it could not get the applicationKey (not found).

✅ The fix was to use the secretRef format for all three credential properties (applicationKey, applicationSecret, and consumerKey).

As you may or may not have noticed, we had a mixed syntax in our ClusterIssuer spec: only the applicationSecret used the secretRef format. The other two properties were defined inline in plain text, which worked with baarde/cert-manager-webhook-ovh 0.3.0 😅
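For anyone hitting the same problem, here is a rough sketch of what the all-secretRef layout can look like. Treat it as an illustration rather than the exact manifest we deployed: the field names applicationKeyRef and consumerKeyRef are assumptions modeled on the existing applicationSecretRef, the Secret keys applicationKey and consumerKey are names chosen for this example, and placing the Secret in the cert-manager namespace is a guess based on the "Resource namespace: cert-manager" log line above. Check the chart's README and CHANGELOG.md for the version you run.

```yaml
# Sketch only. applicationKeyRef and consumerKeyRef are assumed field names,
# modeled on the applicationSecretRef field from the original spec.
apiVersion: v1
kind: Secret
metadata:
  name: ovh-credentials
  # Assumption: the webhook resolves secrets for a ClusterIssuer in the
  # cert-manager namespace, as the "Resource namespace" log line suggests.
  namespace: cert-manager
type: Opaque
stringData:
  applicationKey: "<application key>"        # placeholder value
  applicationSecret: "<application secret>"  # placeholder value
  consumerKey: "<consumer key>"              # placeholder value
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-dns-ovh
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: OBFUSCATED_EMAIL
    privateKeySecretRef:
      name: letsencrypt-prod-dns-ovh
    solvers:
    - dns01:
        webhook:
          groupName: DOMAIN_NAME
          solverName: ovh
          config:
            endpoint: ovh-eu
            # All three credentials now come from the Secret above;
            # none of them is written inline any more.
            applicationKeyRef:
              name: ovh-credentials
              key: applicationKey
            applicationSecretRef:
              name: ovh-credentials
              key: applicationSecret
            consumerKeyRef:
              name: ovh-credentials
              key: consumerKey
```

Keeping all three values in one Secret means the credentials can be rotated without touching the ClusterIssuer itself.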

Thanks again @aureq for your time and support 🙏🏻