redhat-cop / cert-utils-operator

Set of functionalities around certificates packaged in a Kubernetes operator
Apache License 2.0

unable to update route error #128

Closed tchellomello closed 2 years ago

tchellomello commented 2 years ago

I'm testing the cert-utils-operator on OpenShift 4.9.21 and I'm hitting the error below:

[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-28T18:43:58.465Z  ERROR   controllers.route_certificate_controller    unable to update route  {"route-certificate": "default/wildcard-https", "route": {"apiVersion": "route.openshift.io/v1", "kind": "Route", "namespace": "default", "name": "wildcard-https"}, "error": "Operation cannot be fulfilled on routes.route.openshift.io \"wildcard-https\": the object has been modified; please apply your changes to the latest version and try again"} 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] github.com/go-logr/zapr.(*zapLogger).Error 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] github.com/redhat-cop/cert-utils-operator/controllers/route.(*RouteCertificateReconciler).Reconcile 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/work/cert-utils-operator/cert-utils-operator/controllers/route/route_controller.go:213 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:298 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:253 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:216 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:185 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:155 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.BackoffUntil 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:156 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.JitterUntil 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:133 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:185 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.UntilWithContext 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:99 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-28T18:43:58.465Z  DEBUG   util.api    object is not ConditionsAware, not setting status 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-28T18:43:58.465Z  ERROR   controller-runtime.manager.controller.route Reconciler error    {"reconciler group": "route.openshift.io", "reconciler kind": "Route", "name": "wildcard-https", "namespace": "default", "error": "Operation cannot be fulfilled on routes.route.openshift.io \"wildcard-https\": the object has been modified; please apply your changes to the latest version and try again"} 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] github.com/go-logr/zapr.(*zapLogger).Error 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:302 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:253 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:216 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:185 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:155 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.BackoffUntil 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:156 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.JitterUntil 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:133 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:185 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] k8s.io/apimachinery/pkg/util/wait.UntilWithContext 
[cert-utils-operator-controller-manager-644575f449-ff67v manager]   /home/runner/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:99 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-28T18:43:58.465Z  DEBUG   controller-runtime.manager.events   Warning {"object": {"kind":"Route","namespace":"default","name":"wildcard-https","uid":"8b51b5c7-7a20-4585-bac0-8bef898de398","apiVersion":"route.openshift.io/v1","resourceVersion":"1174098"}, "reason": "ProcessingError", "message": "Operation cannot be fulfilled on routes.route.openshift.io \"wildcard-https\": the object has been modified; please apply your changes to the latest version and try again"} 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-28T18:43:58.519Z  DEBUG   util.api    object is not ConditionsAware, not setting status 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-28T18:43:58.519Z  DEBUG   util.api    object is not ConditionsAware, not setting status 

Some of the artifacts that I'm using:

custom domain

---
apiVersion: managed.openshift.io/v1alpha1
kind: CustomDomain
metadata:
  name: testing-local
spec:
  domain: testing.local
  certificate:
    name: wildcard-testing-local-tls-cert
    namespace: default

## patching
kubectl -n openshift-ingress-operator patch ingresscontroller testing-local -p '{"spec":{"routeAdmission":{"wildcardPolicy":"WildcardsAllowed"}}}' --type=merge
kubectl -n openshift-ingress-operator patch ingresscontroller default -p '{"spec":{"routeAdmission":{"wildcardPolicy":"WildcardsAllowed"}}}' --type=merge
kubectl -n openshift-ingress-operator patch ingresscontroller testing-local -p '{"spec":{"domain":"testing.local"}}' --type=merge
kubectl -n openshift-ingress-operator patch ingresscontroller testing-local -p '{"spec":{"routeSelector":{"matchLabels":{"type":"testing-local"}}}}' --type=merge

certificate

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard
  namespace: default
spec:
  secretName: wildcard-testing-local-tls-cert
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  subject:
    organizations:
      - Celonis
  commonName: wildcard.testing.local
  isCA: false
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
  usages:
    - server auth
    - client auth
  dnsNames:
    - wildcard.testing.local
    - "*.testing.local"
  issuerRef:
    name: testing-local-ca
    kind: ClusterIssuer
    group: cert-manager.io
  secretTemplate:
    annotations:
      # https://github.com/redhat-cop/cert-utils-operator#generating-kubernetes-events
      cert-utils-operator.redhat-cop.io/generate-cert-expiry-alert: "true"
      cert-utils-operator.redhat-cop.io/cert-expiry-check-frequency: "7d"           # (days)  how often the system checks whether a certificate is expiring
      cert-utils-operator.redhat-cop.io/cert-soon-to-expire-check-frequency: "1h"   # (hours) how often the system checks whether a certificate has expired, once it is close to expiring
      cert-utils-operator.redhat-cop.io/cert-soon-to-expire-threshold: "2160h"      # (90 days)  the interval of time below which the certificate is considered close to expiry

route

---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app: hello-world
    type: testing-local
  name: wildcard-https
  annotations:
    cert-utils-operator.redhat-cop.io/inject-CA: "false"
    cert-utils-operator.redhat-cop.io/certs-from-secret: "wildcard-testing-local-tls-cert"
    route.openshift.io/termination: edge
spec:
  host: wildcard.testing.local
  port:
    targetPort: 8080-tcp
  tls:
    termination: edge
    insecureEdgeTerminationPolicy: Redirect
  to:
    kind: Service
    name: hello-world
    weight: 100
  wildcardPolicy: Subdomain

Running a curl command I can see the certificate without problems, but why is the operator throwing this error?

Thanks mmello

raffaelespazzoli commented 2 years ago

That error is usually transient; it happens when two operators update the same object at the same time. It comes from the optimistic locking mechanism of the Kubernetes API, which rejects an update made against a stale resourceVersion. Retrying usually fixes the error, so I was going to ask whether the operator was eventually able to inject the certs from the secret into the route. From the manifest you shared it looks like it was not, but can you confirm? Is there anything else suspicious in the log?

The reason you see the certificate anyway is that you are setting it as the ingress default (so that every route gets it unless a route-specific one is defined) and also as a route-specific certificate. You only need it once; since it is a wildcard certificate, it should probably be set as the default cert in the ingress.
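To illustrate the optimistic locking described above, here is a minimal in-memory sketch in Python. It is not the operator's code and does not talk to a cluster: `FakeAPIServer`, `Route`, `ConflictError`, and `update_with_retry` are hypothetical names made up for this example. The only assumption is the behaviour visible in the error message: an update carrying an outdated resource version is rejected, and re-reading the latest object before retrying succeeds.

```python
from dataclasses import dataclass, replace


class ConflictError(Exception):
    """Raised when an update is based on a stale resource version."""


@dataclass(frozen=True)
class Route:
    name: str
    resource_version: int = 1
    certificate: str = ""


class FakeAPIServer:
    """Hypothetical stand-in for the API server, holding a single route."""

    def __init__(self, route: Route) -> None:
        self._route = route

    def get(self) -> Route:
        return self._route

    def update(self, route: Route) -> Route:
        # Optimistic locking: accept the write only if it was computed
        # from the version that is currently stored.
        if route.resource_version != self._route.resource_version:
            raise ConflictError(
                "the object has been modified; please apply your changes "
                "to the latest version and try again"
            )
        self._route = replace(route, resource_version=route.resource_version + 1)
        return self._route


def update_with_retry(server: FakeAPIServer, mutate, attempts: int = 3) -> Route:
    """Re-read the latest object before every attempt, then try to write it."""
    for _ in range(attempts):
        latest = server.get()
        try:
            return server.update(mutate(latest))
        except ConflictError:
            continue
    raise ConflictError("still conflicting after retries")


server = FakeAPIServer(Route(name="wildcard-https"))
stale = server.get()  # our controller reads version 1
server.update(replace(stale, certificate="other-writer"))  # someone else writes first

try:
    server.update(replace(stale, certificate="our-cert"))  # based on version 1: rejected
    conflict_seen = False
except ConflictError:
    conflict_seen = True

final = update_with_retry(server, lambda r: replace(r, certificate="our-cert"))
print(conflict_seen, final.resource_version, final.certificate)
```

In the sketch the first write fails because it was computed from a copy that another writer had already superseded, while the retry succeeds because it starts from the latest version. This mirrors why the error in the log tends to disappear on the next reconcile.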

tchellomello commented 2 years ago

thanks for your quick response @raffaelespazzoli

Yes, the cert-manager operator created and populated the wildcard-testing-local-tls-cert secret, which was later consumed by the route. If I inspect the route object, I can see that the cert-utils-operator injected the certificate and key:

kubectl get routes wildcard-https -o yaml | kubectl neat
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    cert-utils-operator.redhat-cop.io/certs-from-secret: wildcard-testing-local-tls-cert
    cert-utils-operator.redhat-cop.io/inject-CA: "false"
    route.openshift.io/termination: edge
  labels:
    app: hello-world
    type: testing-local
  name: wildcard-https
  namespace: default
spec:
  host: wildcard.testing.local
  port:
    targetPort: 8080-tcp
  tls:
    certificate: |
      -----BEGIN CERTIFICATE-----
      XXXX
      -----END CERTIFICATE-----
    insecureEdgeTerminationPolicy: Redirect
    key: |
      -----BEGIN RSA PRIVATE KEY-----
      xxxxx
      -----END RSA PRIVATE KEY-----
    termination: edge
  to:
    kind: Service
    name: hello-world
    weight: 100
  wildcardPolicy: Subdomain

I can also confirm that whenever I renew the certificate via cert-manager, the operator does its job of updating the route.

For example:

before

$ curl -k -L  -v https://b.testing.local 2>&1 >/dev/null   | grep 'date:' | head -n2
*  start date: Mar 28 18:46:39 2022 GMT
*  expire date: Jun 26 18:46:39 2022 GMT

refresh cert

$ kubectl cert-manager renew -n default wildcard
Manually triggered issuance of Certificate default/wildcard

after

$ curl -k -L  -v https://b.testing.local 2>&1 >/dev/null   | grep 'date:' | head -n2
*  start date: Mar 29 14:36:38 2022 GMT
*  expire date: Jun 27 14:36:38 2022 GMT

Upon refresh, the cert-utils-operator logs seem to be fine

[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-29T14:44:52.272Z  DEBUG   util.api    object is not ConditionsAware, not setting status 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-29T14:44:52.272Z  DEBUG   util.api    object is not ConditionsAware, not setting status 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-29T14:45:10.923Z  DEBUG   controller-runtime.manager.events   Warning {"object": {"kind":"Secret","namespace":"default","name":"wildcard-testing-local-tls-cert","uid":"2c0cc2b1-f326-48e6-b0f2-9422857cd3cc","apiVersion":"v1","resourceVersion":"1553226"}, "reason": "Certs Soon to Expire", "message": "Certificate expiring in 89 days"} 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-29T14:45:10.989Z  DEBUG   util.api    object is not ConditionsAware, not setting status 
[cert-utils-operator-controller-manager-644575f449-ff67v manager] 2022-03-29T14:45:10.990Z  DEBUG   util.api    object is not ConditionsAware, not setting status 
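
A side note on the "Certs Soon to Expire" warning in the log above: it is consistent with the annotations on the certificate, because the cert-soon-to-expire-threshold of 2160h equals the certificate's 90-day duration, so even a freshly renewed certificate already falls inside the threshold. Below is a minimal Python sketch of that arithmetic. It assumes the operator compares the remaining validity against the threshold with a plain less-than; the function names are hypothetical and this is not the operator's code.

```python
from datetime import datetime, timedelta

# Date format printed by curl -v for "start date" / "expire date".
CURL_DATE_FORMAT = "%b %d %H:%M:%S %Y GMT"


def remaining_validity(expire_date: str, now: datetime) -> timedelta:
    """Time left until the expiry date that curl reported."""
    return datetime.strptime(expire_date, CURL_DATE_FORMAT) - now


def is_soon_to_expire(expire_date: str, now: datetime, threshold: timedelta) -> bool:
    """Assumed rule: 'soon to expire' means remaining validity is below the threshold."""
    return remaining_validity(expire_date, now) < threshold


# Timestamp of the warning in the log and the expiry date reported after renewal.
now = datetime(2022, 3, 29, 14, 45, 10)
expire_date = "Jun 27 14:36:38 2022 GMT"
threshold = timedelta(hours=2160)  # cert-soon-to-expire-threshold: "2160h"

remaining = remaining_validity(expire_date, now)
soon = is_soon_to_expire(expire_date, now, threshold)
print(remaining.days, soon)
```

With these inputs the remaining validity is just under 90 days, which matches the "Certificate expiring in 89 days" message. Lowering the threshold annotation below the certificate duration would presumably stop the warning from firing right after each renewal.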

So @raffaelespazzoli, is it safe to assume that this error is an expected, transient race condition? Thanks

tchellomello commented 2 years ago

Thanks, @raffaelespazzoli for your assistance.

I'm closing this issue.