hetznercloud / hcloud-cloud-controller-manager

Kubernetes cloud-controller-manager for Hetzner Cloud
Apache License 2.0

HCCM not able to extract previously generated (managed) TLS certificate #709

Open marcopaggioro opened 3 months ago

marcopaggioro commented 3 months ago

TL;DR

HCCM does not seem to recognize a managed certificate that already exists (one it created earlier for a Service that has since been re-created, or for another Service). When the certificate already exists, reconciliation fails and the Services on the load balancer are not created.

Expected behavior

I expect HCCM not to fail if the certificate already exists and was created by itself.

Observed behavior

When I create the Service (annotations below) for the first time, I can see the new certificate in the Hetzner Certificates section.

HCCM then also populates the Services section of my Hetzner Load Balancer, and everything works fine.

If I destroy and re-create my Service, HCCM reports these errors:

E0803 16:54:50.345120       1 controller.go:298] error processing service traefik/traefik (retrying with exponential backoff): failed to ensure load balancer: hcloud/loadBalancers.EnsureLoadBalancer: hcops/LoadBalancerOps.ReconcileHCLBServices: hcops/hclbServiceOptsBuilder.buildAddServiceOpts: hcops/CertificateOps.GetCertificateByLabel: not found
I0803 16:54:50.345206       1 event.go:389] "Event occurred" object="traefik/traefik" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: hcloud/loadBalancers.EnsureLoadBalancer: hcops/LoadBalancerOps.ReconcileHCLBServices: hcops/hclbServiceOptsBuilder.buildAddServiceOpts: hcops/CertificateOps.GetCertificateByLabel: not found"

Seems like it can't detect that the certificate already exists and it fails.

If I delete the certificate, nothing further happens. But if I then delete and re-create the Service, HCCM "wakes up" and re-creates the certificate correctly (along with the Services in the LB).

Minimal working example

Services with these annotations

  annotations:
    load-balancer.hetzner.cloud/certificate-type: managed
    load-balancer.hetzner.cloud/health-check-protocol: tcp
    load-balancer.hetzner.cloud/http-managed-certificate-domains: yourdomain.it,www.yourdomain.it,api.yourdomain.it
    load-balancer.hetzner.cloud/http-managed-certificate-name: https-certificate
    load-balancer.hetzner.cloud/http-redirect-http: 'true'
    load-balancer.hetzner.cloud/name: prod-balancer
    load-balancer.hetzner.cloud/protocol: https

Log output

E0803 16:54:50.345120       1 controller.go:298] error processing service traefik/traefik (retrying with exponential backoff): failed to ensure load balancer: hcloud/loadBalancers.EnsureLoadBalancer: hcops/LoadBalancerOps.ReconcileHCLBServices: hcops/hclbServiceOptsBuilder.buildAddServiceOpts: hcops/CertificateOps.GetCertificateByLabel: not found
I0803 16:54:50.345206       1 event.go:389] "Event occurred" object="traefik/traefik" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: hcloud/loadBalancers.EnsureLoadBalancer: hcops/LoadBalancerOps.ReconcileHCLBServices: hcops/hclbServiceOptsBuilder.buildAddServiceOpts: hcops/CertificateOps.GetCertificateByLabel: not found"


Additional information

No response

apricote commented 3 months ago

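The certificate has a label hcloud-ccm/service-uid that must match the UID of the Kubernetes Service. A re-created Service gets a new UID, so the lookup by label no longer finds the old certificate. For now, you have to update the label manually: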


$ service_uid="$(kubectl get service -n traefik traefik -o=go-template --template='{{ .metadata.uid }}')"
$ echo "$service_uid"
$ hcloud certificate add-label --overwrite https-certificate "hcloud-ccm/service-uid=$service_uid"
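
If this has to be repeated after every re-creation of the Service, the same commands can be wrapped in a small shell function. This is only a sketch of the manual workaround above, not something HCCM provides: the function name relabel_certificate and its argument order are made up for illustration, and it assumes kubectl and hcloud are already configured for the right cluster and project.

```shell
#!/usr/bin/env bash
# Sketch: point an existing managed certificate at a re-created Service.
# relabel_certificate is a hypothetical helper name, not part of HCCM or the hcloud CLI.
relabel_certificate() {
  local namespace="$1" service="$2" certificate="$3"
  local service_uid

  # Read the UID that Kubernetes assigned to the current Service object.
  service_uid="$(kubectl get service -n "$namespace" "$service" \
    -o=go-template --template='{{ .metadata.uid }}')" || return 1

  # Refuse to write an empty label value if the lookup returned nothing.
  if [ -z "$service_uid" ]; then
    echo "could not read UID of service $namespace/$service" >&2
    return 1
  fi

  # Overwrite the label so that it matches the UID of the new Service.
  hcloud certificate add-label --overwrite "$certificate" \
    "hcloud-ccm/service-uid=$service_uid"
}
```

With the names from this issue, the call would be `relabel_certificate traefik traefik https-certificate`. The empty-UID check is there so that a failed lookup does not overwrite the label with an empty value.
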
marcopaggioro commented 3 months ago

It sounds like a good workaround, but it is still just a workaround, no? It makes no sense that I have to "patch" the label of the existing certificate every time I re-create the Service.

apricote commented 3 months ago

Yea, we need a better way to associate the cert with the service and potentially clean it up when necessary.

apricote commented 3 months ago

Suggestions by @micheljung in #596:

github-actions[bot] commented 2 weeks ago

This issue has been marked as stale because it has not had recent activity. The bot will close the issue if no further action occurs.

marcopaggioro commented 2 weeks ago

problem still exists