Closed alex-sentsiurou closed 3 years ago
Same experience on OCP 4.3
@tnozicka, this is a blocker. If you don't have time on your hands to fix it, could you maybe point us to the area of the code that needs attention for a solution?
Same experience on OCP 4.5
Same experience on OCP 4.5. Is there already a workaround?
We couldn't solve it. We also tried cert-manager, but that didn't work either.
Can you attach the full (redacted) yaml for those routes? I have tried with a few days old OCP 4.7-ci in AWS and the cert was provisioned and the temporary route was deleted.
Is this happening consistently for you? Does it recover when you delete the temporary routes? What version of openshift-acme are you using?
Hey @tnozicka. There is a race condition and the acme controller creates the same route again which gets rejected by OCP since an older one declared the same Route. It's nice to see you test this but don't you have an OCP release that is GA to compare with what we've been experiencing? I'll let somebody else give an example.
Hi, these are the 2 exposer routes and the external route that should get the cert:
openshift-acme_notWorking_routes.txt
name: docpipe5-external
Hi,
I don't think this depends on the OCP version used but more on how slow your infra/informers are and whether the informers see the update before the next sync loop on the same item. The controller should generate the same name for the same challenge rather than create a new one, which is how it avoids the race. https://github.com/tnozicka/openshift-acme/blob/6955c94/pkg/controller/route/route.go#L694-L695 Although the dump suggests otherwise:
creationTimestamp: "2020-12-04T10:06:24Z"
name: exposer-1fv03q7jublbj4a1i50q53aub2g21brui5hbegkfofsphn7pp7h0
path: /.well-known/acme-challenge/o6IT_Jq0C3TTaCdtpbyVU_ce1dhDQgZrai7_uCGyVa0
---
creationTimestamp: "2020-12-04T10:06:25Z"
name: exposer-gcl2qecrlhckaicnst3fn1e4cne88op30m2fvtpjoc74ve60jlqg
path: /.well-known/acme-challenge/o6IT_Jq0C3TTaCdtpbyVU_ce1dhDQgZrai7_uCGyVa0
Including challengePath would seem like an easy way to avoid it, but I'd like to see the controller logs from around the time these 2 exposer routes were created to see the order.URI, authzURL and challenge.URI values if possible.
/priority important-soon /assign
Hi, I saw that you pushed a new image to Quay today at 13:00 Central European Time. I tried it again and the challenge route is now created correctly, but still no success at the moment.
Here is the output of an openshift-acme pod:
openshift-acme-pod-output.txt 10min_openshift-acme-pod-output.txt
Can confirm success on OCP 4.6.
Have you reinstalled the whole openshift-acme app, including the service account, roles, ...? Is the route for the renewal in the same namespace in your setup? Thanks
@ggrames, the PR only touches the controller code, not the exposer. So if you used the example Deployment with pull: Always, forcing a redeploy of the controller should be enough.
If you had validation code pending, I'm not sure whether the new controller will pick that up or whether you should recreate the Route.
I tested cluster-wide setup, where the acme controller is in its own namespace.
@tux-o-matic I have already recreated the route, and yes, I use pull: Always. I have also scaled the openshift-acme pods down to 0 and back up to 2, so it should be up to date. There has to be another problem.
What happened: The certificate fails to get provisioned because the controller keeps creating and deleting new exposer pods after a new route is added with the kubernetes.io/tls-acme=true annotation.
What you expected to happen: A valid ACME certificate should be assigned to the route (live environment is deployed). Also, the exposer pod should be deleted after serving http challenge.
How to reproduce it (as minimally and precisely as possible): Install the cluster-wide live controller, then deploy a sample application (I tried with other routes and apps as well):
oc new-project example-project
oc create -fhttps://raw.githubusercontent.com/tnozicka/gohellouniverse/master/deploy/{deployment,service}.yaml
oc create route edge gohellouniverse --service=gohellouniverse
oc patch route gohellouniverse -p '{"metadata":{"annotations":{"kubernetes.io/tls-acme":"true"}}}'
Anything else we need to know?: here's a part of controller logs (endless repeat):
The temporary route created for http challenge is responsive and returns the secret:
Environment (OKD 4.4 on bare metal):
Client Version: 4.4.0-0.okd-2020-05-23-055148-beta5
Server Version: 4.4.0-0.okd-2020-05-23-055148-beta5
Kubernetes Version: v1.17.1