Closed: jdziedzic closed this issue 2 years ago
So at the moment the timeout is not configurable.
No one has had an issue with the defaults so far. So while we should perhaps make the timeout configurable, you should also investigate why your API server is so slow.
this is not an option
I would like to better understand what happens in this situation. cert-utils will set the secrets the way it is configured to, but if they are already set correctly, it should not do anything. So perhaps you found a bug there. Can you supply a reproducer?
I've narrowed it down to what specifically is changing each time.
This particular cert is taking advantage of the feature cert-utils-operator.redhat-cop.io/generate-java-keystores: 'true'
It appears that on every restart, the keystore.jks value is being re-generated which causes not only that value to change, but the resourceVersion value as well. This is why reloader is restarting the pods. To recreate this you would just need a cert secret with the additional annotations:
cert-utils-operator.redhat-cop.io/java-keystore-password: password
cert-utils-operator.redhat-cop.io/generate-java-keystores: 'true'
Then restart the cert-utils pod.
OK, I looked at the code and I believe I see where, on create, it updates the object even if nothing has changed. If I change the update to a server-side patch, this particular issue should go away. As a workaround, are you aware that cert-manager can generate truststores and keystores directly?
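To illustrate the "update only when something changed" idea, here is a minimal Go sketch. It is not the operator's actual code: `secretData` and `needsUpdate` are hypothetical names, and the map only stands in for the data field of a Secret. The point is just that a reconcile loop can compare the desired bytes with what is already stored and skip the write (and the resulting resourceVersion bump) when they match.

```go
package main

import (
	"bytes"
	"fmt"
)

// secretData is a hypothetical stand-in for a Secret's data field
// (key name mapped to raw bytes).
type secretData map[string][]byte

// needsUpdate reports whether desired differs from current in any key or value.
// A reconciler could call this and skip the update when it returns false.
func needsUpdate(current, desired secretData) bool {
	if len(current) != len(desired) {
		return true
	}
	for key, want := range desired {
		have, ok := current[key]
		if !ok || !bytes.Equal(have, want) {
			return true
		}
	}
	return false
}

func main() {
	current := secretData{"tls.crt": []byte("cert"), "tls.key": []byte("key")}
	same := secretData{"tls.crt": []byte("cert"), "tls.key": []byte("key")}
	changed := secretData{"tls.crt": []byte("cert"), "tls.key": []byte("new-key")}

	fmt.Println(needsUpdate(current, same))    // identical content, no write needed
	fmt.Println(needsUpdate(current, changed)) // one value differs, write needed
}
```

A check like this only helps if the desired content is reproducible from one run to the next.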
can you test if this branch fixes your issue: https://github.com/redhat-cop/cert-utils-operator/tree/fix%23110
> As a workaround, are you aware that cert-manager can generate truststores and keystores directly?
We're quite a bit behind on cert-manager versions as we can't migrate to a new version until all of our app teams update their certificate specs to the new apiVersion. Do you know what version this is available in?
> can you test if this branch fixes your issue: https://github.com/redhat-cop/cert-utils-operator/tree/fix%23110
Would it be possible to push an image to quay.io that I could point the CSV at for testing?
I tried to run through the build, but all of our internal firewall and security mechanisms make it impossible to run the makefile successfully. Any help in building the image and publishing to quay.io would be greatly appreciated. Then I can test.
I did some more investigation on this problem. When you calculate a keystore starting from PEM-formatted keys, you get a different value every time (maybe there is a timestamp somewhere in it). So the operator behaves correctly, as it sees that it needs to update those values. So the solution is not that simple. I need some time to figure out a clean way to solve this. In the meantime, disregard the PR; it's not correct.
We are seeing an issue where our manager container restarts due to slow API responses:
This creates a downstream problem for us as cert-utils reconciles resources on startup and makes changes to the secrets which causes reloader (https://github.com/stakater/Reloader) to think the cert has changed, when it hasn't. This causes mass app pod restarts.
By default, the operator runs in a single instance so there really is no need for a leader-election process.
Would it be possible to do one of the following: