Open bwplotka opened 5 years ago
Happy new year! :tada:
I think this sort of thing would be great to add to our documentation - or at least notes summarising what you've put above, so that users can understand what they need to do and what their options are 😄
/kind documentation
This is one of those subtle issues that isn't apparent from reading the intro docs, and will cause a full outage when it bites you. I think it's worth at least calling out as a "here be dragons" kind of message; whatever your chosen solution, if you haven't picked one, then you are probably going to have an outage at some point (usually coinciding with when your team is all on vacation, since that's when the code/deploy velocity will have dropped off).
(Not being overly-specific because this is exactly what happened to me or anything like that...) :)
I don't get it @paultiplady, what is the actual outcome of your comment? (: Are you just ranting about the fact that nothing works 100%? Sure, but can we focus on fixing this issue, and recommend or explain a solution that will be closer to 100% than the others?
I'm adding a user use-case emphasizing that this is important to document, as it produces outages if it's not handled.
So I just found this: https://github.com/pusher/wave. It watches for changes to the ConfigMaps and Secrets referenced by Deployments and performs a rolling deploy when they are updated. So, going off the example from the initial issue, the following would happen:
Nice, if that is production-ready then it looks really promising!
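For reference, Wave is opt-in per workload: as I read its README, you mark a Deployment with an annotation and Wave then tracks the ConfigMaps/Secrets that Deployment references. A minimal sketch (the annotation name is taken from Wave's docs; the Deployment name and image are placeholders, so verify against the version you deploy):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # example name
  annotations:
    # Opt this Deployment in; Wave then watches every ConfigMap/Secret
    # it references and triggers a rolling update when one changes.
    wave.pusher.com/update-on-config-change: "true"
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest  # placeholder image
```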
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close
@retest-bot: Closing this issue.
I currently have to implement a solution for this as well and saw the recommendation for Wave above. I also ran into https://github.com/stakater/Reloader, which does the same thing but has more stars and looks easier to install.
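For anyone comparing the two: Reloader is also annotation-driven. A minimal sketch, with the annotation name taken from Reloader's README (the Deployment name and image are placeholders; verify against the version you install):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # example name
  annotations:
    # Reloader watches all ConfigMaps/Secrets this Deployment mounts or
    # references in env vars, and rolling-restarts it when one changes.
    reloader.stakater.com/auto: "true"
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest  # placeholder image
```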
/reopen
/remove-lifecycle rotten
/lifecycle frozen
@munnerz: Reopened this issue.
Hi and Happy New Year All!
Thanks for a great product. We have used it in production for a long time, but we now want to focus on improving automation and avoiding manual intervention during certificate renewal for our services. How do we ensure a Pod's server will actually reload the renewed certificate? Particularly:

It's definitely not a cert-manager issue, but it would be nice for cert-manager to include potential solutions to this problem as best practices.

There are multiple options, like:

A) Ensure the application can reload it "hitless"/non-disruptively. E.g. you can implement that for a Golang HTTP server, or hope that the service you use allows it (mostly they don't). For example, Envoy recently added that option: https://github.com/envoyproxy/envoy/issues/1194

B) Some generic cert-rotate operator that will rolling-restart stateless deployments to load new certificates? Maybe logic like this in `cert-manager` makes sense?

C) Have your rollout tools handle it? (ensure Pods are restarted frequently)

What is the common way of solving this problem? I guess `A` for the least disruptive rotation possible, but what if it's a 3rd-party tool that does not support hot reload? I have searched GH issues, but haven't found a relevant response.

Do you agree that some docs on best practices for this would be suitable in the cert-manager documentation?
/kind feature