fluxcd / flux2

Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit.
https://fluxcd.io
Apache License 2.0

Flux cli reconcile force if suspended #959

Open PhilippMT opened 3 years ago

PhilippMT commented 3 years ago

Hi all,

I did not find another ticket discussing this feature. We have a Kustomization resource for some infrastructure applications which we do not want to be reconciled automatically. That's why the resource is suspended. After applying changes to the deployments of the infrastructure components, we manually trigger a reconciliation via the flux CLI by resuming the Kustomization and then setting it back to suspended afterwards. In my opinion this is not a straightforward approach. I would like to be able to run a single reconciliation via the CLI with some kind of --force-if-suspended flag, for example: flux reconcile kustomization infrastructure --force-if-suspended --with-source. This would be a more straightforward solution.
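For readers following along, the current resume / reconcile / suspend workaround described above can be sketched as a small Python wrapper around the flux CLI. This is only an illustration, not something from the thread: the function names (one_shot_commands, reconcile_once), the injectable run parameter, and the flux-system default namespace are assumptions made for this sketch. The --force-if-suspended flag is the proposal under discussion and does not exist, so it is deliberately not used here.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical helper type: anything that executes one command line.
Runner = Callable[[Sequence[str]], None]


def one_shot_commands(name: str, namespace: str = "flux-system") -> List[List[str]]:
    """Return the three flux CLI invocations of the manual workaround."""
    return [
        ["flux", "resume", "kustomization", name, "-n", namespace],
        ["flux", "reconcile", "kustomization", name, "-n", namespace, "--with-source"],
        ["flux", "suspend", "kustomization", name, "-n", namespace],
    ]


def _run(cmd: Sequence[str]) -> None:
    # Raises CalledProcessError if the command exits non-zero.
    subprocess.run(list(cmd), check=True)


def reconcile_once(name: str, namespace: str = "flux-system", run: Runner = _run) -> None:
    """Resume, reconcile once, then suspend again even if reconciling fails."""
    resume, reconcile, suspend = one_shot_commands(name, namespace)
    run(resume)
    try:
        run(reconcile)
    finally:
        # Always return to the suspended state the resource started in.
        run(suspend)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    reconcile_once("infrastructure", run=lambda cmd: print(" ".join(cmd)))
```

The try/finally is the one design choice worth noting: if the reconcile step fails, the Kustomization is still put back into suspended mode rather than being left open to automatic reconciliation.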

kingdonb commented 3 years ago

Thank you for your report! I made some notes in #1004 about this, in the discussions portal.

Basically, my objection to this feature is that, with suspend set in the spec, reconciliations are currently not processed, and this is auditable and can be proven via git history. If you are deploying from a protected branch and a reconciliation is suspended in git history, that is a trustworthy record showing that deployments were not processed between Time X (when suspend was merged into the spec) and Time Y (when the suspend: true spec field was reverted).

By adding an override that goes around the git repository, you lose this audit-friendliness. Suspend can also be set in-cluster, which somewhat undercuts my argument already, but not entirely: I'm about 99% sure that if a reconciliation has suspend: false in the git repository and suspend: true in the in-cluster representation itself, the git repository will win.

This does not solve your issue; I'm just trying to be helpful and refocus the discussion, and possibly move it to a more appropriate venue. Since this is a feature proposal, it belongs in Discussions. Issue reports are reserved for "Something is not working as intended / I found a bug" according to the support page. This is intended to keep the Issues listing from becoming a giant, irredeemable wasteland of feature requests that stay open forever, or until they are implemented or forgotten about.

therapy-lf commented 3 years ago

I think we brought up something similar in https://github.com/fluxcd/flux2/discussions/870. My two cents: it would be great to have something like https://docs.flagger.app/usage/webhooks#manual-gating. That is, an option to choose between automated reconciliation and manual reconciliation that is confirmed via pre-rollout hooks. Do you have any internal discussions around this feature?

kingdonb commented 3 years ago

Manual gating is an answer, and in fact we had an internal discussion where this report came up (it is a special case of a pretty common request), and manual gating was raised as a suggestion for how to solve it. Flagger's manual gating is a great way to prevent an individual deployment from proceeding until someone takes action to release it. It is not such a great answer if what you want is to stop all deployment updates on the cluster until they are manually released.

There is currently no way to stop all updates until a "reconcile-everything" gate is called upon (or to stop all updates until the maintenance window begins). You could set up a CronJob that flips spec.suspend to false on all GitRepository sources at a pre-appointed time, and flips it back to true when the window has passed. It could make sense to add a new policy controller that enforces this at the cluster level, if that's the granularity you want. Maybe only certain Kustomization or HelmRelease reconciliations should be suspended, and others should be allowed at any time.
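As a rough illustration of the CronJob idea above, here is a minimal Python sketch of the logic such a job might run: decide whether the current time falls inside the maintenance window, then build a kubectl merge patch that sets spec.suspend accordingly. The function names, the window times, and the flux-system namespace are assumptions for this sketch; it only constructs the kubectl command and leaves executing it, and looping over all GitRepository sources, to the caller.

```python
import json
from datetime import time
from typing import List


def in_window(now: time, start: time, end: time) -> bool:
    """True if now lies in [start, end); handles windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def suspend_patch(suspended: bool) -> str:
    """JSON merge patch that sets spec.suspend on a Flux source."""
    return json.dumps({"spec": {"suspend": suspended}})


def patch_command(name: str, namespace: str, now: time, start: time, end: time) -> List[str]:
    """Build the kubectl invocation for one GitRepository (assumed names)."""
    # Inside the window sources are resumed; outside it they stay suspended.
    suspended = not in_window(now, start, end)
    return [
        "kubectl", "patch", "gitrepository", name,
        "-n", namespace,
        "--type", "merge",
        "-p", suspend_patch(suspended),
    ]


if __name__ == "__main__":
    # Example: a 02:00-04:00 window, evaluated at 02:30.
    cmd = patch_command("flux-system", "flux-system", time(2, 30), time(2, 0), time(4, 0))
    print(" ".join(cmd))
```

Note that a patch applied in-cluster this way sits outside git history, which is the audit trade-off raised earlier in the thread.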

From #870 I understand this as a new proposal for a feature that could be added, not a currently available approach to solve the requirement. Thanks for adding that to the discussion! I personally haven't read that proposal before now.

stefanprodan commented 3 years ago

Please see this proposal: https://github.com/fluxcd/flux2/discussions/870#discussioncomment-420896