Open grozan opened 1 year ago
I would love to see this feature. We implemented our applications in ArgoCD, where a -1 wave is used to execute migrations prior to actually upgrading the application itself. We expected ArgoCD to stop the sync if any of them turned Degraded, but this is apparently not the case.
Making it configurable like this would be perfect. The name "children" was a bit confusing to me, which is why I did not find this issue right away. It could also be called, for example, "wait for healthy wave" (before continuing). With the `argocd.argoproj.io/sync-wave-wait-for-children` annotation, I'd also suggest allowing -1, meaning "unlimited".
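For readers unfamiliar with the setup described in this comment, here is a minimal sketch of a migration Job placed in wave -1 ahead of the application in wave 0. All names, images, and commands are hypothetical placeholders; only the `argocd.argoproj.io/sync-wave` annotation comes from the discussion.

```yaml
# Hypothetical migration Job, synced in wave -1 (before the application).
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate                      # hypothetical name
  annotations:
    argocd.argoproj.io/sync-wave: "-1"  # annotation values must be strings
spec:
  backoffLimit: 0                       # fail fast instead of retrying
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/app-migrations:1.0.0  # hypothetical image
          command: ["/app/migrate"]                          # hypothetical command
---
# Hypothetical application Deployment, synced in wave 0 (after the Job).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                             # hypothetical name
  annotations:
    argocd.argoproj.io/sync-wave: "0"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:1.0.0              # hypothetical image
```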
I would also like to see this feature. I am currently migrating a number of Helm charts to Argo CD using the app-of-apps pattern. It's been a challenge to apply sync waves to the more complex charts. It would make things a lot easier if I could just apply a sync wave to an app and have that propagate to all the child resources.
I would also like to see the feature. I found a workaround for that, for everyone who is interested: https://blog.stderr.at/openshift/2023/03/operator-installation-with-argo-cd/
Summary
You use sync waves when you want to order and control a deployment. The creation of a custom resource is commonly picked up by a controller running in the cluster, triggering the creation of other object(s). It should be possible to instruct Argo CD to wait for those child objects to be in a healthy state before considering the wave complete (just as if those objects also had the same `argocd.argoproj.io/sync-wave` annotation as their parent).
Motivation
I want to deploy operators in a cluster. I want the deployment of the operator to be done in a first wave, and I want Argo to wait for the operator to be fully up and running before going to the next wave (during which the CR watched by this operator will be pushed).
For OpenShift, this first wave typically contains 2 objects: an `OperatorGroup` and a `Subscription`. The creation of those objects triggers the creation of a `ClusterServiceVersion` object, which triggers the real installation. That `ClusterServiceVersion` object is not in the git repo of course, and Argo CD does not wait for it to report a healthy state (its `status.phase` field should be `Succeeded`) before going to the next wave.
Even if the installation of the operator fails, Argo will still go to the next wave. I simulated this with a custom health check always reporting a failure, and Argo still happily starts wave 2. I would want wave 1 to be considered failed, and the whole sync to immediately stop and report a failure.
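To make the scenario concrete, here is a minimal sketch of the manifests that would live in the git repo. The operator name, namespace, channel, catalog source, and the custom resource kind are all hypothetical placeholders. The `ClusterServiceVersion` does not appear here: it is created in the cluster as a result of the `Subscription`, which is exactly why Argo CD does not take it into account for wave 1.

```yaml
# Wave 1: objects that trigger the operator installation.
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: example-operatorgroup          # hypothetical name
  namespace: example-operator          # hypothetical namespace
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  targetNamespaces:
    - example-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: example-operator               # hypothetical name
  namespace: example-operator
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  name: example-operator               # hypothetical package name
  channel: stable                      # hypothetical channel
  source: redhat-operators             # assumed catalog source
  sourceNamespace: openshift-marketplace
---
# Wave 2: a CR watched by the operator. It is only meaningful once the
# operator (and its CRDs) are fully installed.
apiVersion: example.com/v1alpha1       # hypothetical API group
kind: ExampleResource                  # hypothetical kind
metadata:
  name: example
  namespace: example-operator
  annotations:
    argocd.argoproj.io/sync-wave: "2"
spec: {}
```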
Proposal
Make it optional, at the object level, to say "consider the child objects too, and wait for them to also be healthy before continuing".
How do you think this should be implemented?
I could imagine an annotation being introduced for that. It could be something like `argocd.argoproj.io/sync-wave-add-children: true`. Or, if there is a question of timing, like "how long does Argo CD wait to see if a child object is created by another controller or not", then it could be something like `argocd.argoproj.io/sync-wave-wait-for-children: 4` to wait up to 4 seconds (the default value would be 0 of course, to mimic the current behaviour).
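As a sketch of how this might look on a resource: the `sync-wave-wait-for-children` annotation below is only the proposal from this issue and does not exist in Argo CD, and the resource names are hypothetical. Note that Kubernetes annotation values must be strings, so the value would need to be quoted.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: example-operator               # hypothetical name
  namespace: example-operator          # hypothetical namespace
  annotations:
    argocd.argoproj.io/sync-wave: "1"
    # PROPOSED in this issue, NOT an existing Argo CD annotation:
    # wait up to 4 seconds for child objects to appear, then require them
    # to be healthy before wave 1 is considered complete ("0" = current behaviour).
    argocd.argoproj.io/sync-wave-wait-for-children: "4"
spec:
  name: example-operator               # hypothetical package name
  channel: stable                      # hypothetical channel
  source: redhat-operators             # assumed catalog source
  sourceNamespace: openshift-marketplace
```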