argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Application dependencies #7437

Open jessesuen opened 3 years ago

jessesuen commented 3 years ago

Summary

I was speaking with @jasonmorgan from Buoyant today about a missing feature in Argo CD for blocking application syncs based on required dependencies on other applications. The use case is:

  1. I need to deploy apps A and B
  2. B must not be deployed before A (because A has a mutating webhook which must be in place before B starts)
  3. I want to sync them all at the same time and don't want to think about clicking sync in some correct order

This is especially important for the bootstrapping use case where you're recreating a cluster from git, and you need to create many apps after a bunch of system-level add-ons are fully available. e.g. linkerd must be in place before any applications come up, because linkerd's mutating webhook needs to inject sidecars into application pods starting up.

The use case is very compelling and I'm convinced we should prioritize this. I think this feature, combined with ApplicationSets will really start to complete our bootstrapping story.

Motivation

Please give examples of your use case, e.g. when would you use this.

During cluster bootstrapping, cluster addons (especially ones with mutating webhooks) need to be in place before application pods can come up.

Proposal

How do you think this should be implemented?

It turns out @jannfis already started some work on this, and the spec changes are close to what we need: https://github.com/argoproj/argo-cd/pull/3892

Given the age of the original PR, I'm filing an issue in case we abandon https://github.com/argoproj/argo-cd/pull/3892 for a new attempt, and tentatively targeting this for the next milestone in case someone wants to pick this up.

jannfis commented 2 years ago

I'm glad to see this gaining traction again. From previous discussions, we thought that the sync retry feature would solve this problem in a more declarative way (e.g. reconcile as long as necessary, hoping for dependencies to have finished reconciling in a certain time frame).

I think we could build up upon the existing PoC code, however I think we should consider some more things than are currently implemented in the PoC:

And probably some more things I have somewhere in the back of my mind from when I came up with the PoC.

jessesuen commented 2 years ago

I'm glad to see this gaining traction again. From previous discussions, we thought that the sync retry feature would solve this problem in a more declarative way (e.g. reconcile as long as necessary, hoping for dependencies to have finished reconciling in a certain time frame).

Yes, what I now realize is that retries don't help because in the problematic scenario (mutating webhooks), nothing actually "fails" per se and so there is nothing to retry. The dependent application silently succeeds even though it didn't get injected properly.

I think we could build up upon the existing PoC code, however, I think we should consider some more things than are currently implemented in the PoC:

I love your ideas on making this even more powerful with labels and force sync. But for MVP, we can keep this quite simple, not very far removed from your PoC. The way I think this feature should work is:

  1. Application B depends on A. Both applications are created, but neither is deployed (have a Missing health status).
  2. User clicks sync on B
  3. B now has an operation in a Running state (because we don't have a Pending state), but stays in Running indefinitely because A is not healthy (NOTE: we would also keep it in Running if A did not exist).
  4. User eventually clicks on sync on A
  5. As soon as A is Healthy, B would actually go through with the operation.
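
The five steps above can be sketched as a small gate that is evaluated on every reconciliation. This is only an illustration of the described behaviour, not Argo CD code: the `App` class, its `depends_on` field, the `"Proceed"` result and the `operation_phase` function are hypothetical names, and health is reduced to a plain string.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class App:
    """Hypothetical, heavily simplified stand-in for an Argo CD Application."""
    name: str
    depends_on: List[str] = field(default_factory=list)  # assumed field name
    health: str = "Missing"        # "Missing" until deployed, then "Healthy"
    sync_requested: bool = False   # True once the user clicks sync


def operation_phase(app: App, apps: Dict[str, App]) -> str:
    """Return the phase of the app's sync operation for one reconcile pass."""
    if not app.sync_requested:
        return "None"
    for dep_name in app.depends_on:
        dep = apps.get(dep_name)
        # A dependency that does not exist blocks just like an unhealthy one.
        if dep is None or dep.health != "Healthy":
            return "Running"  # no Pending state, so the operation waits in Running
    return "Proceed"


apps = {
    "A": App("A"),
    "B": App("B", depends_on=["A"]),
}

apps["B"].sync_requested = True                  # step 2: user syncs B
phase_before = operation_phase(apps["B"], apps)  # step 3: B waits on A

apps["A"].sync_requested = True                  # step 4: user syncs A
apps["A"].health = "Healthy"                     # A finishes deploying
phase_after = operation_phase(apps["B"], apps)   # step 5: B goes ahead
```

In this sketch B's operation is re-evaluated on each pass, so nothing has to "fail" for B to be held back, which is the gap that retries alone leave open.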

I took a look at your work, and I believe you implemented it just like how I described it.

Dependencies should be visualized in the UI, similar to how we visualize ownerReferences

I think this is more than we need, a simple message in the operation would be sufficient to understand what's going on.

Lavanya-Anbalagan commented 2 years ago

This is a blocker for us and forces us to put a lot of effort into coordinating dependent applications. Can we get an update on this?

flaviomoringa commented 2 years ago

Have the exact same issue with installing Kyverno and then some policies. Also referenced here: https://github.com/argoproj/argo-cd/issues/8358 https://github.com/argoproj/argo-cd/issues/7978

hhannani commented 2 years ago

Hi team, is there a way to use dependencies between yaml files within the same Application?

DotNetRockStar commented 2 years ago

bump; same issues.

rafilkmp3 commented 2 years ago

bump; same issues.

christianh814 commented 2 years ago

Just adding my "bump" here. This is mainly because I would also like this with ApplicationSets as I stated in issue #221

wmgroot commented 2 years ago

I've opened a PR showing a possible implementation path (which needs some work). This is against the old repo, but I'd like to get feedback on the direction before investing more effort into migrating it to this repo. https://github.com/wmgroot/applicationset/pull/1

If the dependency work is close to completion, I believe it could replace the user defined rollout stages in my PR.

qxmips commented 2 years ago

same here

nneram commented 2 years ago

We would love to see this feature as well ! 👍🏻

rumstead commented 2 years ago

Adding my "bump".

EDIT: Use cases:

  1. Namespaces/Namespace quotas (cluster bootstrap)
  2. Vault (mutating webhook)
  3. Service mesh (Consul with a mutating webhook)
  4. Capsule (multi-tenancy enabler)
  5. Business applications

chenele commented 2 years ago

Adding my bump

crenshaw-dev commented 2 years ago

Thanks for the +1s! If you leave a comment, please add info about your use case so it can be considered when writing the feature. Otherwise adding a thumbs-up to the issue is sufficient to move it up the priorities list. :-)

imusmanmalik commented 2 years ago

+1 would love to see this feature as well

Also have this requirement of Apps based on Apps and so on... same use-case Application B depends on A.

dgsardina commented 2 years ago

+1

My use case: on a cluster bootstrap we have istiod and istio-ingressgateway deployed as independent applications, but the latter fails to sync because the mutating webhook of the former was not ready when it was deployed.

RobCannon commented 2 years ago

My use cases are: I have an Application that references a folder-based chart that has our Certificate declarations. That Application will fail unless the Application that installs the cert-manager helm chart has succeeded (even if I install the CRDs first). I would also like to make the Applications that deploy our app services dependent on the certificates Application.

I can use sync waves and App of App hierarchies to get everything to deploy in the right order when I bootstrap a cluster, but just having a property on the Application that says it is dependent on one or more other Applications seems MUCH easier to manage. Let ArgoCD figure out the order based on the dependency info!
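
As a rough illustration of "figuring out the order" from dependency info, the sketch below groups applications into waves so that every application lands in a later wave than everything it depends on. It is not how Argo CD works today; `deployment_waves` and the application names are hypothetical and only modelled on the use case above.

```python
from typing import Dict, List, Set


def deployment_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group apps into waves; every app lands after all of its dependencies."""
    remaining = {app: set(deps) for app, deps in depends_on.items()}
    # Dependencies that are referenced but never declared get an empty entry.
    for deps in depends_on.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    waves: List[List[str]] = []
    while remaining:
        # Apps whose dependencies have all been placed in earlier waves.
        ready = sorted(app for app, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError(f"circular dependency among: {sorted(remaining)}")
        waves.append(ready)
        for app in ready:
            del remaining[app]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# Hypothetical app names modelled on the use case above.
waves = deployment_waves({
    "certificates": {"cert-manager"},
    "app-services": {"certificates"},
})
```

Applications that end up in the same wave have no ordering constraint between them and could be synced in parallel.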

RobCannon commented 2 years ago

It looks like this is being tracked on the roadmap in this issue. Please go upvote! https://github.com/argoproj/argo-cd/issues/3517

day0hero commented 2 years ago

I would really like to see this feature added! We are using jobs with sync-waves/hooks to get this functionality. While it works, it can be cumbersome to implement/debug especially when you're putting these hooks in across 10+ applications. Having the ability to clearly define the dependencies between the applications would be awesome!

Just as an example of our deployment scenario (there are other components to this, but the flow is similar):

  1. deploy cloud storage (openshift data foundation)
  2. kubernetes job that waits for the storage to become available
  3. deploy dependent resources (quay, objectstoreuser for s3 integration)
  4. kubernetes job that waits for the user and secret to get auto-generated
  5. deploy remaining applications

blakepettersson commented 2 years ago

The use case that'd be interesting for us would be the inverse, i.e. to ensure that certain apps get deleted last. For example with Karpenter, we'd want to say that all apps depend on the Karpenter app, and ensure that the dependent apps get deleted first, allowing Karpenter to delete the nodes which the other apps were previously using before removing Karpenter itself.

We'd also want the same thing for aws-load-balancer-controller in order to ensure that the ingresses (and their attached ALBs) for all dependent applications are removed before deleting the lb-controller.

This also becomes more relevant when using ApplicationSets, since AFAIK there's currently no way to set the order (with Sync Waves or otherwise) on ApplicationSets.

joaofhenriques commented 2 years ago

just another bump here, as this would greatly improve our IaC deployment strategy

sidineyc commented 2 years ago

same here, looking forward to this feature.

srinath-panda commented 2 years ago

Same here, similar to istio, but in my case it's linkerd and its dependent components that have mutating webhooks.

fredleger commented 2 years ago

same here

Use case: creating a letsencrypt cluster issuer in the same repo where cert-manager gets installed. There are ways to work around that, but this could be better and more readable if handled by the Argo CD Application.

TimVerbois commented 2 years ago

Same issue here, I need some way of ordering to get everything deployed and to speed up the process.

So, looking forward to this feature.

blakepettersson commented 2 years ago

How will this interact with #10432? Does this topic need to take that into account?

zerodayyy commented 2 years ago

+1

Use case: pretty much any operator deployment, e.g. Redis managed via Redis Operator

  1. User deletes both Redis app and Redis Operator app simultaneously (for instance, via an ApplicationSet)
  2. Randomly, Operator gets deleted first
  3. Redis app resources can't be removed now since they contain a finalizer from a now deleted Operator

As a result, the system is left in a state where manual intervention is required to delete the leftovers.
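
The failure mode above suggests the inverse rule for deletion: dependents go first, and an application is removed only after everything that depends on it is gone. A minimal sketch, assuming dependencies were declared per application; `deletion_order` and the app names are hypothetical, and `graphlib` requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def deletion_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return apps ordered so that dependents are deleted before their dependencies."""
    # static_order() yields dependencies first (a valid deploy order) and
    # raises graphlib.CycleError on circular dependencies.
    deploy_order = list(TopologicalSorter(depends_on).static_order())
    return list(reversed(deploy_order))


# The Redis app depends on its operator, so the operator must be deleted last.
order = deletion_order({"redis": {"redis-operator"}})
```

With this ordering the operator is still running while the Redis resources are removed, so it can process its own finalizers.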

shanproofpoint commented 2 years ago

I have had to create app of apps to break argocd deadlocks on crd and resource reconciliation occurring at the same time. argocd just gets stuck never resolving.

corinz commented 1 year ago

Most of the comments I see are from people like myself looking for basic dependency specification among their apps/charts. The docs aren't clear on sync waves and hooks. Most people think they can assign sync waves and hooks to Applications. Seems pretty logical, but to my knowledge there is no way to order your Application-level syncing.

What's the status on this? Can we get a comment from a maintainer? There's at least 3 related issues, and a couple proposals here, and the mess is bubbling over.

Let me know how I may be of assistance.

corinz commented 1 year ago

Also, for the people asking these questions...

  1. What are we missing?
  2. What are some work arounds? @blakepettersson @crenshaw-dev

shanproofpoint commented 1 year ago

Also, for the people asking these questions...

  1. What are we missing?
  2. What are some work arounds? @blakepettersson @crenshaw-dev

dependency requirements i believe are ultimately about how to avoid deadlocks of resources waiting on each other to finish. i think the eventual consistency that people argue for opposes any sort of ordering. but in reality to have a clean install ordering is required. the solution here is to make sure all apps are truly independent and will retry themselves until all the definitions they rely on are in memory. in an effort to also use sync waves at different levels, i have inadvertently made some superficial deadlocks. did you try completely removing sync-wave everywhere?

shanproofpoint commented 1 year ago

i would also imagine not specifying sync-wave does not remove the app from the default ordering and when app of apps is applied may result in ambiguous ordering

wmgroot commented 1 year ago

I have an open PR that I believe addresses most of the use cases for ordering the deployment of Applications. Using an ApplicationSet you can set an order using label selectors of the Application resources created by the AppSet.

Please have a look and let me know if the proposal will satisfy most of your needs. https://github.com/argoproj/argo-cd/pull/10048

TimVerbois commented 1 year ago

That would really help our setup, most of our running applications have multiple Application definitions in it (e.g. helmchart & namespace). Grouping them together would make life much easier.

blakepettersson commented 1 year ago

What's the status on this? Can we get a comment from a maintainer? There's at least 3 related issues, and a couple proposals here, and the mess is bubbling over.

IMO I don't think this is something that would make sense to tackle until #10432 is released (so Argo CD post-2.6ish?). At least after that's merged I'd be interested in helping out with this issue.

I have an open PR that I believe addresses most of the use cases for ordering the deployment of Applications. Using an ApplicationSet you can set an order using label selectors of the Application resources created by the AppSet.

Please have a look and let me know if the proposal will satisfy most of your needs.

I love this proposal @wmgroot! In its current state it does not address the use case that I have in mind though, i.e. being able to remove dependent application(set)s when de-provisioning a cluster and/or removing a particular Application. Is that something that could be added to your proposal, or does it even make sense to do it on that level (I suspect that it might not)?

Tanemahuta commented 1 year ago

Hello there.

I'd like to step in here by adding my two cents regarding the problem space and discussing a possible solution.

Kindly note:

My assumptions are based on what I've derived by reading the docs and parts of the code, completed by the requirements which resulted in using argo-cd in more complex use-cases.

Please correct me in case any assumption is erroneous.

Problem space:

Solution space:

blakepettersson commented 1 year ago

Hi,

I promised that I would reply here on Slack a while back, sorry about taking a bit longer than expected. There's a lot to take in, so I'll attempt to comment on the specific bits which IMO merit more discussion. I'll also add a disclaimer that my understanding of both the Argo CD code base and your proposal might be limited here, so take my comments with a grain of salt 😄

  • no inter-resource dependencies within an application (=> atomic)

I'm not sure what you mean by this, can you clarify? I'd argue that an application at the very least has a dependency on the repo or Helm chart which it needs to deploy, but all this depends on what we mean by dependencies. Once #10432 is a thing, an application will potentially have multiple sources which it depends on.

  • add a dependencies to spec of the CR Application which references other applications by their names

I'd say it's more clear with dependsOn or something similar, otherwise :thumbsup:

  • enhance Application's status:

  • introduce a new SyncStatusCode : WaitForDependencies

I'd say it's more clear with WaitingForDependencies or WaitingOnDependencies, otherwise :thumbsup:

  • add sync.dependencies which is a map[string]SyncStatusCode

  • modifications to the Application controller:

    • skip synchronization if referenced Applications cannot be found or are not in Synced state, set

      • status.sync.code to WaitForDependencies
      • status.sync.dependencies using the dependency's SyncStatusCode

    • successful reconciliations of an Application trigger a sync to all Applications which reference the reconciled Application
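
The quoted controller change can be sketched as a pure function over the sync status codes of the referenced Applications. This is an illustration of the proposal only: `evaluate_dependencies` and `next_sync_code` are hypothetical helpers, and reporting a dependency that cannot be found as `"Unknown"` is an assumption, since the proposal does not say which code to use in that case.

```python
from typing import Dict, List, Tuple

WAIT_FOR_DEPENDENCIES = "WaitForDependencies"  # SyncStatusCode proposed in the thread


def evaluate_dependencies(
    dependencies: List[str],
    sync_codes: Dict[str, str],
) -> Tuple[bool, Dict[str, str]]:
    """Return (may_sync, status.sync.dependencies) for one Application.

    sync_codes maps the name of every existing Application to its current
    sync status code; a missing key means the Application was not found.
    """
    # Assumption: a dependency that cannot be found is reported as "Unknown".
    dep_status = {name: sync_codes.get(name, "Unknown") for name in dependencies}
    may_sync = all(code == "Synced" for code in dep_status.values())
    return may_sync, dep_status


def next_sync_code(current: str, dependencies: List[str], sync_codes: Dict[str, str]) -> str:
    """Keep the current code when all dependencies are Synced, otherwise wait."""
    may_sync, _ = evaluate_dependencies(dependencies, sync_codes)
    return current if may_sync else WAIT_FOR_DEPENDENCIES


may_sync, dep_status = evaluate_dependencies(["linkerd", "ghost"], {"linkerd": "Synced"})
```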

This part makes sense to me at least. What's missing for me (since I'm not super familiar with the code) is the specifics on how a sync to an Application will trigger a sync to all downstream Applications. Since a user wouldn't be declaring that their Application has a dependency on the parent Application itself, (I presume) once a user declares it has a dependency on another Application (when said application is created or updated), we would also need to somehow update the parent Application to say that it, in turn, has a downstream dependency (perhaps by having another property on the ApplicationSpec, something like dependedBy or dependedOn?).

There may be already some way that we could achieve this without having to setup two-way bindings on an Application, or without having to loop through all applications (since that would likely lead to a bad time when iterating through a large amount of applications). Someone more knowledgeable would need to chime in here though.

  • values propagation

    • define new CR ApplicationOutput which is to be deployed in the helm chart or kustomization manifests:

      • spec:

        • outputs : map[string]PropertySource whereas key is the output name, and PropertySource contains

          • name of resource to read the value from
          • propertyPath of the output within the resource

      • status:

        • properties as raw JSON output (runtime.RawExtension)

    • extend the Application's spec.dependencies to be used as map[string]DependencySpec whereas key denotes the app name and DependencySpec contains

      • properties as []PropertyPropagation with

        • sourceName of the ApplicationOutput
        • targetName of the helm chart value or vars in kustomize

    • modify the Application controller to propagate the Application's spec.dependencies's properties to helm/kustomize

If I understand correctly, this part of the proposal with the values propagation is to extend Applications so that dependent Applications would be able to consume output values from the Application which it depends on. In my opinion this is a bit out of scope, and if we would want to have this, this should be punted for a later iteration. I'm personally not convinced that this is something we should have at all (others may disagree though!); IMO it should suffice for an Application to only depend on the sync status of its parent Application(s).

Tanemahuta commented 1 year ago

blakepettersson commented 1 year ago

There is a way to find applications that depend on the reconciled application: client.List can be used in conjunction with client.MatchingFields (see https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/client#MatchingFields)
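
The idea behind such a lookup is a reverse index from an application to the applications that depend on it, so that dependents can be found without two-way bindings and without scanning every Application. The sketch below shows only that idea with an in-memory dictionary; it does not use or imitate the controller-runtime API, and `build_dependents_index` and the app names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_dependents_index(depends_on: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each Application name to the set of Applications that depend on it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for app, deps in depends_on.items():
        for dep in deps:
            index[dep].add(app)
    return index


index = build_dependents_index({
    "app-b": ["linkerd"],
    "app-c": ["linkerd", "app-b"],
})
# Applications to trigger once "linkerd" has reconciled successfully.
to_resync = sorted(index.get("linkerd", set()))
```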

Makes sense to me at least. A follow-up question from that is what happens when a parent Application gets deleted, and/or if an Application specifies a dependency which does not exist. I suspect we might need another SyncStatusCode indicating that a dependency has an error.

Tanemahuta commented 1 year ago

Cool. Would it be bad if we would un-deploy the dependent Applications and set the state back to WaitingOnDependencies? - I mean this is the same state the Application is set to if a dependsOn Application is not present or synchronized.

boedy commented 1 year ago

I have had to create app of apps to break argocd deadlocks on crd and resource reconciliation occurring at the same time. argocd just gets stuck never resolving.

If this feature gets implemented (which I really hope it does 🚀 ), we should consider what should happen in case of (accidental / unintended) circular dependencies:

App A --(dependsOn)--> App B --(dependsOn)--> App A
or
App A --(dependsOn)--> App B --(dependsOn)--> App C --(dependsOn)--> App A

Also since nobody has yet mentioned it. FluxCD has a similar feature. Could be used to take some inspiration from.

shanproofpoint commented 1 year ago

in case this is useful to anyone running into resources getting stuck because crds not yet available. you have to put this on the resource:

metadata:
  name: dontcare
  annotations:
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

shanproofpoint commented 1 year ago

I have had to create app of apps to break argocd deadlocks on crd and resource reconciliation occurring at the same time. argocd just gets stuck never resolving.

If this feature gets implemented (which I really hope it does 🚀 ), we should consider what should happen in case of (accidental / unintended) circular dependencies:

App A --(dependsOn)--> App B --(dependsOn)--> App A
or
App A --(dependsOn)--> App B --(dependsOn)--> App C --(dependsOn)--> App A

Also since nobody has yet mentioned it. FluxCD has a similar feature. Could be used to take some inspiration from.

For sure this is necessary. I would imagine the implementation includes a hashset of the apps already visited and would ignore the additional dependency.

Tanemahuta commented 1 year ago

This is just a simple check that the dependency graph is cycle-free. Just walk the dependencies and save the path (types.NamespacedName). In case we visit an App which is already contained in the path, this is a circular dependency.
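
The walk described here could look roughly like the sketch below. It is an illustration under simplifying assumptions: applications are identified by plain strings rather than types.NamespacedName, and `find_cycle` is a hypothetical helper name.

```python
from typing import Dict, List, Optional


def find_cycle(depends_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a path (first app repeated at the end), or None."""
    done = set()  # apps whose whole dependency subtree is known to be cycle-free

    def walk(app: str, path: List[str]) -> Optional[List[str]]:
        if app in path:
            # Already on the current path: report the loop starting at that app.
            return path[path.index(app):] + [app]
        if app in done:
            return None
        for dep in depends_on.get(app, []):
            cycle = walk(dep, path + [app])
            if cycle:
                return cycle
        done.add(app)
        return None

    for app in depends_on:
        cycle = walk(app, [])
        if cycle:
            return cycle
    return None


two_app_cycle = find_cycle({"A": ["B"], "B": ["A"]})
three_app_cycle = find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]})
no_cycle = find_cycle({"A": ["B"], "B": []})
```

Returning the offending path, rather than silently ignoring the extra edge, would let the controller surface the cycle to the user in the application status.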

chary1112004 commented 1 year ago

Hi, here is our use case:

Expect:

Could anyone let us know when this request might be released?

Thanks!

jaxels10 commented 1 year ago

We need this for deploying certain applications before others, such as kyverno with kyverno policies, but also having Ceph fully reconciled before letting applications use its storage classes. This is the number one missing feature keeping us from using Argo and instead using Flux. If this was implemented I am sure we would make the switch.

sambonbonne commented 1 year ago

@jaxels10 maybe you already considered this option, but why not use sync waves if you "just" want to be sure of the apply order?

See the documentation for more information.

If your applications are centralized in one repository, with the app of apps pattern, you can use sync waves to ensure apply order.

fvogl commented 1 year ago

@sambonbonne unfortunately sync-waves don't work for app-of-apps in case of updates. The sync order is working for the initial deployment of the apps and also while deleting them (Argo takes them out in the descending order). For updates though the order is random and basically most of the changes are applied at the same time. I would love to see the sync-waves working.

purduemike commented 1 year ago

the solution here is to make sure all apps are truly independent and will retry themselves until all the definitions they rely on are in memory.

I tend to agree with @shanproofpoint. We should try to make sure apps are independent. My use-case is to ensure our DB schema changes are live before starting App B. This can easily be done in code. App B just needs to check the schema version in the DB before making its health check green. The problem with sync-waves between apps is, how should an app behave if its updates don't finish before the next sync? I feel like this can get really complex really quickly. So, shooting for app independence is key.
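
A check of that kind could be as small as the sketch below. Everything specific in it is an assumption made for illustration: the `schema_version` table, its `version` column, the `schema_ready` function, and the use of an in-memory SQLite database in place of the real one.

```python
import sqlite3


def schema_ready(conn: sqlite3.Connection, required_version: int) -> bool:
    """Report healthy only once the database schema is at least the required version."""
    try:
        row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    except sqlite3.OperationalError:
        return False  # table does not exist yet: migrations have not run
    return row[0] is not None and row[0] >= required_version


conn = sqlite3.connect(":memory:")
before = schema_ready(conn, required_version=2)   # no schema_version table yet

# Simulate the migration job of the other app having completed.
conn.execute("CREATE TABLE schema_version (version INTEGER)")
conn.execute("INSERT INTO schema_version VALUES (2)")
after = schema_ready(conn, required_version=2)
```

App B would call something like this from its readiness or health endpoint and stay unhealthy until the check passes.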

jannfis commented 1 year ago

I took another shot at implementing this. I deviated a little from the previous approach, but I think it's pretty usable already: https://github.com/argoproj/argo-cd/pull/15280