argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
17.54k stars 5.35k forks source link

Check if `source.path` is changed before regenerating the manifests on push events #3890

Open smitthakkar96 opened 4 years ago

smitthakkar96 commented 4 years ago

Summary

We need to check the diff between the HEAD and HEAD~1 to see if the source path is changed, only if it is changed we generate manifests otherwise skip it. We need to change the logic in the repo webhook handler.

Motivation

This is useful when you are putting all the application configs in same repository. Re-generating manifests for all applications might be too costly. In our case we use helmfiles and whenever we get a push it helmfile template which under the hood fetches the repo, decrypts the secrets and generates manifests. This becomes a problem if you have lot of applications in your cluster (Takes around 6 seconds to run). A workaround for time being to increase number of repo servers but throwing resources at a problem is not the best solution.

smitthakkar96 commented 4 years ago

I can work on this if community agrees that we should solve this problem

jannfis commented 4 years ago

Hi @smitthakkar96 , I can see the problem when you're sourcing all of your applications from a large mono repo (which brings lots of other challenges with it, too). But I'm unsure how your proposal could be implemented in a correct way.

What exact change have you thought of and where does the webhook handler comes into play? To my knowledge, the webhook handler triggers a refresh request for all applications that are sourced from the repository the webhook is triggered from. The application controller then performs a reconciliation with a refresh of the manifests from the repo server to see if anything has changed.

We should keep in mind that once the web hook is triggered on a push to the repository, the pushed changes could be spread over multiple commits (and also over multiple apps), so I think it's not sufficient to compare HEAD to HEAD~1 (unless I'm understanding HEAD~1 in a wrong way). How would you decide that a refresh for a given app is needed? Also, what if the change was made outside of source.path, i.e. when you use Helm value files from other paths in your repository (I think this is a very common pattern).

I'm happy to discuss this further, but we should take care not to break existing patterns.

jessesuen commented 4 years ago

Changes to source path alone is not sufficient to know if the final manifests are affected. For example, kustomizations which refer to bases outside of the path, will affect the resulting manifests. But with this feature, it would skip the re-rendering of manifests because it doesn't think it changed.

I agree that increasing repo-servers is a crude way of supporting higher throughput, but I think we can help address the throughput issue by supporting multiple clones of a repository within a single repo server (though I admit is just another form of throwing more resources at the problem).

smitthakkar96 commented 4 years ago

Can we add another property where you can provide the paths to watch with wildcards and argocd will respect that variable. We will keep default behavior only if paths to watch is specified we will change the behavior

secondly the request body of the push event has below attrs

  "before": "95790bf891e76fee5e1747ab589903a6a1f80f22",
  "after": "da1560886d4f094c3e6c9ef40349f7d38b5d27d7",

which we can use to get the diff and see what paths changed.

What do you think?

smitthakkar96 commented 4 years ago

@jessesuen @jannfis ping

anarcher commented 4 years ago

IMHO, Changes in source path (or paths) can be useful that many apps in one branch. Sometimes, some changes which do not belong to an app are displayed (synced) in the app. It's no issue for sync/deploy, but this change revision info is not useful in history/rollback (for seeing/tracking change information of the change). I think the source.path has more paths ( e.g. mainPath, trackingPaths).

ArgoCD repo-server clones git repo per app? or the repo-server have a queue for cloning? If the repo-server has a cloning queue, repo-server do clone once, apps can use this cloned?

smitthakkar96 commented 4 years ago

ping! guys it's been a month can we have a quick chat regarding this and come up with a solution. Would really appreciate some help

jannfis commented 4 years ago

Hi @smitthakkar96 - sorry for the late reply, we have not forgotten about this issue. Yes, performance for very large mono repos is bad currently, but I just wanted to let you know that fixing this is with a high priority on the roadmap

smitthakkar96 commented 4 years ago

I was just saying take out sometime let’s have a chat on the approach, I wanna help fixing this as we need this urgently.

On Wed, 19 Aug 2020 at 13:05 jannfis notifications@github.com wrote:

Hi @smitthakkar96 https://github.com/smitthakkar96 - sorry for the late reply, we have not forgotten about this issue. Yes, performance for very large mono repos is bad currently, but I just wanted to let you know that fixing this is with a high priority on the roadmap https://argoproj.github.io/argo-cd/roadmap/#performance

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argoproj/argo-cd/issues/3890#issuecomment-675992145, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACN3OKV5PW4FNC4BP7HH2QTSBOIWFANCNFSM4OO7ZWZA .