Open carsonoid opened 4 years ago
I've submitted a PR with my proposed code change to implement this here: https://github.com/argoproj/argo-cd/pull/4273
The code there is basically a general-purpose implementation of what we are already doing in our installation.
@carsonoid we don't think this will achieve that much other than delay the eventual refresh that happens for all apps anyway (by default at 3 minute intervals). In other words, the skipping would only take effect for the annotated app, but within a few minutes all apps get refreshed anyway.
@jessesuen that's fair. We had to increase that interval quite a bit internally, but when we did, paired with the sync-skipping options I've outlined here, we saw a 10x improvement in the time to sync apps in our large installation. We can't be the only ones hitting this issue either. Large installations need a way to avoid excessive, unnecessary syncs.
We plan on improving repo server performance by parallelizing simultaneous queries against a single repo URL. Currently there is a mutex protecting a single repoURL (per repo server). I hope that with that improvement, annotations won't be necessary.
@carsonoid to work around your issue, have you tried bumping up the number of repo-server replicas to accommodate the simultaneous requests for a single mono-repo?
Sure did! We scaled up the number of repo servers and tweaked the app-controller as suggested in the HA docs. It had almost no impact on our performance.
The only thing that actually improved performance was running a fork of the api server which did something much like the proposed feature and dramatically reduced the number of pointless refreshes being done by the controller. The controller really seemed to be the bottleneck as it was basically doing app refreshes non-stop until we fixed the webhook.
:+1: for this feature request. We have the same problem with one repo for about 600 services in six k8s clusters and have noticed a lot of pointless refreshes of apps across all our k8s clusters when only one app changes in one cluster. This would be a HUGE benefit for us.
This is basically the same request as #3890
> This is basically the same request as #3890

Except I have a more in-depth proposal and actual code for the feature, tested and ready to go :)
Also 👍, we are using a single repo that defines the whole infrastructure and it'd be a really good perf improvement. As the deploy process updates the infra repo with the new app image, all applications will refresh and resync (if autosync is enabled) when the deployment changes, even if it was only one webapp.
Added one suggestion to PR https://github.com/argoproj/argo-cd/pull/4273#issuecomment-696440801. I also discussed it with @jessesuen offline. @carsonoid, let me know what you think please.
I can see #4699 was merged and released in v1.8, however I'm a little uncertain about the behavior. How does this affect the semantics of failed syncs? When the application source is filtered, is the commit-SHA still the last commit in the repo, rather than the last commit for the application? The documentation says:

> Automatic sync will not reattempt a sync if the previous sync attempt against the same commit-SHA and parameters had failed.
We run with `--app-resync=1800` as the docs say, and we have had great performance with these settings and a very large repo for all apps, combined with `argocd.argoproj.io/manifest-generate-paths` filtering. This means that failed syncs will retrigger either when the next git poll happens after any commit to the repo, or when a change is pushed to the target path in the repo.

Disclaimer: I am not on the Argo-CD team. This is all from my personal experience writing and using this one feature and Argo-CD a lot.
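For reference, the shipped filtering is driven by an annotation on the `Application` resource itself. A minimal sketch of what that can look like (the app name, repo URL, and paths here are illustrative, not from this thread):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app                       # illustrative name
  annotations:
    # Only changes under this path (relative to spec.source.path when it
    # starts with ".") should cause a webhook-driven refresh of this app.
    argocd.argoproj.io/manifest-generate-paths: .
spec:
  source:
    repoURL: https://example.com/org/mono-repo.git   # illustrative repo
    path: apps/my-app
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
```

The `--app-resync=1800` setting mentioned above is a separate flag on the application controller; the annotation only affects webhook-triggered refreshes, so the periodic resync still acts as a backstop.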
Summary
Webhooks for git changes should only trigger resync for apps when the app files have actually changed.
Motivation
Right now, every time the apiserver processes a webhook, it will trigger a resync for every app that uses that repo if the `ref` has changed since the last sync, regardless of whether or not any files that the app uses in that repo have changed.

For small use-cases this is fine, but we have a large installation with a single repo containing over 600 apps. That same repo also contains the app manifests themselves. What this means is that when we push changes for a single app to our repo, ~599 excessive syncs are triggered. This was slowing down our Argo performance significantly for no benefit.
We created a fork of the apiserver and wrote some custom logic in to skip app syncs when they were not needed and it improved our Argo performance at our scale by an order of magnitude.
One current workaround is to put all `Application` manifests in a dedicated repo, but that shouldn't be required, and it still has the same issue of excessive refreshes if you have a large number of `Application` manifests that are constantly changing (like we do).

Proposal
Most of the major webhook formats include lists of files that have changed. The apiserver should be able to intelligently trigger selective resyncs based on the list of changed files in the webhook.
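The core check the apiserver would need is a simple prefix match. A minimal sketch (not Argo-CD code; the function name and shape are hypothetical) of matching a webhook's changed-file list against an app's source path:

```python
import posixpath

def app_needs_refresh(app_path: str, changed_files: list[str]) -> bool:
    """Return True if any changed file falls under the app's source path.

    app_path: the Application's .spec.source.path (repo-relative).
    changed_files: repo-relative paths reported in the webhook payload.
    """
    # Normalize so "apps/foo" and "apps/foo/" compare the same way.
    prefix = posixpath.normpath(app_path)
    for changed in changed_files:
        changed = posixpath.normpath(changed)
        # Match the path itself or anything nested beneath it.
        if changed == prefix or changed.startswith(prefix + "/"):
            return True
    return False

changed = ["apps/foo/deployment.yaml", "README.md"]
print(app_needs_refresh("apps/foo", changed))  # True: a file under apps/foo changed
print(app_needs_refresh("apps/bar", changed))  # False: refresh can be skipped
```

Note that a prefix check like this cannot see resources a kustomization pulls in via `../` traversal, which is exactly why the check has to be opt-in.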
However: because `kustomize` can do tricky things with directory traversal, we cannot simply check for changed files under the `.spec.source.path` by default, because it's possible to not trigger updates properly for `kustomize`-based apps.

I propose two new, optional annotations on `argoproj.io/Application` resources:

- `argocd.argoproj.io/refresh-on-path-updates-only: "true|false"` — if `true`, the apiserver will use `.spec.source.path` to see if any of the files under that path have been changed according to the webhook. If so, it will trigger the sync like always; if not, it will skip the refresh for the app.
- `argocd.argoproj.io/refresh-prefix: "PATHSTRING"`
This means that all existing installations are unchanged, and anyone not doing extensive `kustomize` backwards directory traversal (`../../../file`) can simply set `argocd.argoproj.io/refresh-on-path-updates-only: "true"` on all their apps and get an instant performance boost from the reduced refreshes!

Alternatively, even complicated `kustomize`-based installations could use `argocd.argoproj.io/refresh-prefix` to get the same benefit, albeit with more work.
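As proposed, usage would look something like the following. These annotations come from this proposal and its PR, not a shipped API, and the app names and paths are illustrative; the `refresh-prefix` semantics shown (matching changed files against the given prefix instead of `.spec.source.path`) are inferred from the proposal text:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: simple-app                 # illustrative: no backwards traversal
  annotations:
    # Refresh only when the webhook reports changed files under spec.source.path.
    argocd.argoproj.io/refresh-on-path-updates-only: "true"
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kustomize-app              # illustrative: kustomization reaches outside its path
  annotations:
    # Match webhook-reported changed files against this prefix instead.
    argocd.argoproj.io/refresh-prefix: "apps/kustomize-app"
```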