Open ciiay opened 1 year ago
Full conversation about this issue: https://cloud-native.slack.com/archives/C01TSERG0KZ/p1690321766622059?thread_ts=1664885597.178089&cid=C01TSERG0KZ
Another symptom, most likely with the same root cause: after enabling progressive sync on an instance (2.8.0-rc5) with only 5 ApplicationSets, our ApplicationSet controller CPU went through the roof, and I could see the log message "Application <APP> is already synced and healthy, updating its ApplicationSet status to Healthy" 2700x per hour for every Application.
@wmgroot any guesses? :-)
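To put a number on how often that message appears per Application, a small log counter can help. This is a minimal sketch, assuming the controller logs are piped in on stdin and that the message text matches the one quoted above; the function name `countByApp` and the surrounding log layout (timestamps, levels, quoting of the app name) are assumptions, not taken from Argo CD.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

// alreadyHealthy matches the message quoted in this thread. Anything before
// or after it on the line (timestamp, level, fields) is ignored; the exact
// log layout is an assumption.
var alreadyHealthy = regexp.MustCompile(`Application (\S+) is already synced and healthy`)

// countByApp (hypothetical helper) returns how many times the message
// appears for each Application name.
func countByApp(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		if m := alreadyHealthy.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Read all log lines from stdin.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}

	// Print per-Application counts in a stable order.
	counts := countByApp(lines)
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s\t%d\n", name, counts[name])
	}
}
```

Feeding it one hour of controller logs gives a per-Application count that can be compared before and after toggling progressive syncs.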
This is very similar to what we are facing in https://github.com/argoproj/argo-cd/issues/12878 @crenshaw-dev The fix https://github.com/argoproj/argo-cd/issues/12878#issuecomment-1642257603 did not help with the issue.
Yeah, this isn't a normalization issue... My very rough guess is that some field in here is churning: https://github.com/argoproj/argo-cd/blob/c721592d210383dadcf0bf0dfcfce9c7a1794162/applicationset/controllers/applicationset_controller.go#L1408-L1417
Maybe the app health is flapping, or maybe the ApplicationSet controller is erroneously re-triggering sync operations, bumping the status.operationState.startedAt value each time.
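One way to test the "some field is churning" guess is to capture two snapshots of the same Application and list which paths changed. The sketch below is illustrative only: `changedPaths` is a hypothetical helper, the two JSON snapshots are made-up examples, and it is not code from the controller.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"sort"
)

// changedPaths (hypothetical helper) returns the dotted paths whose values
// differ between two decoded JSON documents. Nested objects are walked
// recursively; any other value (string, number, list) is compared as a whole.
func changedPaths(prefix string, a, b any) []string {
	am, aIsMap := a.(map[string]any)
	bm, bIsMap := b.(map[string]any)
	if !aIsMap || !bIsMap {
		if reflect.DeepEqual(a, b) {
			return nil
		}
		return []string{prefix}
	}
	// Union of keys so that added and removed fields are reported too.
	keys := make(map[string]struct{})
	for k := range am {
		keys[k] = struct{}{}
	}
	for k := range bm {
		keys[k] = struct{}{}
	}
	var out []string
	for k := range keys {
		path := k
		if prefix != "" {
			path = prefix + "." + k
		}
		out = append(out, changedPaths(path, am[k], bm[k])...)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Made-up snapshots of one Application taken a few seconds apart.
	first := `{"status":{"health":{"status":"Healthy"},"operationState":{"startedAt":"2023-07-25T10:00:00Z"}}}`
	second := `{"status":{"health":{"status":"Healthy"},"operationState":{"startedAt":"2023-07-25T10:00:05Z"}}}`

	var a, b any
	if err := json.Unmarshal([]byte(first), &a); err != nil {
		panic(err)
	}
	if err := json.Unmarshal([]byte(second), &b); err != nil {
		panic(err)
	}
	for _, p := range changedPaths("", a, b) {
		fmt.Println(p)
	}
}
```

If the same path shows up in every pair of snapshots while nothing is being deployed, that field is a good candidate for what keeps re-triggering the reconcile.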
Also confirming all of @ciiay's observations. Even with progressiveSyncs disabled, git is hit massively by the ApplicationSet controller, though not as massively as with it enabled.
Logs we observe are "received update event from owning an application", the requeue, and the unknown Application.
Last time I looked into this I noticed one of the status fields was flipping constantly, it seemed like two different status entries were fighting over the same key. I did look into how the status field was being set, but I didn’t see a clear issue from the progressive sync code.
two different status entries were fighting over the same key
Can you clarify what "status entries" means here?
Even with progressiveSyncs disabled git is hit massively by applicationSet controller
@stafot I think you're facing a different issue. Are you running a version that includes this fix?
Yes we are observing this behaviour after updating to the latest helm chart of Argo CD which afaiu contains the above-mentioned fix.
Gotcha. So separate issue, likely unrelated to progressive syncs.
@crenshaw-dev We are observing exactly the same behaviour that @ciiay describes in this ticket. Our git is getting hammered by the ApplicationSet controller. With progressive syncs enabled the hammering increases linearly; with them disabled it stabilizes, but at a much higher level than our reconciliation activity baseline before 2.6.3. When we move to any version more recent than 2.6.2 we experience this hammering effect. So it may or may not be related to progressive syncs, but it is certainly related to the ApplicationSet controller from version 2.6.3 onward.
I'm kind of lost between https://github.com/argoproj/argo-cd/issues/12878 and this issue.
We are using appsets (without progressive sync), and after upgrading from v2.7.3 to 2.7.10 we are seeing the following spike:
To start narrowing down the issue(s), I think we're going to need full details, i.e. ApplicationSet specs and logs.
It's also possible that the spike is due to an unrelated issue, since the application controller also triggers checkouts.
@stafot #12612 seems like the most likely suspect in your case. Unless you're using multi-source apps, in which case I noticed another possibly suspicious commit.
@crenshaw-dev We’re using multi-source apps
Then #12379 might be involved.
To make progress, I think we have to treat these all as separate issues and open new, fully-described issues for each. If things turn out to be related, we can consolidate. But the symptom "lots of git requests" can have a lot of different possible causes.
OK, let's move these comments to https://github.com/argoproj/argo-cd/issues/12878 for the case @andrleite and I are describing.
two different status entries were fighting over the same key
Can you clarify what "status entries" means here?
Matt might be talking about status.applicationStatus: https://github.com/argoproj/argo-cd/issues/15297/. Though, this is a bug where the status flips constantly because the appset isn't a progressive sync type and it ends up getting processed and unprocessed by progressive sync logic. It does put a lot of load on the Argo system.
Before and after turning Off Progressive sync for an Argo instance with no appsets opted into Progressive Sync
This seems to indicate that the Progressive Sync logic is causing a large processing load on all AppSets, including those not opted into it.
The PR to fix this is https://github.com/argoproj/argo-cd/pull/15299
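The status flipping described above happens when progressive sync logic runs against AppSets that never opted in. A guard along the following lines would avoid that. This is a rough sketch with simplified stand-in types and a hypothetical function name, `usesProgressiveSync`; it is not the code from the PR, and the real ApplicationSet types in Argo CD carry many more fields.

```go
package main

import "fmt"

// Strategy and AppSetSpec are simplified stand-ins for the ApplicationSet
// API types, reduced to the fields this sketch needs.
type Strategy struct {
	Type string // e.g. "RollingSync"
}

type AppSetSpec struct {
	Strategy *Strategy // nil when the AppSet has not selected a strategy
}

// usesProgressiveSync is a hypothetical guard: progressive sync bookkeeping
// (such as writing status.applicationStatus) runs only when the feature flag
// is on AND this particular AppSet selected the RollingSync strategy.
func usesProgressiveSync(flagEnabled bool, spec AppSetSpec) bool {
	return flagEnabled && spec.Strategy != nil && spec.Strategy.Type == "RollingSync"
}

func main() {
	plain := AppSetSpec{} // no strategy: must be skipped even with the flag on
	rolling := AppSetSpec{Strategy: &Strategy{Type: "RollingSync"}}

	fmt.Println("flag on, no strategy:", usesProgressiveSync(true, plain))
	fmt.Println("flag on, RollingSync:", usesProgressiveSync(true, rolling))
	fmt.Println("flag off, RollingSync:", usesProgressiveSync(false, rolling))
}
```

The point of checking both conditions is that enabling the flag on the controller should not change how AppSets without a rolling strategy are reconciled.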
Checklist:
argocd version
Describe the bug: After applying ApplicationSet manifests with the enable-progressive-syncs flag set, the ApplicationSet controller sends git fetch requests constantly.
To Reproduce: In our case, the customer used the default openshift-gitops argocd instance to deploy an ApplicationSet, and the argocd manifest has the enable-progressive-syncs flag set.
Expected behavior: The ApplicationSet controller should only reconcile every 3 minutes, since the default requeue time is 3 minutes. Instead, it is constantly calling GitHub.
Version: Observed this issue on both v2.6.7 and v2.7.6.
Logs