Open fabry00 opened 1 year ago
If I set the following in the Application YAML:
retry:
  limit: 1
  backoff:
    duration: 1s
    factor: 1
    maxDuration: 1m
Then ArgoCD retries 2 times and then fails. Not perfect, but better than waiting 10 minutes. But if I do not set the retry, the ArgoCD UI shows "Retry Disabled", so I think it's a bug, because it still retries 5 times.
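For readers who want to see where that retry block lives, here is a minimal sketch of a full Application manifest. The retry values are the ones quoted above; the application name, repository URL, path, and destination are hypothetical placeholders, and enabling automated sync is an assumption based on the autosync behaviour discussed in this thread.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app                 # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/repo.git   # hypothetical repository
    targetRevision: HEAD
    path: chart                     # hypothetical path to the Helm chart
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated: {}                   # assumption: the app uses automated sync
    retry:
      limit: 1                      # values taken from the comment above
      backoff:
        duration: 1s
        factor: 1
        maxDuration: 1m
```

Omitting the whole retry block from this manifest corresponds to the case where the UI shows "Retry Disabled".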
I'm having trouble running down the code, but to me it looks like an unchecked "Retry" option in the UI just means "use the default or app-configured retry config." It doesn't seem like the unchecked box means "don't retry."
> It doesn't seem like the unchecked box means "don't retry."
Not sure I understood, but from the UI you can clearly see "Retry Disabled" in my screenshot
Yes, I believe there is a discrepancy between the UI and the default behavior here.
The default is to retry 5 times, but the UI assumes it to be disabled when not configured. Potentially, the default has been changed from "Disabled if not specified" to "Retry 5 times if not specified". And it seems only to apply to autosync: https://github.com/argoproj/argo-cd/blob/28ef0961b34f098eb6b7631ecc65f0a3a42ff85f/controller/appcontroller.go#L1723-L1735
So, FWIW, this is a bug. For autosync, you can't seem to disable retries. If you don't set the retries, the default will be assumed; if you do set the retries, it will retry at least once (because 0 means unlimited). This is not solvable in an elegant way without breaking existing behaviour (people may rely on it), so what comes to my mind is to give the retry stanza an enabled (or disabled) field, so you could do something like:
retry:
  enabled: false
Describe the bug
I have an ArgoCD Application and from ArgoCD UI I see that the RETRY OPTIONS is "Retry disabled"
I have a Job with a pre-sync hook, and the backoff limit of the Job is 2.
If the Job fails, I see that ArgoCD ignores the retry options and keeps syncing (it deletes the Job and redeploys it) for 10 minutes.
I need ArgoCD to mark the sync as failed as soon as the Job fails, but it took ArgoCD more than 10 minutes to stop synchronizing the current commit, which was making the Job fail, and start synchronizing the latest commit with the fix.
In the image below, you can see that ArgoCD has already tried to sync the application 5 times even though retry is disabled, and there is a new commit waiting to be synced (this second commit should fix the Job issue).
Application:
To Reproduce
Create an application.yaml CR
Create a Job in your Helm chart with
Commit something that makes the Job fail. As soon as ArgoCD starts the sync, revert the previous commit in order to fix the issue. ArgoCD will perform multiple retries even though retry is disabled.
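The manifests for these steps are not included in the report. As a rough sketch only, a Job matching the description given earlier (a pre-sync hook with a backoff limit of 2) might look like the following. The Job name, image, and deliberately failing command are hypothetical and are not taken from the reporter's chart.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: presync-check               # hypothetical name
  annotations:
    argocd.argoproj.io/hook: PreSync   # run before the main sync phase
spec:
  backoffLimit: 2                   # Kubernetes retries failed pods up to 2 times
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: check
          image: busybox:1.36       # hypothetical image
          # Hypothetical command that always fails, to trigger the behaviour
          command: ["sh", "-c", "exit 1"]
```

Note that backoffLimit only controls pod retries inside a single Job run; the repeated delete-and-redeploy cycles described in this report come from ArgoCD retrying the sync itself.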
Expected behavior
Fail the sync as soon as the Job fails and pick up the next commit in the queue.
Version
Logs