argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Stuck applicationset progressive rollout #12202

Open billyshambrook opened 1 year ago

billyshambrook commented 1 year ago

Describe the bug

An ApplicationSet with progressive rollout enabled sometimes gets stuck between rollout steps.

This may be unrelated, but I have noticed that the ApplicationSet conditions continuously flip between the following two states. It looks like the controller overwrites the condition instead of appending to it:

...
  - lastTransitionTime: '2023-01-29T22:35:24Z'
    message: Successfully generated parameters for all Applications
    reason: ParametersGenerated
    status: 'True'
    type: ParametersGenerated
...
...
  - lastTransitionTime: '2023-01-29T22:34:44Z'
    message: ApplicationSet Rollout Rollout started
    reason: ErrorOccurred
    status: 'False'
    type: ParametersGenerated
...

After inspecting the code, this seems to be caused by the controller calling r.setApplicationSetStatusCondition here https://github.com/argoproj/argo-cd/blob/master/applicationset/controllers/applicationset_controller.go#L1154 with the parametersGenerated argument set to false. Within the same reconcile, the controller then (maybe) calls the function again here https://github.com/argoproj/argo-cd/blob/master/applicationset/controllers/applicationset_controller.go#L277 with parametersGenerated set to true, which overwrites the progressive rollout condition.
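The suspected overwrite can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `Condition`, `setCondition`, and `reconcileOnce` are hypothetical names invented for illustration and are not the controller's actual types or functions. The sketch assumes that conditions are upserted by type, so that a second write of the same type within one reconcile replaces the first. The field values are taken from the two snippets above.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for an ApplicationSet
// status condition. It is not the real Argo CD type.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// setCondition is a hypothetical upsert: it replaces an existing condition
// of the same Type, and appends only when no such condition exists.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// reconcileOnce models one reconcile pass that writes the same condition
// type twice. The second write wins, so the first is never observable
// once the pass has finished.
func reconcileOnce(conds []Condition) []Condition {
	// First write: the progressive rollout step (values from the issue).
	conds = setCondition(conds, Condition{
		Type:    "ParametersGenerated",
		Status:  "False",
		Reason:  "ErrorOccurred",
		Message: "ApplicationSet Rollout Rollout started",
	})
	// Second write later in the same pass: overwrites the entry above.
	conds = setCondition(conds, Condition{
		Type:    "ParametersGenerated",
		Status:  "True",
		Reason:  "ParametersGenerated",
		Message: "Successfully generated parameters for all Applications",
	})
	return conds
}

func main() {
	for _, c := range reconcileOnce(nil) {
		fmt.Printf("%s=%s (%s): %s\n", c.Type, c.Status, c.Reason, c.Message)
	}
}
```

If the real controller behaves like this model, only one of the two messages can be stored at any time, which would match the flipping described above.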

To Reproduce

Apply this applicationset.yaml. If the applicationset rollout works, try deleting the applicationset and re-applying it a few times.

Expected behavior

All applications roll out successfully.

Screenshots

Screenshot 2023-01-29 at 2 26 39 PM

Version

argocd: v2.6.0-rc5+e790028
  BuildDate: 2023-01-25T17:57:49Z
  GitCommit: e790028e5cf99d65d6896830fc4ca757c91ce0d5
  GitTreeState: clean
  GoVersion: go1.18.10
  Compiler: gc
  Platform: linux/amd64

thober35 commented 1 year ago

We also observed a similar issue. After debugging the code we found a possible culprit here: https://github.com/argoproj/argo-cd/blob/master/applicationset/controllers/applicationset_controller.go#L1023. The App Status is stuck in "pending" even though the sync was successful. Because operationPhaseString is "Succeeded", the status is never updated. A possible fix would be to check for (operationPhaseString == "Succeeded" && !appOutdated). I don't know whether this relates to your issue, as we did not check the ApplicationSetStatus.
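The proposed check can be written out as a small predicate to make its behaviour explicit. This is a sketch only: `shouldLeavePending` is a hypothetical helper name, and the assumption that this check decides whether a "pending" application gets its status updated is inferred from the comment above, not taken from the controller source. "Running" is used purely as an example of a phase other than "Succeeded".

```go
package main

import "fmt"

// shouldLeavePending is a hypothetical helper expressing the check proposed
// in the comment: a finished sync ("Succeeded") on an application that is
// not outdated is treated as grounds for updating the stuck status.
func shouldLeavePending(operationPhase string, appOutdated bool) bool {
	return operationPhase == "Succeeded" && !appOutdated
}

func main() {
	// Enumerate a few combinations to show when the check fires.
	for _, phase := range []string{"Succeeded", "Running"} {
		for _, outdated := range []bool{false, true} {
			fmt.Printf("phase=%s outdated=%t -> update=%t\n",
				phase, outdated, shouldLeavePending(phase, outdated))
		}
	}
}
```

Under these assumptions, only the combination of a succeeded operation and an up-to-date application triggers the update.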

@crenshaw-dev please add label appset/progressive-rollouts. Thanks.

riuvshyn commented 1 year ago

same happens on 2.7.1

mike-serchenia commented 8 months ago

Same on 2.8

bhutkovskyysos commented 8 months ago

any updates on this issue?

vitaly-dt commented 3 months ago

Hi - does anyone have any insights on this one?

grosenba commented 3 months ago

I can only say that I still have the problem with 2.10.4.

thomaspetit commented 3 months ago

Has anyone found a workaround for this one?

I notice this too:

  - lastTransitionTime: "2024-03-22T13:01:35Z"
    message: Successfully generated parameters for all Applications
    reason: ApplicationSetUpToDate
    status: "False"
    type: ErrorOccurred

Meanwhile, these errors pop up in the appset controller:

time="2024-03-22T13:07:17Z" level=error msg="unable to set application set status: Operation cannot be fulfilled on applicationsets.argoproj.io \"argocd\": the object has been modified; please apply your changes to the latest version and try again" applicationset=argocd/argocd

The latter seems unrelated to the initial issue logged here, but it is interesting to see that the progressive rollout also has issues when ArgoCD itself is managed by the progressive rollout.
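For context, the "object has been modified" message is the Kubernetes API server rejecting a write that was based on a stale copy of the object (optimistic concurrency via resourceVersion). The usual remedy is to re-read the object and retry, and client-go ships a `retry.RetryOnConflict` helper for that purpose. The sketch below models the pattern with the standard library only. `store`, `object`, and `updateStatusWithRetry` are hypothetical names for illustration, and nothing here claims to show how the appset controller actually handles the error.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's conflict response.
var errConflict = errors.New("the object has been modified")

// object is a toy resource carrying a resourceVersion.
type object struct {
	ResourceVersion int
	Status          string
}

// store is an in-memory stand-in for the API server.
type store struct{ current object }

// Get returns a copy of the latest stored object.
func (s *store) Get() object { return s.current }

// Update succeeds only if the caller's copy is based on the latest version.
func (s *store) Update(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// updateStatusWithRetry applies the status to obj and writes it. On a
// conflict it re-reads the latest object and tries again, up to
// maxAttempts times. It returns the number of attempts made.
func updateStatusWithRetry(s *store, obj object, status string, maxAttempts int) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		obj.Status = status
		err := s.Update(obj)
		if err == nil {
			return attempt, nil
		}
		if !errors.Is(err, errConflict) {
			return attempt, err
		}
		obj = s.Get() // refresh the stale copy before retrying
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, errConflict)
}

func main() {
	s := &store{current: object{ResourceVersion: 1}}
	stale := s.Get()

	// Another writer updates the object after our read, making our copy stale.
	_ = s.Update(object{ResourceVersion: 1, Status: "written by another client"})

	attempts, err := updateStatusWithRetry(s, stale, "Progressing", 3)
	fmt.Println(attempts, err, s.Get())
}
```

In this model the first write is rejected because the copy is stale, and the second succeeds after the refresh.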

Qwiko commented 3 months ago

I also have this issue on 2.10.4