argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

ArgoCD postsync job not starting with rollout suspended #19422

Open gbandresm opened 3 months ago

gbandresm commented 3 months ago

Hi, I'm working with Argo CD and Argo Rollouts, and we have a post-sync job that Argo CD does not execute. When I deploy a new version of the application and the rollout creates the new pods, Argo CD shows this message:

waiting for healthy state of argoproj.io/Rollout/core

The application is in the Suspended state: the rollout is paused while Argo CD is still trying to sync the application.


Logs

status:
  HPAReplicas: 2
  availableReplicas: 2
  blueGreen:
    activeSelector: 8556c667c
    previewSelector: 575844789d
  canary: {}
  conditions:

To Reproduce

Create a new rollout with some job configured as post-sync hook
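
For illustration, a post-sync hook of the kind described here might look like the sketch below. The Job name, image, and command are hypothetical placeholders and are not taken from the affected application; only the `argocd.argoproj.io/hook` and `argocd.argoproj.io/hook-delete-policy` annotations are standard Argo CD hook annotations.

```yaml
# Minimal sketch of a PostSync hook Job deployed alongside the Rollout.
# Name, image, and command are assumptions for illustration only.
apiVersion: batch/v1
kind: Job
metadata:
  name: core-postsync            # hypothetical name
  annotations:
    # Run this Job after the sync has applied all resources and they are healthy.
    argocd.argoproj.io/hook: PostSync
    # Delete the Job once it has completed successfully.
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: postsync
          image: busybox:1.36    # placeholder image
          command: ["sh", "-c", "echo post-sync step"]
```

Because a PostSync hook only runs once the synced resources report a healthy state, a Rollout that stays Suspended keeps this Job from starting.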

Expected behavior

The application should be in the Suspended state but still be synced, with the post-sync job executed.

Version

ArgoCD version: 2.8.20
Argo Rollouts version: 1.7.1

gbandresm commented 3 months ago

Hi, I did more testing and found that the problem lies in the rollout's Suspended status. While the rollout is in this state, the application is also Suspended and the sync waits for a healthy state. This behaviour may be correct, but it is incompatible with PostSync jobs. To work around this, I created a health customization that ignores the suspended rollout state:

resource.customizations: |
    argoproj.io/Rollout:
      health.lua: |        
        hs={ status = "Suspended" }
        if obj.status ~= nil then
          if obj.status.conditions ~= nil then
            for i, condition in ipairs(obj.status.conditions) do
              if condition.type == "Available" and condition.status ~= "True" then
                if condition.reason == "SomePodsNotReady" then
                  hs.status = "Progressing"
                else
                  hs.status = "Degraded"
                end
                hs.message = condition.message or condition.reason
              end
              if condition.type == "Available" and condition.status == "True" then
                hs.status = "Healthy"
                hs.message = "All instances are available"
              end
            end
          end
        end
        return hs
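
For readers applying this on a more recent Argo CD release, the same health check can, as far as I know, also be written with the per-resource key format `resource.customizations.health.<group>_<kind>` in the `argocd-cm` ConfigMap. The following is a sketch under that assumption; the `argocd` namespace is the default install namespace and may differ in your cluster, and the Lua logic is unchanged from the snippet above.

```yaml
# Sketch only: the same Rollout health override using the split-key format.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd              # assumption: default install namespace
  labels:
    app.kubernetes.io/part-of: argocd
data:
  resource.customizations.health.argoproj.io_Rollout: |
    -- Default when no "Available" condition is present.
    hs = { status = "Suspended" }
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for i, condition in ipairs(obj.status.conditions) do
          if condition.type == "Available" and condition.status ~= "True" then
            if condition.reason == "SomePodsNotReady" then
              hs.status = "Progressing"
            else
              hs.status = "Degraded"
            end
            hs.message = condition.message or condition.reason
          end
          -- An available Rollout is reported as Healthy even while paused.
          if condition.type == "Available" and condition.status == "True" then
            hs.status = "Healthy"
            hs.message = "All instances are available"
          end
        end
      end
    end
    return hs
```

Note that this override replaces the built-in Rollout health check entirely, so a paused rollout with available pods will show as Healthy rather than Suspended.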

andrii-korotkov-verkada commented 2 weeks ago

Argo CD 2.10 and below have reached EOL. Could you upgrade Argo CD, and also Argo Rollouts to 1.7.2+, and let us know whether the issue is still present?

andrii-korotkov-verkada commented 2 weeks ago

It may also be worth opening an issue in the Argo Rollouts repository.