actions / actions-runner-controller

Kubernetes controller for GitHub Actions self-hosted runners
Apache License 2.0

ARC Finalizers Cause a deadlock when uninstalling AutoscalingRunnerSet in ArgoCD #3440

Open rteeling-evernorth opened 2 months ago

rteeling-evernorth commented 2 months ago

Checks

Controller Version

0.9.0

Deployment Method

ArgoCD

To Reproduce

1. Install the ARC AutoscalingRunnerSet as an ArgoCD app
2. Uninstall ArgoCD app

Describe the bug

The ArgoCD app cannot be deleted: Argo tries to delete resources that the ARC controller normally deletes itself, and the actions.github.com/cleanup-protection finalizer blocks their deletion. Affected resources include:

Describe the expected behavior

The chart should uninstall cleanly when I delete the ArgoCD app.

Additional Context

I suspect this can be resolved with Helm/Argo annotations on the affected resources. I will test with a fork of the Helm chart.

Controller Logs

My employer's open source contribution policy forbids me from creating public Gists. I will provide redacted logs upon request via zip/tarball.

Runner Pod Logs

My employer's open source contribution policy forbids me from creating public Gists. I will provide redacted logs upon request via zip/tarball.

rteeling-evernorth commented 2 months ago

I was able to fix the issue using the ArgoCD annotations argocd.argoproj.io/sync-options: Delete=false and argocd.argoproj.io/sync-wave: "1". I'll open a PR.
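
For readers who want to try the same workaround in their own copy of the chart, the two annotations named above would sit in a resource's metadata roughly as follows. This is only a sketch: the thread does not say which resources the PR annotates, so the kind and name here are hypothetical placeholders.

```yaml
# Sketch only: the thread does not list which chart resources get these annotations.
# The kind and name below are hypothetical placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: example-runner-set-resource   # hypothetical name
  annotations:
    # Ask ArgoCD to leave this resource in place when the Application is deleted
    argocd.argoproj.io/sync-options: Delete=false
    # Sync wave; quoted because annotation values must be strings
    argocd.argoproj.io/sync-wave: "1"
```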

rteeling-evernorth commented 2 months ago

@nikola-jokic could I trouble you for your two cents on this?

ahatzz11 commented 2 months ago

@rteeling-evernorth Thanks for making an issue for this. I've run into it on nearly every upgrade: I end up having to turn off syncing, manually delete the finalizers, and turn syncing back on, after which it sorts itself out. I have also noticed that even when a new controller comes up (say 0.9.1 when upgrading from 0.9.0), it doesn't seem to properly handle the finalizers on the scale-set resources that are still on the old version.
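
To make the deadlock concrete, a resource stuck in this state would look roughly like the sketch below. The finalizer name comes from this issue; the kind, name, namespace, and timestamp are hypothetical and only illustrate the shape of the metadata.

```yaml
# Illustrative state of a resource stuck in deletion (all values except the
# finalizer name are hypothetical).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: example-runner-set-manager            # hypothetical name
  namespace: arc-runners                      # hypothetical namespace
  deletionTimestamp: "2024-01-01T00:00:00Z"   # set by the API server once deletion is requested
  finalizers:
    # Finalizer named in this issue; the object cannot be removed until this entry is cleared
    - actions.github.com/cleanup-protection
```

Kubernetes keeps an object with a deletionTimestamp around until its finalizers list is empty, which is why removing the entry by hand lets the deletion finish.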

I do use sync-waves on the top level resources to ensure the gha-runner-scale-set-controller syncs first and the gha-runner-scale-set-figure-linux-standard syncs second:

metadata:
  name: gha-runner-scale-set-controller
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1" # before scale-set

---

metadata:
  name: gha-runner-scale-set-figure-linux-standard
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "2" # after controller

Looking at your PR, you set argocd.argoproj.io/sync-wave: "1". Would that conflict in my case, with my controller having the same sync wave? Would it make sense to go negative on the sync-wave for the autoscalingrunnerset to always ensure that it runs first?

rteeling-evernorth commented 2 months ago

@ahatzz11 Had the same issue with having to manually resolve the finalizer deadlock. I've installed the controller in an entirely separate ArgoCD app, since the cardinality of the scale set controller to scale sets is 1:N. I assume GitHub is distributing the controller and runner set charts separately because of this. You should separate them into separate ArgoCD apps for the sake of ease of maintenance if nothing else.

Argo Sync waves are only relevant within the context of an Argo App, so as long as the two charts are managed by separate apps, you won't see any conflict. Once this is done you just uninstall any app for the scale set, then uninstall the controller and you won't have any deadlock issues.
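
The split described above could be sketched as two independent ArgoCD Applications, one per chart. The repoURL, project, destination namespaces, and the scale set's name are assumptions for illustration and should be replaced with your own values.

```yaml
# Sketch of one ArgoCD Application per chart.
# repoURL, project, namespaces, and the scale set name are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gha-runner-scale-set-controller
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ghcr.io/actions/actions-runner-controller-charts   # assumed OCI chart location
    chart: gha-runner-scale-set-controller
    targetRevision: 0.9.0
  destination:
    server: https://kubernetes.default.svc
    namespace: arc-systems            # hypothetical namespace
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gha-runner-scale-set-example  # hypothetical name; one Application per scale set
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ghcr.io/actions/actions-runner-controller-charts   # assumed OCI chart location
    chart: gha-runner-scale-set
    targetRevision: 0.9.0
  destination:
    server: https://kubernetes.default.svc
    namespace: arc-runners            # hypothetical namespace
```

With this layout, the uninstall order is to delete every scale set Application first and the controller Application last.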

Hope that helps!

ahatzz11 commented 2 months ago

@rteeling-evernorth Cool thanks for the explanation - I do have each chart managed as separate apps so we should be good to go there!

rteeling-evernorth commented 2 months ago

@nikola-jokic

What are your thoughts on this issue and corresponding PR?

ahatzz11 commented 2 months ago

@nikola-jokic @Link- Is it possible to get this issue/PR looked at for the next release? It would be awesome to have upgrades massively simplified for argocd users.

Link- commented 2 months ago

Thanks for adding the appropriate labels. It's on our radar now.

rteeling-evernorth commented 1 month ago

@ahatzz11 did my change work for you? I'd prefer to not cause an "It works on my machine" sort of problem.

rteeling-evernorth commented 1 month ago

@Link- @nikola-jokic Just curious: what's going on with this? The associated PR has been sitting in the hopper for a while now. Is there anything I can do, within reason, to help get this released?

rteeling-evernorth commented 2 weeks ago

@nikola-jokic @Link- @rentziass

This issue has been sitting stale for a few months now. Would it be possible to get the associated PR on the next release?

nikola-jokic commented 2 weeks ago

Hey everyone,

After discussing this issue with the team, we decided not to apply annotations at this time. We are aware that many of you are using ArgoCD to manage ARC and scale sets, but we don't have capacity to support it at this time. We would recommend that you maintain your own chart if it requires such modifications until we officially support it. Thank you @rteeling-evernorth, for raising the PR to fix it, and thank you for describing this problem so thoroughly! It will help other people apply changes to their own charts.

AlexandreODelisle commented 1 week ago

@nikola-jokic, @rteeling-evernorth,

Since there is no intention to provide capacity for ArgoCD support, could one possibility be to support annotations on all Kubernetes objects managed by the chart?

In the end, if we are able to set the annotations everywhere, it could be handled in the values, without having to fork the chart.

SimonWoidig commented 1 week ago

@nikola-jokic, @rteeling-evernorth,

Since there is no intention to provide capacity for ArgoCD support, could one possibility be to support annotations on all Kubernetes objects managed by the chart?

In the end, if we are able to set the annotations everywhere, it could be handled in the values, without having to fork the chart.

This is the usual solution: giving the choice of annotations to the chart user. In my opinion, it is the best solution for this kind of problem. :+1:
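
As a rough illustration of that proposal, a values-driven approach might look like the fragment below. The extraAnnotations key is purely hypothetical; it is not an option of the chart discussed in this thread and only shows the shape such a setting could take.

```yaml
# HYPOTHETICAL values.yaml fragment: "extraAnnotations" is not a chart option
# from this thread. It only illustrates the proposal.
extraAnnotations:                                   # hypothetical key
  argocd.argoproj.io/sync-options: Delete=false
  argocd.argoproj.io/sync-wave: "1"
```

The chart templates would then have to copy these entries into the metadata.annotations of every object they render, which is what would remove the need for a fork.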