argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Controller pods must be deleted and restarted to pick up new retention policy #11194

Open bwmetcalf opened 1 year ago

bwmetcalf commented 1 year ago

What happened/what you expected to happen?

Using

% helm list -n argo
NAME            NAMESPACE   REVISION    UPDATED                                 STATUS      CHART                   APP VERSION
argo-workflows  argo        14          2023-06-09 15:48:03.550419949 +0000 UTC deployed    argo-workflows-0.28.2   v3.4.8

After changing the Helm values and redeploying, the new retention policy settings do not take effect until the controller pods are deleted. I would expect the controller to pick up the new ConfigMap configuration automatically. Perhaps it does, but not in a timely manner?
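
To rule out a stale ConfigMap before blaming the controller, it may help to confirm that the redeploy actually wrote the new settings. The following is a minimal sketch, not a verified procedure: the namespace `argo` comes from the `helm list` output above, while the ConfigMap name `argo-workflows-workflow-controller-configmap` and the helper name `show_retention_policy` are assumptions that should be adjusted to match the cluster.

```shell
# Assumed names: adjust to match your release (see `kubectl -n argo get configmap`).
NAMESPACE="${NAMESPACE:-argo}"
CONFIGMAP="${CONFIGMAP:-argo-workflows-workflow-controller-configmap}"

# Hypothetical helper: print the retentionPolicy block currently stored
# in the controller ConfigMap, plus the four lines that follow it.
show_retention_policy() {
  kubectl -n "$NAMESPACE" get configmap "$CONFIGMAP" -o yaml \
    | grep -A 4 'retentionPolicy'
}
```

If `show_retention_policy` prints the new values while `argo list` still reflects the old behavior, the ConfigMap is current and the running controller is the stale part.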

Version

3.4.8

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

We can see this from the output of `argo list`.  After updating the retention policy, the behavior remains the same until the controller pods are cycled.
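
For reference, cycling the controller pods can be sketched as a rolling restart rather than a manual pod deletion. This is only an illustration of the workaround described here: the Deployment name `argo-workflows-workflow-controller` and the helper name `restart_controller` are assumptions, and the 120-second timeout is arbitrary.

```shell
# Assumed names: adjust to match your release (see `kubectl -n argo get deployments`).
NAMESPACE="${NAMESPACE:-argo}"
DEPLOYMENT="${DEPLOYMENT:-argo-workflows-workflow-controller}"

# Hypothetical helper: trigger a rolling restart of the controller and
# wait for the new pods to become ready.
restart_controller() {
  kubectl -n "$NAMESPACE" rollout restart "deployment/$DEPLOYMENT"
  kubectl -n "$NAMESPACE" rollout status "deployment/$DEPLOYMENT" --timeout=120s
}
```

Running `restart_controller` after each `helm upgrade` has the same effect as deleting the pods by hand, with the Deployment managing the replacement.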

Logs from the workflow controller

n/a

Logs from your workflow's wait container

n/a

bwmetcalf commented 1 year ago

I can confirm that the controller does not pick up new settings after at least 2 hours.

sarabala1979 commented 1 year ago

Fixed in https://github.com/argoproj/argo-workflows/pull/10218

sarabala1979 commented 1 year ago

Can you try with the latest version?

bwmetcalf commented 1 year ago

Thanks @sarabala1979! We'll update and report back. It looks like the code changes will indeed fix the issue.

bwmetcalf commented 1 year ago

@sarabala1979 Is there a release planned soon so we can move this fix to our production environments? I haven't tried latest yet, as we prefer to use tagged releases. Just getting back from vacation and following up. Thanks!

terrytangyuan commented 1 year ago

It should be available in v3.4.8

bwmetcalf commented 1 year ago

3.4.8 is the version we are using, and it is the one that exhibited the problem and led me to create this issue. I see https://github.com/argoproj/argo-workflows/pull/10218 was merged a while back, but it does not seem to have fixed our issue.

bwmetcalf commented 1 year ago

It looks like some refactoring has taken place since the above PR: the code now lives at https://github.com/argoproj/argo-workflows/blob/dc56332e6a4de71dd0ec0af9bd8a2b9686e550b4/workflow/controller/controller.go#L406C4-L406C4 and is a bit different from what the original PR changed.

tooptoop4 commented 1 week ago

I think this can be closed; stakater/reloader handles this.
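
As a rough sketch of that approach, assuming stakater/reloader is already installed in the cluster, the controller Deployment can be annotated so that Reloader restarts it whenever a ConfigMap or Secret it references changes. The Deployment name `argo-workflows-workflow-controller` and the helper name `enable_reloader` are assumptions; `reloader.stakater.com/auto` is Reloader's documented opt-in annotation.

```shell
# Assumed names: adjust to match your release (see `kubectl -n argo get deployments`).
NAMESPACE="${NAMESPACE:-argo}"
DEPLOYMENT="${DEPLOYMENT:-argo-workflows-workflow-controller}"

# Hypothetical helper: opt the controller Deployment in to automatic
# restarts by Reloader when its referenced ConfigMaps/Secrets change.
enable_reloader() {
  kubectl -n "$NAMESPACE" annotate deployment "$DEPLOYMENT" \
    reloader.stakater.com/auto="true" --overwrite
}
```

Setting the same annotation through the Helm chart's values, if the chart exposes Deployment annotations, is likely the more durable option, since it keeps the setting under version control with the rest of the release.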