Open tooptoop4 opened 1 month ago
This issue has been automatically marked as stale because it has not had recent activity and needs more information. It will be closed if no further activity occurs.
maybe similar to https://github.com/argoproj/argo-workflows/issues/13780
Pre-requisites
:latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
What happened? What did you expect to happen?
my wf is a dag and has snippet like:
There have been 100s of successful runs of this workflow but only 1 run where it failed (my own code inside tried to parse the section after the last hyphen as an int), because the parameter that went into the pod did not have the {{retries}} value substituted. Strangely, the UI shows it as replaced with 0 for retries, but the pod logs below show that it wasn't.
controller logs:
my pod logs show that the {{retries}} was not properly replaced with 0:
time=\"2024-10-21T20:02:08.957Z\" level=info msg=\"Executor initialized\" deadline=\"2024-10-redact 09:00:45 +0000 UTC\" includeScriptOutput=false namespace=redact podName=redact template=\"{\\\"name\\\":\\\"redact\\\",\\\"inputs\\\":{\\\"parameters\\\":[{\\\"name\\\":\\\"job_name\\\",\\\"value\\\":\\\"redactwf-redact-{{retries}}\\\"}
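To make the failure mode concrete, here is a minimal sketch of the kind of parsing described above, with a guard that reports an unresolved template tag explicitly. The real code inside the container is not shown in this issue, so the function name parse_retry_suffix and the regular expression are assumptions for illustration only.

```python
import re

# Hypothetical helper: the issue does not show the real parsing code.
# Matches any leftover "{{...}}" template tag in a parameter value.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}")


def parse_retry_suffix(job_name: str) -> int:
    """Return the integer after the last hyphen of job_name.

    Raises ValueError with an explicit message when the value still
    contains an unsubstituted template tag such as {{retries}}.
    """
    if PLACEHOLDER.search(job_name):
        raise ValueError(f"unresolved template tag in parameter: {job_name!r}")
    # rpartition splits on the LAST hyphen; suffix is everything after it.
    _, _, suffix = job_name.rpartition("-")
    return int(suffix)
```

With the value from the pod log, parse_retry_suffix("redactwf-redact-{{retries}}") raises the explicit error, whereas a bare int() on the suffix would only report an invalid literal without pointing at the substitution problem.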
my resourcequota limits were not hit but the controller was busy with 100s of "cleaning up pod" from a different workflow
may be related? https://github.com/argoproj/argo-workflows/blob/v3.4.11/util/template/expression_template.go#L35-L40
seems like allowUnresolved is passed in as true at https://github.com/argoproj/argo-workflows/blame/v3.4.11/workflow/common/util.go#L286
also https://github.com/argoproj/argo-workflows/issues/13123, but i don't use podspecpatch
it was able to replace {{workflow.name}} but not {{retries}}
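The sketch below is one way to picture how a partial substitution like this could come about if unresolved tags are tolerated. It is a simplified model, not the code in expression_template.go: the function substitute, its allow_unresolved flag, and the scope dictionary are hypothetical stand-ins, and whether this is what actually happened in the failing run is unconfirmed.

```python
import re

# Simplified model of "{{name}}" tag substitution. NOT Argo's implementation;
# all names here are hypothetical and exist only for illustration.
TAG = re.compile(r"\{\{\s*([\w.\-]+)\s*\}\}")


def substitute(text: str, scope: dict, allow_unresolved: bool) -> str:
    """Replace each {{key}} in text with scope[key].

    When a key is missing from scope:
      - allow_unresolved=True  leaves the tag in place, silently
      - allow_unresolved=False raises KeyError
    """
    def repl(match: re.Match) -> str:
        key = match.group(1)
        if key in scope:
            return str(scope[key])
        if allow_unresolved:
            return match.group(0)  # keep the original "{{key}}" text
        raise KeyError(f"failed to resolve {{{{{key}}}}}")

    return TAG.sub(repl, text)


# If "retries" is absent from the scope at substitution time, only
# workflow.name is replaced and the retries tag passes through untouched.
partial = substitute(
    "{{workflow.name}}-redact-{{retries}}",
    {"workflow.name": "redactwf"},
    allow_unresolved=True,
)
```

In this model, partial ends up as redactwf-redact-{{retries}}, which matches the shape of the value in the pod log. With allow_unresolved=False the same call would fail loudly instead of passing the literal tag through to the pod.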
Version(s)
3.4.11
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Logs from the workflow controller
Logs from your workflow's wait container