argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

ternary expression is not evaluated when using a workflow parameter #12317

Open ndamclean opened 9 months ago

ndamclean commented 9 months ago

What happened/what did you expect to happen?

When I reference a workflow parameter in an expression ({{= ... }}), the expression is not evaluated; the whole tag is passed through as a literal string.

For example, the following workflow uses a ternary expression to choose between running the true command and the false command, based on a workflow parameter called succeed.

I would expect the attached workflow to exit successfully (exit code 0) if succeed is set to true and to fail if it is set to false.

However, Argo treats the entire expression as a string without parsing it, resulting in an error message:

Error: failed to find name in PATH: exec: "{{= workflow.parameters['succeed'] ? 'true' : 'false' }}": executable file not found in $PATH
failed to find name in PATH: exec: "{{= workflow.parameters['succeed'] ? 'true' : 'false' }}": executable file not found in $PATH

If I replace workflow.parameters['succeed'] with a hard-coded true or false value, the workflow works as expected.

Furthermore, if Argo is unable to parse an expression, I would expect argo submit or argo lint to give an error instead of simply treating the unparsed expression as a string. This behaviour makes it very difficult to debug expressions in workflows.

Version

v3.4.3

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-workflow-
  labels:
    name: test-workflow
spec:
  arguments:
    parameters:
      - name: succeed
        value: true
  entrypoint: main
  templates:
    - name: main
      container:
        image: "alpine:3.18"
        imagePullPolicy: "Always"
        command:
          - "{{= workflow.parameters['succeed'] ? 'true' : 'false' }}"

Logs from the workflow controller

time="2023-12-04T19:48:18.108Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:18.113Z" level=info msg="Updated phase  -> Running" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:18.114Z" level=info msg="Pod node test-workflowscj9x initialized Pending" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:18.141Z" level=info msg="Created pod: test-workflowscj9x (test-workflowscj9x)" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:18.141Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:18.141Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:18.154Z" level=info msg="Workflow update successful" namespace=resilience-data-eng phase=Running resourceVersion=1802452434 workflow=test-workflowscj9x
time="2023-12-04T19:48:28.143Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:28.143Z" level=info msg="Task-result reconciliation" namespace=resilience-data-eng numObjs=0 workflow=test-workflowscj9x
time="2023-12-04T19:48:28.144Z" level=info msg="node changed" namespace=resilience-data-eng new.message="Unschedulable: 0/34 nodes are available: 1 node(s) had untolerated taint {role: exposure-modeling}, 1 node(s) had untolerated taint {role: flood-weather-data}, 1 node(s) had untolerated taint {role: gatekeeper}, 1 node(s) had untolerated taint {role: impact-resilience}, 1 node(s) had untolerated taint {role: ingress-controller}, 2 node(s) had untolerated taint {role: flood-common}, 2 node(s) had untolerated taint {role: hazard-map-service}, 3 node(s) had untolerated taint {role: cvat}, 3 node(s) had untolerated taint {role: dna-api}, 3 node(s) had untolerated taint {role: dna-platform}, 3 node(s) had untolerated taint {role: platform}, 4 node(s) had untolerated taint {components.gke.io/gke-managed-components: NoSchedule}, 4 node(s) had untolerated taint {role: jhub}, 5 node(s) had untolerated taint {role: system}. preemption: 0/34 nodes are available: 34 Preemption is not helpful for scheduling." new.phase=Pending new.progress=0/1 nodeID=test-workflowscj9x old.message= old.phase=Pending old.progress=0/1 workflow=test-workflowscj9x
time="2023-12-04T19:48:28.144Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:28.144Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:28.160Z" level=info msg="Workflow update successful" namespace=resilience-data-eng phase=Running resourceVersion=1802452581 workflow=test-workflowscj9x
time="2023-12-04T19:48:38.161Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:38.162Z" level=info msg="Task-result reconciliation" namespace=resilience-data-eng numObjs=0 workflow=test-workflowscj9x
time="2023-12-04T19:48:38.162Z" level=info msg="node unchanged" namespace=resilience-data-eng nodeID=test-workflowscj9x workflow=test-workflowscj9x
time="2023-12-04T19:48:38.162Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:48:38.162Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:49:40.063Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:49:40.064Z" level=info msg="Task-result reconciliation" namespace=resilience-data-eng numObjs=0 workflow=test-workflowscj9x
time="2023-12-04T19:49:40.064Z" level=info msg="node changed" namespace=resilience-data-eng new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=test-workflowscj9x old.message="Unschedulable: 0/34 nodes are available: 1 node(s) had untolerated taint {role: exposure-modeling}, 1 node(s) had untolerated taint {role: flood-weather-data}, 1 node(s) had untolerated taint {role: gatekeeper}, 1 node(s) had untolerated taint {role: impact-resilience}, 1 node(s) had untolerated taint {role: ingress-controller}, 2 node(s) had untolerated taint {role: flood-common}, 2 node(s) had untolerated taint {role: hazard-map-service}, 3 node(s) had untolerated taint {role: cvat}, 3 node(s) had untolerated taint {role: dna-api}, 3 node(s) had untolerated taint {role: dna-platform}, 3 node(s) had untolerated taint {role: platform}, 4 node(s) had untolerated taint {components.gke.io/gke-managed-components: NoSchedule}, 4 node(s) had untolerated taint {role: jhub}, 5 node(s) had untolerated taint {role: system}. preemption: 0/34 nodes are available: 34 Preemption is not helpful for scheduling." old.phase=Pending old.progress=0/1 workflow=test-workflowscj9x
time="2023-12-04T19:49:40.064Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:49:40.065Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:49:40.079Z" level=info msg="Workflow update successful" namespace=resilience-data-eng phase=Running resourceVersion=1802453743 workflow=test-workflowscj9x
time="2023-12-04T19:49:50.079Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:49:50.080Z" level=info msg="Task-result reconciliation" namespace=resilience-data-eng numObjs=0 workflow=test-workflowscj9x
time="2023-12-04T19:49:50.080Z" level=info msg="node unchanged" namespace=resilience-data-eng nodeID=test-workflowscj9x workflow=test-workflowscj9x
time="2023-12-04T19:49:50.080Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:49:50.080Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:00.769Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:00.770Z" level=info msg="Task-result reconciliation" namespace=resilience-data-eng numObjs=0 workflow=test-workflowscj9x
time="2023-12-04T19:50:00.770Z" level=info msg="node unchanged" namespace=resilience-data-eng nodeID=test-workflowscj9x workflow=test-workflowscj9x
time="2023-12-04T19:50:00.770Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:00.771Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.839Z" level=info msg="Processing workflow" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="Task-result reconciliation" namespace=resilience-data-eng numObjs=0 workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="node changed" namespace=resilience-data-eng new.message="Error (exit code 64): failed to find name in PATH: exec: \"{{= workflow.parameters['succeed'] ? 'true' : 'false' }}\": executable file not found in $PATH" new.phase=Failed new.progress=0/1 nodeID=test-workflowscj9x old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="TaskSet Reconciliation" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg=reconcileAgentPod namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="Updated phase Running -> Failed" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="Updated message  -> Error (exit code 64): failed to find name in PATH: exec: \"{{= workflow.parameters['succeed'] ? 'true' : 'false' }}\": executable file not found in $PATH" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="Marking workflow completed" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="Marking workflow as pending archiving" namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.840Z" level=info msg="Checking daemoned children of " namespace=resilience-data-eng workflow=test-workflowscj9x
time="2023-12-04T19:50:16.846Z" level=info msg="cleaning up pod" action=deletePod key=resilience-data-eng/test-workflowscj9x-1340600742-agent/deletePod
time="2023-12-04T19:50:16.853Z" level=info msg="Workflow update successful" namespace=resilience-data-eng phase=Failed resourceVersion=1802454326 workflow=test-workflowscj9x
time="2023-12-04T19:50:16.858Z" level=info msg="archiving workflow" namespace=resilience-data-eng uid=4d9628b6-9a39-4fce-8eda-3d911d45bccc workflow=test-workflowscj9x
time="2023-12-04T19:50:16.863Z" level=info msg="cleaning up pod" action=labelPodCompleted key=resilience-data-eng/test-workflowscj9x/labelPodCompleted
time="2023-12-04T19:50:16.903Z" level=info msg="Queueing Failed workflow resilience-data-eng/test-workflowscj9x for delete in 12h0m0s due to TTL"

Logs from your workflow's wait container

time="2023-12-04T19:50:05.366Z" level=info msg="Starting Workflow Executor" version=v3.4.3
time="2023-12-04T19:50:05.369Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2023-12-04T19:50:05.369Z" level=info msg="Executor initialized" deadline="2023-12-14 19:48:18 +0000 UTC" includeScriptOutput=false namespace=resilience-data-eng podName=test-workflowscj9x template="{\"name\":\"main\",\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.18\",\"command\":[\"{{= workflow.parameters['succeed'] ? 'true' : 'false' }}\"],\"resources\":{},\"imagePullPolicy\":\"Always\"},\"archiveLocation\":{\"archiveLogs\":true,\"gcs\":{\"bucket\":\"argo-dev-us-central1-artifacts\",\"key\":\"resilience-data-eng/test-workflowscj9x/test-workflowscj9x/\"}}}" version="&Version{Version:v3.4.3,BuildDate:2022-10-31T05:40:15Z,GitCommit:eddb1b78407adc72c08b4ed6be8f52f2a1f1316a,GitTag:v3.4.3,GitTreeState:clean,GoVersion:go1.18.7,Compiler:gc,Platform:linux/amd64,}"
time="2023-12-04T19:50:05.370Z" level=info msg="Starting deadline monitor"
time="2023-12-04T19:50:06.370Z" level=info msg="Main container completed" error="<nil>"
time="2023-12-04T19:50:06.370Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2023-12-04T19:50:06.370Z" level=info msg="No output parameters"
time="2023-12-04T19:50:06.370Z" level=info msg="No output artifacts"
time="2023-12-04T19:50:06.370Z" level=error msg="executor error: open /var/run/argo/ctr/main/combined: no such file or directory"
time="2023-12-04T19:50:06.370Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2023-12-04T19:50:06.370Z" level=info msg="Deadline monitor stopped"
time="2023-12-04T19:50:06.370Z" level=info msg="Alloc=6579 TotalAlloc=12175 Sys=19154 NumGC=4 Goroutines=7"
time="2023-12-04T19:50:06.371Z" level=fatal msg="open /var/run/argo/ctr/main/combined: no such file or directory"

agilgur5 commented 6 months ago

> {{= ... }}

This looks like it could be a whitespace issue. Have you tried removing the space after the =?

> v3.4.3

This is also an older version of Argo and there have been fixes to templating, so I would try a newer version as well.

ndamclean commented 6 months ago

> {{= ... }}
>
> This looks like it could be a whitespace issue. Have you tried removing the space after the =?

@agilgur5 I did test removing the whitespace and still saw the same error. Extra whitespace does not make a difference for any other expressions I have tested in Argo.

> v3.4.3
>
> This is also an older version of Argo and there have been fixes to templating, so I would try a newer version as well.

I just tried this using v3.4.16 and still saw the same error.
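
One thing that may be worth testing, though nothing in this thread confirms it: workflow parameter values may be exposed to expressions as strings, in which case workflow.parameters['succeed'] would be the string "true" rather than a boolean and could not serve directly as a ternary condition. The sketch below is the same reproduction workflow with an explicit string comparison as the condition. That parameters are strings inside expressions, and that this comparison avoids the failure, are both assumptions rather than verified behaviour.

```yaml
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-workflow-
  labels:
    name: test-workflow
spec:
  arguments:
    parameters:
      - name: succeed
        # Quoted to make explicit that the value is handled as a string (assumption).
        value: "true"
  entrypoint: main
  templates:
    - name: main
      container:
        image: "alpine:3.18"
        imagePullPolicy: "Always"
        command:
          # Assumption: comparing against the string 'true' gives the ternary
          # a boolean condition, so the expression can be evaluated.
          - "{{= workflow.parameters['succeed'] == 'true' ? 'true' : 'false' }}"
```

If this variant runs the true command, the original failure would point to the condition's type rather than to whitespace or the Argo version. The expectation that argo lint should reject an expression it cannot evaluate still stands either way.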