argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Sidecar for step workflow not working as expected -- use a daemon #13764

Status: Closed (kandy closed 1 month ago)

kandy commented 1 month ago

Pre-requisites

What happened? What did you expect to happen?

What I did: ran argo submit -n argo --log workflow-sidecar.yaml

What happened:

What did you expect to happen?

Version(s)

v3.5.11

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-test-
spec:
  entrypoint: run-app
  templates:
    - name: run-app
      sidecars:
        - name: db
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "root"
          image: mysql:8.0.36
          args: [ "--default-authentication-plugin=mysql_native_password" ]
      steps:
        - - name: setup-sidecars
            template: setup-sidecars
        - - name: run-echo
            template: img
    - name: setup-sidecars
      container:
        image: alpine:3.18
        command: [sh, -c]
        # Poll the MySQL sidecar until it accepts connections
        args:
          - |
            function wait_for() {
              echo -n '.';
              until "$@";  do 
                echo -n  "."; 
                sleep 1
              done
              echo
            }
            apk add --no-cache mysql-client 
            wait_for mysql -h 127.0.0.1 -uroot -proot  -e 'select version()'
            mysql -h 127.0.0.1 -uroot -proot -e "CREATE DATABASE a;"
    - name: img
      container:
        image: alpine:3.18
        command: [ sh, -c ]
        args:
          - |
           echo ok

Logs from the workflow controller

time="2024-10-10T21:23:26.417Z" level=info msg="Processing workflow" Phase= ResourceVersion=176270 namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.420Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.420Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.420Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.421Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.421Z" level=info msg="Steps node workflow-test-mcn2z initialized Running" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.421Z" level=info msg="StepGroup node workflow-test-mcn2z-2846631130 initialized Running" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.421Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.421Z" level=info msg="Pod node workflow-test-mcn2z-2730403668 initialized Pending" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.430Z" level=info msg="Created pod: workflow-test-mcn2z[0].setup-sidecars (workflow-test-mcn2z-setup-sidecars-2730403668)" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.430Z" level=info msg="Workflow step group node workflow-test-mcn2z-2846631130 not yet completed" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.430Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.430Z" level=info msg=reconcileAgentPod namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:26.436Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=176273 workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.419Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=176273 namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.419Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.419Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=workflow-test-mcn2z-2730403668 old.message= old.phase=Pending old.progress=0/1 workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.420Z" level=info msg="Workflow step group node workflow-test-mcn2z-2846631130 not yet completed" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.420Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.420Z" level=info msg=reconcileAgentPod namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:23:36.427Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=176301 workflow=workflow-test-mcn2z
time="2024-10-10T21:24:48.019Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=176301 namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:24:48.019Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=workflow-test-mcn2z
time="2024-10-10T21:24:48.020Z" level=info msg="node unchanged" namespace=argo nodeID=workflow-test-mcn2z-2730403668 workflow=workflow-test-mcn2z
time="2024-10-10T21:24:48.020Z" level=info msg="Workflow step group node workflow-test-mcn2z-2846631130 not yet completed" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:24:48.020Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:24:48.020Z" level=info msg=reconcileAgentPod namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:34:12.019Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=176301 namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:34:12.019Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=workflow-test-mcn2z
time="2024-10-10T21:34:12.019Z" level=info msg="node unchanged" namespace=argo nodeID=workflow-test-mcn2z-2730403668 workflow=workflow-test-mcn2z
time="2024-10-10T21:34:12.020Z" level=info msg="Workflow step group node workflow-test-mcn2z-2846631130 not yet completed" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:34:12.020Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:34:12.020Z" level=info msg=reconcileAgentPod namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:39:59.044Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=176301 namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:39:59.045Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=workflow-test-mcn2z
time="2024-10-10T21:39:59.045Z" level=info msg="node unchanged" namespace=argo nodeID=workflow-test-mcn2z-2730403668 workflow=workflow-test-mcn2z
time="2024-10-10T21:39:59.045Z" level=info msg="Workflow step group node workflow-test-mcn2z-2846631130 not yet completed" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:39:59.046Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:39:59.046Z" level=info msg=reconcileAgentPod namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:43:31.612Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=176301 namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:43:31.612Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=1 workflow=workflow-test-mcn2z
time="2024-10-10T21:43:31.612Z" level=info msg="node unchanged" namespace=argo nodeID=workflow-test-mcn2z-2730403668 workflow=workflow-test-mcn2z
time="2024-10-10T21:43:31.613Z" level=info msg="Workflow step group node workflow-test-mcn2z-2846631130 not yet completed" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:43:31.613Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=workflow-test-mcn2z
time="2024-10-10T21:43:31.613Z" level=info msg=reconcileAgentPod namespace=argo workflow=workflow-test-mcn2z

Logs from your workflow's wait container

time="2024-10-10T21:23:27.857Z" level=info msg="Starting Workflow Executor" version=v3.5.11
time="2024-10-10T21:23:27.861Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2024-10-10T21:23:27.861Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=argo podName=workflow-test-mcn2z-setup-sidecars-2730403668 templateName=setup-sidecars version="&Version{Version:v3.5.11,BuildDate:2024-09-20T14:10:04Z,GitCommit:25bbb71cced32b671f9ad35f0ffd1f0ddb8226ee,GitTag:v3.5.11,GitTreeState:clean,GoVersion:go1.21.13,Compiler:gc,Platform:linux/arm64,}"
time="2024-10-10T21:23:27.871Z" level=info msg="Starting deadline monitor"
time="2024-10-10T21:28:27.862Z" level=info msg="Alloc=7441 TotalAlloc=13905 Sys=24421 NumGC=5 Goroutines=8"
time="2024-10-10T21:33:27.860Z" level=info msg="Alloc=7454 TotalAlloc=14274 Sys=24421 NumGC=7 Goroutines=8"
time="2024-10-10T21:38:27.883Z" level=info msg="Alloc=7384 TotalAlloc=14648 Sys=24421 NumGC=10 Goroutines=8"
time="2024-10-10T21:43:27.887Z" level=info msg="Alloc=7455 TotalAlloc=14998 Sys=24421 NumGC=12 Goroutines=8"

agilgur5 commented 1 month ago

  templates:
    - name: run-app
      sidecars:
        - name: db
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "root"
          image: mysql:8.0.36
          args: [ "--default-authentication-plugin=mysql_native_password" ]
      steps:

sidecars and steps cannot be used together like that. Per the documentation, sidecars is used together with container.
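
As an illustration of the sidecar-with-container pattern, here is a minimal sketch that moves the db sidecar from the reproduction onto the container template that actually talks to it. The template name setup-db and the generateName are made up for this example; the images, credentials, and commands are taken from the workflow above. It has not been run against a cluster.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: sidecar-mysql-   # hypothetical name for this sketch
spec:
  entrypoint: setup-db
  templates:
    - name: setup-db             # hypothetical template name
      container:
        image: alpine:3.18
        command: [sh, -c]
        args:
          - |
            apk add --no-cache mysql-client
            # The sidecar shares this pod's network namespace, so 127.0.0.1 reaches it.
            until mysql -h 127.0.0.1 -uroot -proot -e 'select version()'; do
              sleep 1
            done
            mysql -h 127.0.0.1 -uroot -proot -e "CREATE DATABASE a;"
      # sidecars sits next to container, in the same template, so both run in one pod.
      sidecars:
        - name: db
          image: mysql:8.0.36
          args: ["--default-authentication-plugin=mysql_native_password"]
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "root"
```

The sidecar lives only as long as this one pod, so the database would not be available to a later step that runs in a different pod.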

If you want a sidecar for multiple steps or want to run a daemon process like MySQL, you can use the Daemon Containers feature instead. Or, if you want a sidecar, you can run it per container template (not per step template).
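
For the daemon route, a rough sketch of how the reproduction might be restructured follows. The template names mysql and setup-db, the step name db, and the parameter db-host are hypothetical. The daemon: true field and the {{steps.<name>.ip}} variable follow the pattern in Argo's daemon container example. The readiness probe on port 3306 is an assumption about how to gate the next step, and the sketch has not been run against a cluster.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-test-
spec:
  entrypoint: run-app
  templates:
    - name: run-app
      steps:
        - - name: db                  # starts the daemon; later steps run once it is ready
            template: mysql
        - - name: setup-db
            template: setup-db
            arguments:
              parameters:
                - name: db-host       # hypothetical parameter name
                  value: "{{steps.db.ip}}"
        - - name: run-echo
            template: img
    - name: mysql                     # hypothetical template name
      daemon: true                    # keep running in the background while later steps execute
      container:
        image: mysql:8.0.36
        args: ["--default-authentication-plugin=mysql_native_password"]
        env:
          - name: MYSQL_ROOT_PASSWORD
            value: "root"
        readinessProbe:               # assumption: treat an open MySQL port as "ready"
          tcpSocket:
            port: 3306
    - name: setup-db
      inputs:
        parameters:
          - name: db-host
      container:
        image: alpine:3.18
        command: [sh, -c]
        args:
          - |
            apk add --no-cache mysql-client
            # The daemon runs in its own pod, so connect to its IP instead of 127.0.0.1.
            until mysql -h "{{inputs.parameters.db-host}}" -uroot -proot -e 'select version()'; do
              sleep 1
            done
            mysql -h "{{inputs.parameters.db-host}}" -uroot -proot -e "CREATE DATABASE a;"
    - name: img
      container:
        image: alpine:3.18
        command: [sh, -c]
        args: ["echo ok"]
```

The main difference from the sidecar form is networking: the daemon is a separate pod, so client steps need its pod IP passed in rather than assuming localhost.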

kandy commented 1 month ago

@agilgur5, thanks for the suggestion. I have already modified my workflow to use a daemon. Is there any reason why there is no validation for this?

agilgur5 commented 1 month ago

The Controller does some validation and will mark a Workflow as errored if it fails validation.

Full schema validation via CRDs is unfortunately not possible as most of Argo's CRDs exceed the etcd 1MB size limit; see also #11266, #13503, and upstream https://github.com/kubernetes/kubernetes/issues/82292

kandy commented 1 month ago

I see, thanks for the clarification.