argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

`terminationGracePeriodSeconds` not respected when daemon task terminates #10724

Open MatthewHou opened 1 year ago

MatthewHou commented 1 year ago

What happened/what you expected to happen?

In our workflow, we define a set of parallel daemon steps that run in the background, plus a suspend step that holds the workflow open. When we resume the suspend step, the workflow completes, which in turn terminates the daemon pods.

However, the `terminationGracePeriodSeconds` setting on the daemon pods does not seem to take effect. The daemon pods are killed abruptly (otherwise the graceful shutdown message would have been printed in the logs), and the last log entries are:

level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 143

I also tested this by manually deleting the pods while tailing the logs. In that case the terminations appeared to be graceful, but the exit status was the same (143):

2023-03-21T13:23:29.539Z  INFO 23 --- [ netty-shutdown] o.s.b.w.e.n.GracefulShutdown             : Graceful shutdown complete
.....
time="2023-03-21T13:23:30.249Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 143
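
For context, exit status 143 follows the common convention of 128 plus the signal number, so it matches a process that ended after receiving SIGTERM (signal 15). Since the manually deleted pod shut down gracefully and still reported 143, the status alone does not appear to distinguish a graceful shutdown from an abrupt one. A minimal Python sketch of that decoding, assuming Linux signal numbering (the helper name `describe_exit_status` is made up for illustration):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Map a shell-style exit status to a signal name when status > 128."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return f"unknown signal {status - 128}"
    return f"exited normally with code {status}"


print(describe_exit_status(143))  # SIGTERM
print(describe_exit_status(137))  # SIGKILL
```

The same arithmetic applies to the `exit code 137` entries in the controller logs, which correspond to SIGKILL (signal 9).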

Version

3.4.5

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: sample-wf
  namespace: argo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: spotinst.io/node-lifecycle
                operator: In
                values:
                  - od
  entrypoint: main
  onExit: exitHandler
  podMetadata:
    labels:
      app.kubernetes.io/name: sample-wf
  serviceAccountName: sample-wf
  synchronization:
    mutex:
      name: sample-wf
  templates:
    - name: main
      steps:
        - - name: daemon-worker-1
            template: daemon-worker
          - name: daemon-worker-2
            template: daemon-worker
          - name: daemon-worker-3
            template: daemon-worker
        - - name: resume-to-terminate
            template: suspend-workflow
    - name: exitHandler
      steps:
        - - name: cleanup
            template: cleanup
    - container:
        image: <An internal Java application image>
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 1
            memory: 2Gi
      daemon: true
      name: daemon-worker
      podSpecPatch: |
        terminationGracePeriodSeconds: 120
      retryStrategy:
        limit: 3
        retryPolicy: Always
    - name: suspend-workflow
      suspend: {}
    - container:
        image: <An internal application image that does some cleanup>
      name: cleanup
  ttlStrategy:
    secondsAfterCompletion: 2592000
  workflowMetadata:
    annotations:
      workflows.argoproj.io/description: |
        A daemon workflow 
      workflows.argoproj.io/verify.py: |
        assert status["phase"] == "Succeeded"
      workflows.argoproj.io/version: '>= 3.4.5'
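
Because the daemon image in the workflow above is internal, a public stand-in may make the issue easier to reproduce. The following is only a sketch of such a stand-in, not the actual application: a process that notes SIGTERM, takes a configurable time to drain, and then logs a completion line. The class name `GracefulDaemon` and the 10-second drain are assumptions chosen for illustration.

```python
import signal
import time


class GracefulDaemon:
    """Hypothetical stand-in for a daemon that needs time to shut down."""

    def __init__(self, drain_seconds: float = 10.0):
        self.drain_seconds = drain_seconds
        self.stop_requested = False
        # Only set a flag in the handler; do the slow work in run().
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        self.stop_requested = True

    def run(self) -> None:
        # Idle until SIGTERM arrives; a real daemon would do its work here.
        while not self.stop_requested:
            time.sleep(0.5)
        print("SIGTERM received, starting graceful shutdown", flush=True)
        # Simulated time needed to finish in-flight work.
        time.sleep(self.drain_seconds)
        print("Graceful shutdown complete", flush=True)
```

Calling `GracefulDaemon().run()` as the container entrypoint in any public Python image should be enough. If the "Graceful shutdown complete" line never shows up in the pod logs after the workflow finishes, the process was most likely killed before the drain ended.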

Logs from the workflow controller

time="2023-03-21T14:15:35.367Z" level=info msg="Processing workflow" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-1115280525 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-2509645393 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-2402326285 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-876462987 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-3951160202 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-20828911 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-2664114442 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-2897564486 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-1222599633 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-3411814144 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-3884384523 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-1307874671 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.368Z" level=info msg="node unchanged" namespace=argo nodeID=traffic-replay-j67wk-1415635788 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3729819998 are [traffic-replay-j67wk-2402326285]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3713042379 are [traffic-replay-j67wk-2897564486]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3696264760 are [traffic-replay-j67wk-876462987]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3813708093 are [traffic-replay-j67wk-1415635788]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3796930474 are [traffic-replay-j67wk-2509645393]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3780152855 are [traffic-replay-j67wk-3951160202]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3763375236 are [traffic-replay-j67wk-1307874671]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3612376665 are [traffic-replay-j67wk-3411814144]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3595599046 are [traffic-replay-j67wk-1805328325]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1724405802 are [traffic-replay-j67wk-1222599633]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1741183421 are [traffic-replay-j67wk-128590028]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1690850564 are [traffic-replay-j67wk-20828911]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1707628183 are [traffic-replay-j67wk-2664114442]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1657295326 are [traffic-replay-j67wk-1115280525]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1674072945 are [traffic-replay-j67wk-85271880]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-1623740088 are [traffic-replay-j67wk-3884384523]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="Step group node traffic-replay-j67wk-241957830 successful" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="node traffic-replay-j67wk-241957830 phase Running -> Succeeded" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="node traffic-replay-j67wk-241957830 finished: 2023-03-21 14:15:35.36942059 +0000 UTC" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="Outbound nodes of traffic-replay-j67wk-453252987 is [traffic-replay-j67wk-453252987]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="Outbound nodes of traffic-replay-j67wk is [traffic-replay-j67wk-453252987]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="node traffic-replay-j67wk phase Running -> Succeeded" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="node traffic-replay-j67wk finished: 2023-03-21 14:15:35.369487555 +0000 UTC" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="Checking daemoned children of traffic-replay-j67wk" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg=reconcileAgentPod namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.369Z" level=info msg="Running OnExit handler: exitHandler" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.370Z" level=info msg="Steps node traffic-replay-j67wk-1312316718 initialized Running" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.370Z" level=info msg="StepGroup node traffic-replay-j67wk-64104880 initialized Running" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.370Z" level=info msg="Pod node traffic-replay-j67wk-3913033981 initialized Pending" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:35.374Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-2509645393/terminateContainers
time="2023-03-21T14:15:35.374Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-3411814144/terminateContainers
time="2023-03-21T14:15:35.374Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-1307874671/terminateContainers
time="2023-03-21T14:15:35.375Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-1415635788/terminateContainers
time="2023-03-21T14:15:35.375Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2509645393/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.375Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1307874671/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.375Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1415635788/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.375Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-3411814144/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.683Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-3411814144 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:35.683Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-3411814144/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.733Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2509645393 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:35.733Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2509645393/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.740Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1415635788 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:35.741Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1415635788/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:35.812Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1307874671 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:35.812Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1307874671/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:36.196Z" level=info msg="Created pod: traffic-replay-j67wk.onExit[0].delay (traffic-replay-j67wk-delay-3913033981)" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:36.196Z" level=info msg="Workflow step group node traffic-replay-j67wk-64104880 not yet completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:36.236Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=36258755 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:36.411Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-3411814144 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:36.411Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-20828911/terminateContainers
time="2023-03-21T14:15:36.411Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-20828911/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:36.723Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-20828911 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:36.723Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-20828911/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:36.796Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1307874671 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:36.797Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-1115280525/terminateContainers
time="2023-03-21T14:15:36.797Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1115280525/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:36.993Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1415635788 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:36.993Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-128590028/terminateContainers
time="2023-03-21T14:15:36.993Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-2402326285/terminateContainers
time="2023-03-21T14:15:36.993Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2402326285/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:37.086Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1115280525 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:37.086Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1115280525/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:37.095Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2509645393 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:37.095Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-85271880/terminateContainers
time="2023-03-21T14:15:37.095Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-3951160202/terminateContainers
time="2023-03-21T14:15:37.095Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-3951160202/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:37.329Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2402326285 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:37.329Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2402326285/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:37.360Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-3951160202 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:37.360Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-3951160202/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:37.913Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-20828911 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:37.913Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-2897564486/terminateContainers
time="2023-03-21T14:15:37.913Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2897564486/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:38.070Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1115280525 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:38.070Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-876462987/terminateContainers
time="2023-03-21T14:15:38.070Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-876462987/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:38.114Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2402326285 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:38.114Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-3884384523/terminateContainers
time="2023-03-21T14:15:38.114Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-3884384523/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:38.311Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-3951160202 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:38.311Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-1222599633/terminateContainers
time="2023-03-21T14:15:38.311Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1222599633/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:38.403Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2897564486 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:38.403Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2897564486/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:38.624Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-876462987 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:38.624Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-876462987/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:38.799Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2897564486 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:38.799Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-1805328325/terminateContainers
time="2023-03-21T14:15:38.799Z" level=info msg="cleaning up pod" action=terminateContainers key=argo/traffic-replay-j67wk-replay-traffic-2664114442/terminateContainers
time="2023-03-21T14:15:38.799Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2664114442/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:39.003Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-3884384523 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:39.004Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-3884384523/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:39.241Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1222599633 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:39.241Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-1222599633/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:39.510Z" level=info msg="signaled container" container=main error="<nil>" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2664114442 stderr= stdout="killing 1 with terminated\n"
time="2023-03-21T14:15:39.510Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/argo/pods/traffic-replay-j67wk-replay-traffic-2664114442/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2023-03-21T14:15:39.870Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-3884384523 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:39.870Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-3411814144/labelPodCompleted
time="2023-03-21T14:15:39.993Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-876462987 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:39.993Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-1222599633/labelPodCompleted
time="2023-03-21T14:15:40.081Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-2897564486/labelPodCompleted
time="2023-03-21T14:15:40.147Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-876462987/labelPodCompleted
time="2023-03-21T14:15:40.304Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-2509645393/labelPodCompleted
time="2023-03-21T14:15:40.611Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-2664114442 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:40.611Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-3884384523/labelPodCompleted
time="2023-03-21T14:15:40.612Z" level=info msg="signaled container" container=wait error="command terminated with exit code 137" namespace=argo pod=traffic-replay-j67wk-replay-traffic-1222599633 stderr="<nil>" stdout="<nil>"
time="2023-03-21T14:15:40.612Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-1415635788/labelPodCompleted
time="2023-03-21T14:15:41.352Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-1307874671/labelPodCompleted
time="2023-03-21T14:15:41.571Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-1115280525/labelPodCompleted
time="2023-03-21T14:15:41.687Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-20828911/labelPodCompleted
time="2023-03-21T14:15:41.746Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-2402326285/labelPodCompleted
time="2023-03-21T14:15:41.882Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-3951160202/labelPodCompleted
time="2023-03-21T14:15:41.889Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-replay-traffic-2664114442/labelPodCompleted
time="2023-03-21T14:15:46.196Z" level=info msg="Processing workflow" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.196Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.196Z" level=info msg="node changed" namespace=argo new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=traffic-replay-j67wk-3913033981 old.message= old.phase=Pending old.progress=0/1 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.197Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.197Z" level=info msg=reconcileAgentPod namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.197Z" level=info msg="Running OnExit handler: exitHandler" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.197Z" level=info msg="Workflow step group node traffic-replay-j67wk-64104880 not yet completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:46.236Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=36258923 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.998Z" level=info msg="Processing workflow" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.999Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.999Z" level=info msg="node changed" namespace=argo new.message= new.phase=Running new.progress=0/1 nodeID=traffic-replay-j67wk-3913033981 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.999Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.999Z" level=info msg=reconcileAgentPod namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.999Z" level=info msg="Running OnExit handler: exitHandler" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:56.999Z" level=info msg="Workflow step group node traffic-replay-j67wk-64104880 not yet completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:15:57.038Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=36259031 workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.486Z" level=info msg="Processing workflow" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.487Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.487Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=traffic-replay-j67wk-3913033981 old.message= old.phase=Running old.progress=0/1 workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.487Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.488Z" level=info msg=reconcileAgentPod namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.488Z" level=info msg="Running OnExit handler: exitHandler" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.488Z" level=info msg="Step group node traffic-replay-j67wk-64104880 successful" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.488Z" level=info msg="node traffic-replay-j67wk-64104880 phase Running -> Succeeded" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.488Z" level=info msg="node traffic-replay-j67wk-64104880 finished: 2023-03-21 14:16:57.488386887 +0000 UTC" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.489Z" level=info msg="StepGroup node traffic-replay-j67wk-1204835877 initialized Running" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.490Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3913033981 are [traffic-replay-j67wk-3913033981]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.490Z" level=info msg="Pod node traffic-replay-j67wk-1420653623 initialized Pending" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.667Z" level=info msg="Created pod: traffic-replay-j67wk.onExit[1].cleanup (traffic-replay-j67wk-cleanup-1420653623)" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.667Z" level=info msg="Workflow step group node traffic-replay-j67wk-1204835877 not yet completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.707Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=36259612 workflow=traffic-replay-j67wk
time="2023-03-21T14:16:57.713Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-delay-3913033981/labelPodCompleted
time="2023-03-21T14:17:07.668Z" level=info msg="Processing workflow" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.668Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.669Z" level=info msg="node changed" namespace=argo new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=traffic-replay-j67wk-1420653623 old.message= old.phase=Pending old.progress=0/1 workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.669Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.669Z" level=info msg=reconcileAgentPod namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.669Z" level=info msg="Running OnExit handler: exitHandler" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.669Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3913033981 are [traffic-replay-j67wk-3913033981]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.669Z" level=info msg="Workflow step group node traffic-replay-j67wk-1204835877 not yet completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:07.706Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=36259722 workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.970Z" level=info msg="Processing workflow" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.970Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.970Z" level=info msg="node changed" namespace=argo new.message= new.phase=Succeeded new.progress=0/1 nodeID=traffic-replay-j67wk-1420653623 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg=reconcileAgentPod namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Running OnExit handler: exitHandler" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="SG Outbound nodes of traffic-replay-j67wk-3913033981 are [traffic-replay-j67wk-3913033981]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Step group node traffic-replay-j67wk-1204835877 successful" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="node traffic-replay-j67wk-1204835877 phase Running -> Succeeded" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="node traffic-replay-j67wk-1204835877 finished: 2023-03-21 14:17:24.971605853 +0000 UTC" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Outbound nodes of traffic-replay-j67wk-1420653623 is [traffic-replay-j67wk-1420653623]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Outbound nodes of traffic-replay-j67wk-1312316718 is [traffic-replay-j67wk-1420653623]" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="node traffic-replay-j67wk-1312316718 phase Running -> Succeeded" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="node traffic-replay-j67wk-1312316718 finished: 2023-03-21 14:17:24.97167654 +0000 UTC" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Checking daemoned children of traffic-replay-j67wk-1312316718" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Updated phase Running -> Succeeded" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Marking workflow completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.971Z" level=info msg="Checking daemoned children of " namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.972Z" level=info msg="Lock has been released by argo/traffic-replay-j67wk. Available locks: 1" mutex=argo/Mutex/traffic-replay
time="2023-03-21T14:17:24.972Z" level=info msg="argo/traffic-replay-j67wk released a lock from argo/Mutex/traffic-replay"
time="2023-03-21T14:17:24.972Z" level=info msg="Released all acquired locks" key=traffic-replay-j67wk namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:24.977Z" level=info msg="cleaning up pod" action=deletePod key=argo/traffic-replay-j67wk-1340600742-agent/deletePod
time="2023-03-21T14:17:25.352Z" level=info msg="Workflow update successful" namespace=argo phase=Succeeded resourceVersion=36259896 workflow=traffic-replay-j67wk
time="2023-03-21T14:17:25.355Z" level=info msg="Queueing Succeeded workflow argo/traffic-replay-j67wk for delete in 719h59m59s due to TTL"
time="2023-03-21T14:17:25.361Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/traffic-replay-j67wk-cleanup-1420653623/labelPodCompleted
time="2023-03-21T14:17:36.412Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-3411814144/killContainers
time="2023-03-21T14:17:36.797Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-1307874671/killContainers
time="2023-03-21T14:17:36.994Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-1415635788/killContainers
time="2023-03-21T14:17:37.095Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-2509645393/killContainers
time="2023-03-21T14:17:37.913Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-20828911/killContainers
time="2023-03-21T14:17:38.071Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-1115280525/killContainers
time="2023-03-21T14:17:38.114Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-2402326285/killContainers
time="2023-03-21T14:17:38.312Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-3951160202/killContainers
time="2023-03-21T14:17:38.800Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-2897564486/killContainers
time="2023-03-21T14:17:39.871Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-3884384523/killContainers
time="2023-03-21T14:17:39.994Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-876462987/killContainers
time="2023-03-21T14:17:40.612Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-2664114442/killContainers
time="2023-03-21T14:17:40.613Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-1222599633/killContainers
time="2023-03-21T14:20:46.001Z" level=info msg="Queueing Succeeded workflow argo/traffic-replay-8qjqv for delete in 717h52m26s due to TTL"
time="2023-03-21T14:20:46.002Z" level=info msg="Queueing Succeeded workflow argo/traffic-replay-j67wk for delete in 719h56m38s due to TTL"
time="2023-03-21T14:20:46.003Z" level=info msg="Queueing Succeeded workflow argo/traffic-replay-zbkjw for delete in 717h27m49s due to TTL"
time="2023-03-21T14:20:46.004Z" level=info msg="Queueing Succeeded workflow argo/traffic-replay-whkpw for delete in 710h46m11s due to TTL"
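
For anyone triaging this, here is a minimal sketch (not part of Argo; `parse_time` and `kill_delays` are hypothetical helper names) that measures how long after "Marking workflow completed" each `killContainers` cleanup action was logged. The sample lines are copied from the controller log above. It only reports when the controller logged the cleanup action; it says nothing about which signal the containers received or whether the grace period was honored.

```python
import re
from datetime import datetime

# Sample lines copied verbatim from the controller log above.
LOG = '''\
time="2023-03-21T14:17:24.971Z" level=info msg="Marking workflow completed" namespace=argo workflow=traffic-replay-j67wk
time="2023-03-21T14:17:36.412Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-3411814144/killContainers
time="2023-03-21T14:17:40.613Z" level=info msg="cleaning up pod" action=killContainers key=argo/traffic-replay-j67wk-replay-traffic-1222599633/killContainers
'''

TIME_RE = re.compile(r'time="([^"]+)"')


def parse_time(line):
    """Return the timestamp of a log line, or None if it has no time field."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%fZ")


def kill_delays(log_text):
    """Seconds between workflow completion and each killContainers action."""
    completed = None
    delays = []
    for line in log_text.splitlines():
        ts = parse_time(line)
        if ts is None:
            continue
        if 'msg="Marking workflow completed"' in line:
            completed = ts
        elif "action=killContainers" in line and completed is not None:
            delays.append((ts - completed).total_seconds())
    return delays


if __name__ == "__main__":
    for delay in kill_delays(LOG):
        print(f"killContainers logged {delay:.3f}s after completion")
```

Running the same function over the full controller log gives the delay for every daemon pod, which can then be compared with the time of the last log entry in each pod.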

Logs from your workflow's wait container

time="2023-03-20T23:43:29.408Z" level=info msg="Starting Workflow Executor" version=v3.4.5
time="2023-03-20T23:43:29.411Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2023-03-20T23:43:29.411Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=argo podName=pai-risk-traffic-replay-whkpw-replay-traffic-471101247 template="{\"name\":\"replay-traffic\",\"inputs\":{},\"outputs\":{},\"metadata\":{},\"daemon\":true,\"container\":{\"name\":\"\",\"image\":\"968032760053.dkr.ecr.us-west-1.amazonaws.com/pi-traffic-replicator:feat-traffic-replication-fire-and-forget-1679274608-b523e9f\",\"envFrom\":[{\"configMapRef\":{\"name\":\"pai-risk-traffic-replay-k58gg5m82m\"}}],\"env\":[{\"name\":\"KAFKA_CONSUMER_GROUP\",\"value\":\"replay-f45e72f7-4ae7-40c3-9ccf-000415edf2c8\"},{\"name\":\"JAVA_OPTS\",\"value\":\"-Xmx1500m -Xms1500m -XX:+UseG1GC\"}],\"resources\":{\"limits\":{\"cpu\":\"1\",\"memory\":\"2Gi\"},\"requests\":{\"cpu\":\"1\",\"memory\":\"2Gi\"}}},\"retryStrategy\":{\"limit\":3,\"retryPolicy\":\"Always\"},\"podSpecPatch\":\"terminationGracePeriodSeconds: 60\\n\"}" version="&Version{Version:v3.4.5,BuildDate:2023-02-07T12:34:55Z,GitCommit:1253f443baa8ad1610d2e62ec26ecdc85fe1b837,GitTag:v3.4.5,GitTreeState:clean,GoVersion:go1.18.10,Compiler:gc,Platform:linux/amd64,}"
time="2023-03-20T23:43:29.411Z" level=info msg="Starting deadline monitor"
time="2023-03-20T23:48:29.412Z" level=info msg="Alloc=6148 TotalAlloc=12458 Sys=24274 NumGC=5 Goroutines=7"
time="2023-03-20T23:53:29.411Z" level=info msg="Alloc=6171 TotalAlloc=12569 Sys=24274 NumGC=7 Goroutines=7"
time="2023-03-20T23:58:29.412Z" level=info msg="Alloc=6151 TotalAlloc=12678 Sys=24274 NumGC=10 Goroutines=7"
time="2023-03-21T00:03:29.412Z" level=info msg="Alloc=6171 TotalAlloc=12786 Sys=24274 NumGC=12 Goroutines=7"
time="2023-03-21T00:08:29.412Z" level=info msg="Alloc=6152 TotalAlloc=12897 Sys=24274 NumGC=15 Goroutines=7"
time="2023-03-21T00:13:29.412Z" level=info msg="Alloc=6172 TotalAlloc=13005 Sys=24274 NumGC=17 Goroutines=7"
time="2023-03-21T00:18:29.412Z" level=info msg="Alloc=6152 TotalAlloc=13114 Sys=24274 NumGC=20 Goroutines=7"
time="2023-03-21T00:23:29.411Z" level=info msg="Alloc=6173 TotalAlloc=13224 Sys=24274 NumGC=22 Goroutines=7"
time="2023-03-21T00:28:29.411Z" level=info msg="Alloc=6153 TotalAlloc=13332 Sys=24274 NumGC=25 Goroutines=7"
time="2023-03-21T00:33:29.411Z" level=info msg="Alloc=6173 TotalAlloc=13440 Sys=24274 NumGC=27 Goroutines=7"
time="2023-03-21T00:38:29.411Z" level=info msg="Alloc=6153 TotalAlloc=13549 Sys=24274 NumGC=30 Goroutines=7"
time="2023-03-21T00:43:29.411Z" level=info msg="Alloc=6173 TotalAlloc=13657 Sys=24274 NumGC=32 Goroutines=7"
time="2023-03-21T00:48:29.411Z" level=info msg="Alloc=6153 TotalAlloc=13765 Sys=24274 NumGC=35 Goroutines=7"
time="2023-03-21T00:53:29.412Z" level=info msg="Alloc=6172 TotalAlloc=13874 Sys=24274 NumGC=37 Goroutines=7"
time="2023-03-21T00:58:29.411Z" level=info msg="Alloc=6152 TotalAlloc=13982 Sys=24274 NumGC=40 Goroutines=7"
time="2023-03-21T01:03:29.411Z" level=info msg="Alloc=6171 TotalAlloc=14090 Sys=24274 NumGC=42 Goroutines=7"
time="2023-03-21T01:08:29.411Z" level=info msg="Alloc=6151 TotalAlloc=14198 Sys=24274 NumGC=45 Goroutines=7"
time="2023-03-21T01:13:29.412Z" level=info msg="Alloc=6171 TotalAlloc=14307 Sys=24274 NumGC=47 Goroutines=7"
time="2023-03-21T01:18:29.412Z" level=info msg="Alloc=6151 TotalAlloc=14415 Sys=24274 NumGC=50 Goroutines=7"
time="2023-03-21T01:23:29.412Z" level=info msg="Alloc=6171 TotalAlloc=14523 Sys=24274 NumGC=52 Goroutines=7"
time="2023-03-21T01:28:29.412Z" level=info msg="Alloc=6151 TotalAlloc=14632 Sys=24274 NumGC=55 Goroutines=7"
time="2023-03-21T01:33:29.412Z" level=info msg="Alloc=6171 TotalAlloc=14740 Sys=24274 NumGC=57 Goroutines=7"
time="2023-03-21T01:38:29.412Z" level=info msg="Alloc=6151 TotalAlloc=14849 Sys=24274 NumGC=60 Goroutines=7"
time="2023-03-21T01:43:29.412Z" level=info msg="Alloc=6172 TotalAlloc=14958 Sys=24274 NumGC=62 Goroutines=7"
time="2023-03-21T01:48:29.412Z" level=info msg="Alloc=6151 TotalAlloc=15066 Sys=24274 NumGC=65 Goroutines=7"
time="2023-03-21T01:53:29.411Z" level=info msg="Alloc=6171 TotalAlloc=15175 Sys=24274 NumGC=67 Goroutines=7"
time="2023-03-21T01:58:29.412Z" level=info msg="Alloc=6151 TotalAlloc=15284 Sys=24274 NumGC=70 Goroutines=7"
time="2023-03-21T02:03:29.411Z" level=info msg="Alloc=6172 TotalAlloc=15392 Sys=24274 NumGC=72 Goroutines=7"
time="2023-03-21T02:08:29.411Z" level=info msg="Alloc=6152 TotalAlloc=15501 Sys=24274 NumGC=75 Goroutines=7"
time="2023-03-21T02:13:29.412Z" level=info msg="Alloc=6172 TotalAlloc=15610 Sys=24274 NumGC=77 Goroutines=7"
time="2023-03-21T02:18:29.412Z" level=info msg="Alloc=6152 TotalAlloc=15719 Sys=24274 NumGC=80 Goroutines=7"
time="2023-03-21T02:23:29.412Z" level=info msg="Alloc=6172 TotalAlloc=15827 Sys=24274 NumGC=82 Goroutines=7"
time="2023-03-21T02:28:29.411Z" level=info msg="Alloc=6152 TotalAlloc=15936 Sys=24274 NumGC=85 Goroutines=7"
time="2023-03-21T02:33:29.412Z" level=info msg="Alloc=6171 TotalAlloc=16044 Sys=24274 NumGC=87 Goroutines=7"
time="2023-03-21T02:38:29.411Z" level=info msg="Alloc=6151 TotalAlloc=16152 Sys=24274 NumGC=90 Goroutines=7"
time="2023-03-21T02:43:29.412Z" level=info msg="Alloc=6171 TotalAlloc=16261 Sys=24274 NumGC=92 Goroutines=7"
time="2023-03-21T02:48:29.412Z" level=info msg="Alloc=6151 TotalAlloc=16369 Sys=24274 NumGC=95 Goroutines=7"
time="2023-03-21T02:53:29.412Z" level=info msg="Alloc=6171 TotalAlloc=16477 Sys=24274 NumGC=97 Goroutines=7"
time="2023-03-21T02:58:29.412Z" level=info msg="Alloc=6150 TotalAlloc=16585 Sys=24274 NumGC=100 Goroutines=7"
time="2023-03-21T03:03:29.411Z" level=info msg="Alloc=6170 TotalAlloc=16694 Sys=24274 NumGC=102 Goroutines=7"
time="2023-03-21T03:08:29.412Z" level=info msg="Alloc=6150 TotalAlloc=16803 Sys=24274 NumGC=105 Goroutines=7"
time="2023-03-21T03:13:29.411Z" level=info msg="Alloc=6170 TotalAlloc=16911 Sys=24274 NumGC=107 Goroutines=7"
time="2023-03-21T03:18:29.411Z" level=info msg="Alloc=6150 TotalAlloc=17019 Sys=24274 NumGC=110 Goroutines=7"
time="2023-03-21T03:23:29.412Z" level=info msg="Alloc=6170 TotalAlloc=17128 Sys=24274 NumGC=112 Goroutines=7"
time="2023-03-21T03:28:29.412Z" level=info msg="Alloc=6150 TotalAlloc=17236 Sys=24274 NumGC=115 Goroutines=7"
time="2023-03-21T03:33:29.412Z" level=info msg="Alloc=6170 TotalAlloc=17345 Sys=24274 NumGC=117 Goroutines=7"
time="2023-03-21T03:38:29.412Z" level=info msg="Alloc=6149 TotalAlloc=17453 Sys=24274 NumGC=120 Goroutines=7"
time="2023-03-21T03:43:29.412Z" level=info msg="Alloc=6169 TotalAlloc=17561 Sys=24274 NumGC=122 Goroutines=7"
time="2023-03-21T03:48:29.411Z" level=info msg="Alloc=6148 TotalAlloc=17669 Sys=24274 NumGC=125 Goroutines=7"
time="2023-03-21T03:53:29.411Z" level=info msg="Alloc=6167 TotalAlloc=17777 Sys=24274 NumGC=127 Goroutines=7"
time="2023-03-21T03:58:29.412Z" level=info msg="Alloc=6147 TotalAlloc=17886 Sys=24274 NumGC=130 Goroutines=7"
time="2023-03-21T04:03:29.412Z" level=info msg="Alloc=6167 TotalAlloc=17994 Sys=24274 NumGC=132 Goroutines=7"
time="2023-03-21T04:08:29.411Z" level=info msg="Alloc=6146 TotalAlloc=18102 Sys=24274 NumGC=135 Goroutines=7"
time="2023-03-21T04:13:29.411Z" level=info msg="Alloc=6167 TotalAlloc=18211 Sys=24274 NumGC=137 Goroutines=7"
time="2023-03-21T04:18:29.411Z" level=info msg="Alloc=6146 TotalAlloc=18319 Sys=24274 NumGC=140 Goroutines=7"
time="2023-03-21T04:23:29.412Z" level=info msg="Alloc=6166 TotalAlloc=18428 Sys=24274 NumGC=142 Goroutines=7"
time="2023-03-21T04:28:29.412Z" level=info msg="Alloc=6145 TotalAlloc=18536 Sys=24274 NumGC=145 Goroutines=7"
time="2023-03-21T04:33:29.411Z" level=info msg="Alloc=6166 TotalAlloc=18644 Sys=24274 NumGC=147 Goroutines=7"
time="2023-03-21T04:38:29.411Z" level=info msg="Alloc=6145 TotalAlloc=18753 Sys=24274 NumGC=150 Goroutines=7"
time="2023-03-21T04:43:29.412Z" level=info msg="Alloc=6166 TotalAlloc=18861 Sys=24274 NumGC=152 Goroutines=7"
time="2023-03-21T04:48:29.411Z" level=info msg="Alloc=6145 TotalAlloc=18969 Sys=24274 NumGC=155 Goroutines=7"
time="2023-03-21T04:53:29.412Z" level=info msg="Alloc=6165 TotalAlloc=19077 Sys=24274 NumGC=157 Goroutines=7"
time="2023-03-21T04:58:29.411Z" level=info msg="Alloc=6184 TotalAlloc=19185 Sys=24274 NumGC=159 Goroutines=7"
time="2023-03-21T05:03:29.412Z" level=info msg="Alloc=6164 TotalAlloc=19294 Sys=24274 NumGC=162 Goroutines=7"
time="2023-03-21T05:05:14.750Z" level=info msg="Main container completed" error="<nil>"
time="2023-03-21T05:05:14.750Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2023-03-21T05:05:14.750Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2023-03-21T05:05:14.750Z" level=info msg="No output parameters"
time="2023-03-21T05:05:14.750Z" level=info msg="No output artifacts"
time="2023-03-21T05:05:14.750Z" level=info msg="Deadline monitor stopped"
time="2023-03-21T05:05:14.750Z" level=info msg="Alloc=6178 TotalAlloc=19349 Sys=24274 NumGC=163 Goroutines=5"

tico24 commented 1 year ago

Context for whoever picks this up: https://cloud-native.slack.com/archives/C01QW9QSSSK/p1679402429354709

jplrts commented 2 months ago

Any progress on this?