argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0
14.92k stars 3.18k forks source link

artifactory main pod log storing using workflow-controller-configmap gives nil pointer panic #5733

Closed heyleke closed 3 years ago

heyleke commented 3 years ago

Summary

I wanted to store the main pod logs of my steps directly on artifactory, so followed the (splintered) docs I found to configure artifactory repo in the workflow-config-map, but I see the wait container (argoexec) crashing

Diagnostics

Using docker executor

What version of Argo Workflows are you running? 3.0.1

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  creationTimestamp: "2021-04-21T21:20:08Z"
  generateName: nested-workflow-
  generation: 5
  labels:
    workflows.argoproj.io/phase: Running
  name: nested-wf-test
  namespace: default
  resourceVersion: "150308175"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/default/workflows/nested-wf-test
  uid: 8f930f77-8a87-4915-9815-ff97b872e952
spec:
  arguments: {}
  entrypoint: nested-workflow-example
  templates:
  - inputs: {}
    metadata: {}
    name: nested-workflow-example
    outputs: {}
    steps:
    - - arguments:
          parameters:
          - name: excluded_node
            value: ""
        continueOn:
          failed: true
        name: runtb
        template: sleepabit
    - - arguments:
          parameters:
          - name: excluded_node
            value: '{{workflow.outputs.parameters.nodename}}'
        continueOn:
          failed: true
        name: retryrun
        template: sleepabit
        when: '{{steps.runtb.status}} != Succeeded'
  - affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: NotIn
              values:
              - '{{inputs.parameters.excluded_node}}'
    container:
      args:
      - echo $NODE_NAME > nodename.txt && echo blablabla && sleep 5 && false
      command:
      - sh
      - -c
      env:
      - name: NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName
      image: alpine:3.7
      name: ""
      resources: {}
    inputs:
      parameters:
      - name: excluded_node
    metadata: {}
    name: sleepabit
    outputs:
      parameters:
      - globalName: nodename
        name: nodename
        valueFrom:
          path: nodename.txt
    retryStrategy:
      limit: 2
      retryPolicy: OnError
status:
  artifactRepositoryRef:
    default: true
  conditions:
  - status: "False"
    type: PodRunning
  finishedAt: null
  nodes:
    nested-wf-test:
      children:
      - nested-wf-test-4187779359
      displayName: nested-wf-test
      finishedAt: null
      id: nested-wf-test
      name: nested-wf-test
      phase: Running
      progress: 1/2
      startedAt: "2021-04-21T21:20:08Z"
      templateName: nested-workflow-example
      templateScope: local/nested-wf-test
      type: Steps
    nested-wf-test-73877493:
      boundaryID: nested-wf-test
      children:
      - nested-wf-test-1033837250
      displayName: runtb(0)
      finishedAt: "2021-04-21T21:20:24Z"
      hostNodeName: dev11-gsn107-k8s-med-worker-1
      id: nested-wf-test-73877493
      inputs:
        parameters:
        - name: excluded_node
          value: ""
      message: Error (exit code 1)
      name: nested-wf-test[0].runtb(0)
      phase: Failed
      progress: 1/1
      resourcesDuration:
        cpu: 10
        memory: 10
      startedAt: "2021-04-21T21:20:08Z"
      templateName: sleepabit
      templateScope: local/nested-wf-test
      type: Pod
    nested-wf-test-1033837250:
      boundaryID: nested-wf-test
      children:
      - nested-wf-test-3144780967
      displayName: '[1]'
      finishedAt: null
      id: nested-wf-test-1033837250
      name: nested-wf-test[1]
      phase: Running
      progress: 0/1
      startedAt: "2021-04-21T21:20:27Z"
      templateScope: local/nested-wf-test
      type: StepGroup
    nested-wf-test-1133525750:
      boundaryID: nested-wf-test
      children:
      - nested-wf-test-73877493
      displayName: runtb
      finishedAt: "2021-04-21T21:20:27Z"
      id: nested-wf-test-1133525750
      inputs:
        parameters:
        - name: excluded_node
          value: ""
      message: Error (exit code 1)
      name: nested-wf-test[0].runtb
      phase: Failed
      progress: 1/2
      resourcesDuration:
        cpu: 10
        memory: 10
      startedAt: "2021-04-21T21:20:08Z"
      templateName: sleepabit
      templateScope: local/nested-wf-test
      type: Retry
    nested-wf-test-3144780967:
      boundaryID: nested-wf-test
      children:
      - nested-wf-test-3241021146
      displayName: retryrun
      finishedAt: null
      id: nested-wf-test-3144780967
      inputs:
        parameters:
        - name: excluded_node
          value: '{{workflow.outputs.parameters.nodename}}'
      name: nested-wf-test[1].retryrun
      phase: Running
      progress: 0/1
      startedAt: "2021-04-21T21:20:27Z"
      templateName: sleepabit
      templateScope: local/nested-wf-test
      type: Retry
    nested-wf-test-3241021146:
      boundaryID: nested-wf-test
      displayName: retryrun(0)
      finishedAt: null
      id: nested-wf-test-3241021146
      inputs:
        parameters:
        - name: excluded_node
          value: '{{workflow.outputs.parameters.nodename}}'
      message: 'Unschedulable: 0/72 nodes are available: 72 node(s) didn''t match
        node selector.'
      name: nested-wf-test[1].retryrun(0)
      phase: Pending
      progress: 0/1
      startedAt: "2021-04-21T21:20:27Z"
      templateName: sleepabit
      templateScope: local/nested-wf-test
      type: Pod
    nested-wf-test-4187779359:
      boundaryID: nested-wf-test
      children:
      - nested-wf-test-1133525750
      displayName: '[0]'
      finishedAt: "2021-04-21T21:20:27Z"
      id: nested-wf-test-4187779359
      name: nested-wf-test[0]
      phase: Succeeded
      progress: 1/2
      resourcesDuration:
        cpu: 10
        memory: 10
      startedAt: "2021-04-21T21:20:08Z"
      templateScope: local/nested-wf-test
      type: StepGroup
  phase: Running
  progress: 1/2
  resourcesDuration:
    cpu: 10
    memory: 10
  startedAt: "2021-04-21T21:20:08Z"

I then recompiled argoexec to add some extra prints to see the various variables in use:

kubectl logs nested-wf-test-73877493 -c wait -f
time="2021-04-21T21:20:19.230Z" level=info msg="Starting Workflow Executor" version="{untagged 2021-04-21T21:06:04Z 46221c5c901ce3df1ce3144cf6d54705c1e8eb04 untagged clean go1.15.7 gc linux/amd64}"
I0421 21:20:19.231385       1 merged_client_builder.go:121] Using in-cluster configuration
I0421 21:20:19.231668       1 merged_client_builder.go:163] Using in-cluster namespace
time="2021-04-21T21:20:19.235Z" level=info msg="Creating a docker executor"
time="2021-04-21T21:20:19.235Z" level=info msg="Executor (version: untagged, build_date: 2021-04-21T21:06:04Z) initialized (pod: default/nested-wf-test-73877493) with template:\n{\"name\":\"sleepabit\",\"inputs\":{\"parameters\":[{\"name\":\"excluded_node\",\"value\":\"\"}]},\"outputs\":{\"parameters\":[{\"name\":\"nodename\",\"valueFrom\":{\"path\":\"nodename.txt\"},\"globalName\":\"nodename\"}]},\"affinity\":{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"kubernetes.io/hostname\",\"operator\":\"NotIn\",\"values\":[\"\"]}]}]}}},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"echo $NODE_NAME \\u003e nodename.txt \\u0026\\u0026 echo blablabla \\u0026\\u0026 sleep 5 \\u0026\\u0026 false\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}},\"retryStrategy\":{\"limit\":2,\"retryPolicy\":\"OnError\"}}"
time="2021-04-21T21:20:19.235Z" level=info msg="Starting annotations monitor"
time="2021-04-21T21:20:19.235Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-21T21:20:19.235Z" level=info msg="Starting deadline monitor"
time="2021-04-21T21:20:19.280Z" level=info msg="mapped container name \"main\" to container ID \"7c3b4ec01ec4f5015110f7307aaba01caf68d7a90dd20385ea13a95831c2d530\" (created at 2021-04-21 21:20:19 +0000 UTC, status Created)"
time="2021-04-21T21:20:19.280Z" level=info msg="mapped container name \"wait\" to container ID \"b5004cf8e03affa2199aead9ea5767d22a799490e9d24ce9d5066a3376eabd9b\" (created at 2021-04-21 21:20:19 +0000 UTC, status Up)"
time="2021-04-21T21:20:20.235Z" level=info msg="docker wait 7c3b4ec01ec4f5015110f7307aaba01caf68d7a90dd20385ea13a95831c2d530"
time="2021-04-21T21:20:20.280Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-21T21:20:21.314Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-21T21:20:22.350Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-21T21:20:23.384Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-21T21:20:24.420Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-21T21:20:24.560Z" level=info msg="Main container completed"
time="2021-04-21T21:20:24.560Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-04-21T21:20:24.560Z" level=info msg="Capturing script exit code"
time="2021-04-21T21:20:24.595Z" level=info msg="Saving logs"
time="2021-04-21T21:20:24.595Z" level=info msg="[docker logs 7c3b4ec01ec4f5015110f7307aaba01caf68d7a90dd20385ea13a95831c2d530]"
time="2021-04-21T21:20:24.630Z" level=info msg="art: &Artifact{Name:main-logs,Path:,Mode:nil,From:,ArtifactLocation:ArtifactLocation{ArchiveLogs:nil,S3:nil,Git:nil,HTTP:nil,Artifactory:&ArtifactoryArtifact{URL:/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log,ArtifactoryAuth:ArtifactoryAuth{UsernameSecret:nil,PasswordSecret:nil,},},HDFS:nil,Raw:nil,OSS:nil,GCS:nil,},GlobalName:,Archive:nil,Optional:false,SubPath:,RecurseMode:false,}"
time="2021-04-21T21:20:24.630Z" level=info msg="driverArt: &Artifact{Name:main-logs,Path:,Mode:nil,From:,ArtifactLocation:ArtifactLocation{ArchiveLogs:nil,S3:nil,Git:nil,HTTP:nil,Artifactory:&ArtifactoryArtifact{URL:/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log,ArtifactoryAuth:ArtifactoryAuth{UsernameSecret:nil,PasswordSecret:nil,},},HDFS:nil,Raw:nil,OSS:nil,GCS:nil,},GlobalName:,Archive:nil,Optional:false,SubPath:,RecurseMode:false,}"
time="2021-04-21T21:20:24.630Z" level=info msg="driverArt: &Artifact{Name:main-logs,Path:,Mode:nil,From:,ArtifactLocation:ArtifactLocation{ArchiveLogs:nil,S3:nil,Git:nil,HTTP:nil,Artifactory:&ArtifactoryArtifact{URL:/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log,ArtifactoryAuth:ArtifactoryAuth{UsernameSecret:nil,PasswordSecret:nil,},},HDFS:nil,Raw:nil,OSS:nil,GCS:nil,},GlobalName:,Archive:nil,Optional:false,SubPath:,RecurseMode:false,}"
time="2021-04-21T21:20:24.630Z" level=info msg=NewDriver
time="2021-04-21T21:20:24.630Z" level=info msg=Artifactory
time="2021-04-21T21:20:24.630Z" level=info msg="Artifactory: &ArtifactoryArtifact{URL:/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log,ArtifactoryAuth:ArtifactoryAuth{UsernameSecret:nil,PasswordSecret:nil,},}"
time="2021-04-21T21:20:24.630Z" level=info msg="Alloc=4327 TotalAlloc=9206 Sys=74577 NumGC=4 Goroutines=9"
time="2021-04-21T21:20:24.630Z" level=fatal msg="executor panic: runtime error: invalid memory address or nil pointer dereference\ngoroutine 1 [running]:\nruntime/debug.Stack(0x2053ed2, 0x14, 0xc0006b41c0)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9f\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).HandleError(0xc00037b600, 0x23706c0, 0xc000190020)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:126 +0x1d6\npanic(0x1dac5c0, 0x3064ec0)\n\t/usr/local/go/src/runtime/panic.go:975 +0x47a\ngithub.com/argoproj/argo-workflows/v3/workflow/artifacts.NewDriver(0x23706c0, 0xc000190020, 0xc000726240, 0x23324c0, 0xc00037b600, 0xc00071f960, 0xa388f5, 0xc0002330e0, 0x2005e00)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifacts.go:99 +0xa21\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).InitDriver(0xc00037b600, 0x23706c0, 0xc000190020, 0xc000726240, 0xc00071f9f0, 0x1, 0x1, 0xa8)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:586 +0x65\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile(0xc00037b600, 0x23706c0, 0xc000190020, 0xc000726180, 0x20400e7, 0x8, 0xc000734b60, 0x1f, 0x0, 0xc000190020)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:316 +0x1be\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveLogs(0xc00037b600, 0x23706c0, 0xc000190020, 0x0, 0x0, 0xc00071fba0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:542 +0x23d\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer(0x23706c0, 0xc000190020, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:61 +0x61f\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1(0xc00037ab00, 0xc0000a93e0, 0x0, 0x6)\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18 +0x3d\ngithub.com/spf13/cobra.(*Command).execute(0xc00037ab00, 0xc0000a9380, 0x6, 0x6, 0xc00037ab00, 0xc0000a9380)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846 +0x2c2\ngithub.com/spf13/cobra.(*Command).ExecuteC(0xc00037a2c0, 0xc000086778, 0xc00071ff78, 0x4062c5)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x375\ngithub.com/spf13/cobra.(*Command).Execute(...)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main()\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:14 +0x2b\n"

workflow controller configmap

apiVersion: v1
data:
  artifactRepository: |
    # archiveLogs will archive the main container logs as an artifact
    archiveLogs: true
    artifactory:
      repoURL: "http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local"
      usernameSecret:
        name: artifactory-sandbox
        key: username
      passwordSecret:
        name: artifactory-sandbox
        key: password
  executor: |
    imagePullPolicy: Always
    args:
    - --loglevel
    - debug
    - --gloglevel
    - "6"
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ConfigMap","metadata":{"annotations":{},"name":"workflow-controller-configmap","namespace":"argo"}}
  creationTimestamp: "2021-01-27T18:28:36Z"
  name: workflow-controller-configmap
  namespace: argo
  resourceVersion: "150280397"
  selfLink: /api/v1/namespaces/argo/configmaps/workflow-controller-configmap
  uid: 62afdfd0-32bb-4ed5-88bf-f8db88ed6706

and the secret (base64 removed)

apiVersion: v1
data:
  password: xxxxxxxxxxxxxxx
  username: xxxxxxxxxxx
kind: Secret
metadata:
  creationTimestamp: "2021-04-21T16:17:47Z"
  name: artifactory-sandbox
  namespace: default
  resourceVersion: "149896347"
  selfLink: /api/v1/namespaces/default/secrets/artifactory-sandbox
  uid: 806ba8d8-183b-4210-a39f-06eeb2b2b9e5
type: Opaque

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

bmayfi3ld commented 3 years ago

I seem to be running into the same thing. I found it interesting that when I embed the Artifactory config into the workflow I do not have an issue.

alexec commented 3 years ago

@bmayfi3ld can you paste your error please?

alexec commented 3 years ago

your logs do not show v3.0.1 - I'm not sure what they show -

git checkout git rev-list -1 --before="Apr 04 2021" master

alexec commented 3 years ago

ok - I see you recomplied it and that changed the line numbering - so I cannot reference your stack trace to source code.

alexec commented 3 years ago

only so many things this could be:

art.Artifactory.UsernameSecret.

or maybe

art.Artifactory.PasswordSecret

alexec commented 3 years ago

I've made a few attempts to repro in v3.0.2. I've not succeeded. Could you please try v3.0.2 and see if that fixes it ?

heyleke commented 3 years ago

Yes, I will retry with v3.0.2 without extra debug lines so stack trace can be used, can you also share how you can use the stracktrace when you have the binary build yourself. I'm not used to working with golang and these kind of printed stacktraces.

bmayfi3ld commented 3 years ago

@bmayfi3ld can you paste your error please?

My error from v3.0.2

executor panic: runtime error: invalid memory address or nil pointer dereference
goroutine 1 [running]:
runtime/debug.Stack(0x2055193, 0x14, 0xc000546bc0)

/usr/local/go/src/runtime/debug/stack.go:24 +0x9f
github.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).HandleError(0xc0000d3080, 0x2371920, 0xc000056018)

/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:126 +0x1d6
panic(0x1dad740, 0x3064ee0)

/usr/local/go/src/runtime/panic.go:975 +0x47a
github.com/argoproj/argo-workflows/v3/workflow/artifacts.NewDriver(0x2371920, 0xc000056018, 0xc000868000, 0x2333720, 0xc0000d3080, 0xc000868038, 0xc000868000, 0xc00066b738, 0x1bb8ed2)

/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifacts.go:94 +0x91f
github.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).InitDriver(0xc0000d3080, 0x2371920, 0xc000056018, 0xc000868000, 0x0, 0xc00077e550, 0x20627c2, 0x1b)

/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:582 +0x65
github.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile(0xc0000d3080, 0x2371920, 0xc000056018, 0xc00066b958, 0xc00077cbb0, 0xe, 0xc000548570, 0x2a, 0x0, 0x0)

/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:313 +0xc5
github.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifact(0xc0000d3080, 0x2371920, 0xc000056018, 0x203b9e1, 0x4, 0xc00066b958, 0x0, 0x0)

/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:292 +0x31d
github.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveArtifacts(0xc0000d3080, 0x2371920, 0xc000056018, 0x0, 0x0)

/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:270 +0x1b5
github.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer(0x2371920, 0xc000056018, 0x0, 0x0)

/go/src/github.com/argoproj/argo-workflows/cmd/
+0x9ba
github.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1(0xc0000d2000, 0xc00044e580, 0x0, 0x2)

/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18 +0x3d
github.com/spf13/cobra.(*Command).execute(0xc0000d2000, 0xc00044e560, 0x2, 0x2, 0xc0000d2000, 0xc00044e560)

/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc00059f600, 0xc000066778, 0xc000555f78, 0x4062c5)

/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)

/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
main.main()

/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:14 +0x2b
bmayfi3ld commented 3 years ago

I'd agree it is the lines you pointed out @alexec. Looking at the extra debug output that @heyleke added in it looks like UsernameSecret and PasswordSecret might be remaining nil even though no error is thrown from the ri.GetSecret call.

time="2021-04-21T21:20:24.630Z" level=info msg="Artifactory: &ArtifactoryArtifact{URL:/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log,ArtifactoryAuth:ArtifactoryAuth{UsernameSecret:nil,PasswordSecret:nil,},}"

bmayfi3ld commented 3 years ago

It seems like the variables are not getting loaded when the source is the controller configmap or a default configmap


data:
  default-v1: "artifactory:\n  url: https://repo.redacted.org/artifactory/argo-generic-local\n
    \ usernameSecret:\n    name: artifactory-secret\n    key: username\n  passwordSecret:\n
    \   name: artifactory-secret\n    key: password    \n"
kind: ConfigMap
metadata:
  annotations:
    workflows.argoproj.io/default-artifact-repository: default-v1
  creationTimestamp: "2021-04-28T20:43:04Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:default-v1: {}
      f:metadata:
        f:annotations:
          .: {}
          f:workflows.argoproj.io/default-artifact-repository: {}
    manager: OpenAPI-Generator
    operation: Update
    time: "2021-04-28T20:43:04Z"
  name: artifact-repositories
  namespace: argo-wf-infrastructure
  resourceVersion: "69899602"
  uid: 53edd11c-7d3f-4d32-a5c5-f5d4f045cd9f```
alexec commented 3 years ago

Your stack trace confirm what I thought - that UsernameSecret is nil. You CM clearly shows it is set to "artifactory-secret".

I can see from your logs that the "archiveLocation" contains this, so where is it lost?

  "archiveLocation": {
    "archiveLogs": true,
    "artifactory": {
      "url": "http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493",
      "usernameSecret": {
        "name": "artifactory-sandbox",
        "key": "username"
      },
      "passwordSecret": {
        "name": "artifactory-sandbox",
        "key": "password"
      }
    }
  }

Another question - is this artifactory only issue?

alexec commented 3 years ago

My first suspect:

func (a *ArtifactLocation) Relocate(l *ArtifactLocation) error {
    if a.HasLocation() { // <-- does this return true for artifactory?
        return nil
    }
    if l == nil {
        return fmt.Errorf("template artifact location not set")
    }
    key, err := a.GetKey()
    if err != nil {
        return err
    }
    *a = *l.DeepCopy() // <-- does this fail to copy?
    return a.SetKey(key)
}
alexec commented 3 years ago

I've created PR that will push an engineering build of the executor: argoproj/argoexec:dev-5733. This adds logging. Could I please ask you to try it out?

heyleke commented 3 years ago

I've tested with controller 3.0.1 and your argoexec:dev-5733:

time="2021-04-29T18:02:47.630Z" level=info msg="Starting Workflow Executor" version="{untagged 2021-04-29T16:55:49Z 1b478cd0af002d3630d6e0b265d477e3a24444c9 untagged clean go1.15.7 gc linux/amd64}"
time="2021-04-29T18:02:47.634Z" level=info msg="Creating a docker executor"
time="2021-04-29T18:02:47.634Z" level=info msg="Executor (version: untagged, build_date: 2021-04-29T16:55:49Z) initialized (pod: default/nested-wf-test-73877493) with template:\n{\"name\":\"sleepabit\",\"inputs\":{\"parameters\":[{\"name\":\"excluded_node\",\"value\":\"\"}]},\"outputs\":{\"parameters\":[{\"name\":\"nodename\",\"valueFrom\":{\"path\":\"nodename.txt\"},\"globalName\":\"nodename\"}]},\"affinity\":{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"kubernetes.io/hostname\",\"operator\":\"NotIn\",\"values\":[\"\"]}]}]}}},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"echo $NODE_NAME \\u003e nodename.txt \\u0026\\u0026 echo blablabla \\u0026\\u0026 sleep 5 \\u0026\\u0026 false\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}},\"retryStrategy\":{\"limit\":2,\"retryPolicy\":\"OnError\"}}"
time="2021-04-29T18:02:47.634Z" level=info msg="Starting annotations monitor"
time="2021-04-29T18:02:47.634Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-29T18:02:47.634Z" level=info msg="Starting deadline monitor"
time="2021-04-29T18:02:47.674Z" level=info msg="mapped container name \"main\" to container ID \"9318d990f3171723c36c2f8d468d7c5bd9a57b2e2322cd49cf43e275b1dcb7da\" (created at 2021-04-29 18:02:47 +0000 UTC, status Created)"
time="2021-04-29T18:02:47.674Z" level=info msg="mapped container name \"wait\" to container ID \"91884ccd096023321a2cab25396f10d218e7cce0559d9a4a30ab9059d6cf4fc2\" (created at 2021-04-29 18:02:47 +0000 UTC, status Up)"
time="2021-04-29T18:02:48.635Z" level=info msg="docker wait 9318d990f3171723c36c2f8d468d7c5bd9a57b2e2322cd49cf43e275b1dcb7da"
time="2021-04-29T18:02:48.674Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-29T18:02:49.712Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-29T18:02:50.745Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-29T18:02:51.785Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-29T18:02:52.829Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-04-29T18:02:52.965Z" level=info msg="Main container completed"
time="2021-04-29T18:02:52.965Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-04-29T18:02:52.965Z" level=info msg="Capturing script exit code"
time="2021-04-29T18:02:52.998Z" level=info msg="Saving logs"
time="2021-04-29T18:02:52.998Z" level=info msg="[docker logs 9318d990f3171723c36c2f8d468d7c5bd9a57b2e2322cd49cf43e275b1dcb7da]"
time="2021-04-29T18:02:53.031Z" level=info msg="ISSUE #5733 - 4" art="{\"name\":\"main-logs\"}"
time="2021-04-29T18:02:53.031Z" level=info msg="ISSUE #5733 - 3" art="{\"name\":\"main-logs\",\"artifactory\":{\"url\":\"/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log\"}}"
time="2021-04-29T18:02:53.031Z" level=info msg="ISSUE #5733 - 1" art="{\"name\":\"main-logs\",\"artifactory\":{\"url\":\"/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log\"}}"
time="2021-04-29T18:02:53.031Z" level=info msg="ISSUE #5733 - 2" art="{\"name\":\"main-logs\",\"artifactory\":{\"url\":\"/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log\"}}"
time="2021-04-29T18:02:53.031Z" level=info msg="Killing sidecars []"
time="2021-04-29T18:02:53.032Z" level=info msg="Alloc=4981 TotalAlloc=9274 Sys=73041 NumGC=3 Goroutines=9"
time="2021-04-29T18:02:53.032Z" level=fatal msg="executor panic: runtime error: invalid memory address or nil pointer dereference\ngoroutine 1 [running]:\nruntime/debug.Stack(0x20551f0, 0x14, 0xc00069e280)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9f\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).HandleError(0xc0001c4dc0, 0x2371be0, 0xc000058018)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:126 +0x1d6\npanic(0x1dad740, 0x3065f00)\n\t/usr/local/go/src/runtime/panic.go:975 +0x47a\ngithub.com/argoproj/argo-workflows/v3/workflow/artifacts.NewDriver(0x2371be0, 0xc000058018, 0xc0001c3680, 0x2333960, 0xc0001c4dc0, 0xc0004ba070, 0x0, 0xc0001c36b8, 0xc0001c3680)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifacts.go:97 +0xb12\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).InitDriver(0xc0001c4dc0, 0x2371be0, 0xc000058018, 0xc0001c3680, 0x0, 0xc000271f80, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:586 +0x16e\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile(0xc0001c4dc0, 0x2371be0, 0xc000058018, 0xc0001c35c0, 0x20413a3, 0x8, 0xc000330b20, 0x1f, 0x0, 0xc000058018)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:316 +0x1be\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveLogs(0xc0001c4dc0, 0x2371be0, 0xc000058018, 0x0, 0x0, 0xc000627ba0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:542 +0x23d\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer(0x2371be0, 0xc000058018, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:61 +0x61f\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1(0xc0001c4000, 0xc0004d41c0, 0x0, 0x2)\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18 +0x3d\ngithub.com/spf13/cobra.(*Command).execute(0xc0001c4000, 0xc0004d4180, 0x2, 0x2, 0xc0001c4000, 0xc0004d4180)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846 +0x2c2\ngithub.com/spf13/cobra.(*Command).ExecuteC(0xc0000d3080, 0xc000068778, 0xc000627f78, 0x4062c5)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x375\ngithub.com/spf13/cobra.(*Command).Execute(...)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main()\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:14 +0x2b\n"
alexec commented 3 years ago

I've updated the branch with a potential bug fix. It should be built by 12:30 Pacific time. Can you please test it out?

bmayfi3ld commented 3 years ago

Was able to test the latest build and the nil pointer error is gone. I am still getting a key error. EDIT: I receive no error if I place the Artifactory config in the workflow.

time="2021-04-29T19:33:28.607Z" level=info msg="Creating a K8sAPI executor"
time="2021-04-29T19:33:28.607Z" level=info msg="Executor (version: untagged, build_date: 2021-04-29T18:28:29Z) initialized (pod: argo-wf-infrastructure/empty-dir-bn98t) with template:\n{\"name\":\"pull-sites\",\"inputs\":{},\"outputs\":{\"artifacts\":[{\"name\":\"site-cache\",\"path\":\"/mnt/outgo_ssl_domains_domain_cache.yaml\"}]},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"infrastructure-docker-virtual.repo.org/go-ssl-domains:1.22\",\"args\":[\"pull\",\"sites\"],\"env\":[],\"resources\":{}\"volumeMounts\":[{\"name\":\"out\",\"mountPath\":\"/mnt/out\"}]},\"volumes\":[{\"name\":\"out\",\"emptyDir\":{}}],\"archiveLocation\":{\"artifactory\"{\"url\":\"empty-dir-bn98t/empty-dir-bn98t\",\"usernameSecret\":{\"name\":\"artifactory-secret\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-secret\",\"key\":\"password\"}}}}"
time="2021-04-29T19:33:28.607Z" level=info msg="Starting annotations monitor"
time="2021-04-29T19:33:28.608Z" level=info msg="Starting deadline monitor"
time="2021-04-29T19:33:28.615Z" level=info msg="Watch pods 200"
time="2021-04-29T19:33:31.829Z" level=info msg="Main container completed"
time="2021-04-29T19:33:31.829Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-04-29T19:33:31.829Z" level=info msg="Capturing script exit code"
time="2021-04-29T19:33:31.829Z" level=info msg="Getting exit code of main"
time="2021-04-29T19:33:31.834Z" level=info msg="Get pods 200"
time="2021-04-29T19:33:31.840Z" level=info msg="No output parameters"
time="2021-04-29T19:33:31.840Z" level=info msg="Saving output artifacts"
time="2021-04-29T19:33:31.841Z" level=info msg="ISSUE #5733 - 6" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-29T19:33:31.841Z" level=info msg="Staging artifact: site-cache"
time="2021-04-29T19:33:31.841Z" level=info msg="Staging /mnt/out/go_ssl_domains_domain_cache.yaml from mirrored volume mount /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-04-29T19:33:31.841Z" level=info msg="Taring /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-04-29T19:33:31.857Z" level=info msg="Successfully staged /mnt/out/go_ssl_domains_domain_cache.yaml from mirrored volume mount /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-04-29T19:33:31.858Z" level=info msg="ISSUE #5733 - 5" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-29T19:33:31.858Z" level=info msg="ISSUE #5733 - 4" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-29T19:33:31.858Z" level=info msg="ISSUE #5733 - 1.1" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-29T19:33:31.858Z" level=error msg="executor error: key unsupported"
time="2021-04-29T19:33:31.861Z" level=info msg="Get pods 200"
time="2021-04-29T19:33:31.861Z" level=info msg="Killing sidecars []"
time="2021-04-29T19:33:31.861Z" level=info msg="Alloc=5209 TotalAlloc=11416 Sys=73297 NumGC=4 Goroutines=9"
time="2021-04-29T19:33:31.863Z" level=fatal msg="key unsupported"
alexec commented 3 years ago

I've built a new version with a fix. Same tag. Set imagePullPolicy: Always to make sure you get the new version. Can you please test?

heyleke commented 3 years ago

new run with imagePullPolicy for the executor to Always:

time="2021-04-30T05:47:04.890Z" level=info msg="Starting Workflow Executor" version="{untagged 2021-04-29T20:51:30Z 98c2b79b996195a5787994667316e98671e34769 untagged clean go1.15.7 gc linux/amd64}"
time="2021-04-30T05:47:04.943Z" level=info msg="Creating a docker executor"
time="2021-04-30T05:47:04.943Z" level=info msg="Executor (version: untagged, build_date: 2021-04-29T20:51:30Z) initialized (pod: default/nested-wf-test-1557477403) with template:\n{\"name\":\"sleepabit\",\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"sleep 10\"],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}}"
time="2021-04-30T05:47:04.943Z" level=info msg="Starting annotations monitor"
time="2021-04-30T05:47:04.943Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:04.944Z" level=info msg="Starting deadline monitor"
time="2021-04-30T05:47:05.080Z" level=info msg="mapped container name \"main\" to container ID \"a2f011030ccccd8a740b3f3f7fa7031e0afdfa4089d65ac576510a9d495f2a94\" (created at 2021-04-30 05:47:04 +0000 UTC, status Up)"
time="2021-04-30T05:47:05.080Z" level=info msg="mapped container name \"wait\" to container ID \"300293e9dd6e9f55f9534ea2a5f82779ec5a510ce57b130cd4711dea322e16d4\" (created at 2021-04-30 05:47:04 +0000 UTC, status Up)"
time="2021-04-30T05:47:05.944Z" level=info msg="docker wait a2f011030ccccd8a740b3f3f7fa7031e0afdfa4089d65ac576510a9d495f2a94"
time="2021-04-30T05:47:06.081Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:07.182Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:08.271Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:09.346Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:10.430Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:11.512Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:12.552Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:13.586Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:14.618Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:15.161Z" level=info msg="Main container completed"
time="2021-04-30T05:47:15.161Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-04-30T05:47:15.161Z" level=info msg="Capturing script exit code"
time="2021-04-30T05:47:15.201Z" level=info msg="Saving logs"
time="2021-04-30T05:47:15.201Z" level=info msg="[docker logs a2f011030ccccd8a740b3f3f7fa7031e0afdfa4089d65ac576510a9d495f2a94]"
time="2021-04-30T05:47:15.244Z" level=info msg="ISSUE #5733 - 3.1" art="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-04-30T05:47:15.245Z" level=info msg="ISSUE #5733 - 4" art="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-04-30T05:47:15.245Z" level=info msg="ISSUE #5733 - 1.4" art="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-04-30T05:47:15.245Z" level=info msg="ISSUE #5733 - 3" art="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-04-30T05:47:15.245Z" level=info msg="ISSUE #5733 - 1" art="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-04-30T05:47:15.245Z" level=info msg="ISSUE #5733 - 2" art="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-1557477403\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-04-30T05:47:15.652Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-1557477403"
time="2021-04-30T05:47:16.473Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2021-04-30T05:47:16.473Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2021-04-30T05:47:16.473Z" level=info msg="No output parameters"
time="2021-04-30T05:47:16.473Z" level=info msg="No output artifacts"
time="2021-04-30T05:47:16.473Z" level=info msg="Annotating pod with output"
time="2021-04-30T05:47:16.486Z" level=info msg="Patch pods 200"
time="2021-04-30T05:47:16.495Z" level=info msg="Killing sidecars []"
time="2021-04-30T05:47:16.495Z" level=info msg="Alloc=6684 TotalAlloc=11067 Sys=73297 NumGC=3 Goroutines=10"

2 files present on artifactory: image

bmayfi3ld commented 3 years ago

I'm back to the panic

time="2021-04-30T13:54:38.682Z" level=info msg="Starting Workflow Executor" version="{untagged 2021-04-29T20:51:30Z 98c2b79b996195a5787994667316e98671e34769 untagged clean go1.15.7 gc linux/amd64}"
time="2021-04-30T13:54:38.690Z" level=info msg="Creating a K8sAPI executor"
time="2021-04-30T13:54:38.690Z" level=info msg="Executor (version: untagged, build_date: 2021-04-29T20:51:30Z) initialized (pod: argo-wf-infrastructure/empty-dir-9x4sm) with template:\n{\"name\":\"pull-sites\",\"inputs\":{},\"outputs\":{\"artifacts\":[{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}]},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"infrastructure-docker-virtual.repo.org/go-ssl-domains:1.2.2\",\"args\":[\"pull\",\"sites\"],\"env\":[]},\"volumes\":[{\"name\":\"out\",\"emptyDir\":{}}],\"archiveLocation\":{\"artifactory\":{\"url\":\"empty-dir-9x4sm/empty-dir-9x4sm\",\"usernameSecret\":{\"name\":\"artifactory-secret\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-secret\",\"key\":\"password\"}}}}"
time="2021-04-30T13:54:38.690Z" level=info msg="Starting annotations monitor"
time="2021-04-30T13:54:38.691Z" level=info msg="Starting deadline monitor"
time="2021-04-30T13:54:38.699Z" level=info msg="Watch pods 200"
time="2021-04-30T13:54:38.712Z" level=info msg="Main container completed"
time="2021-04-30T13:54:38.712Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-04-30T13:54:38.712Z" level=info msg="Capturing script exit code"
time="2021-04-30T13:54:38.712Z" level=info msg="Getting exit code of main"
time="2021-04-30T13:54:38.716Z" level=info msg="Get pods 200"
time="2021-04-30T13:54:38.717Z" level=info msg="No output parameters"
time="2021-04-30T13:54:38.717Z" level=info msg="Saving output artifacts"
time="2021-04-30T13:54:38.718Z" level=info msg="ISSUE #5733 - 6" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-30T13:54:38.718Z" level=info msg="Staging artifact: site-cache"
time="2021-04-30T13:54:38.718Z" level=info msg="Staging /mnt/out/go_ssl_domains_domain_cache.yaml from mirrored volume mount /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-04-30T13:54:38.718Z" level=info msg="Taring /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-04-30T13:54:38.735Z" level=info msg="Successfully staged /mnt/out/go_ssl_domains_domain_cache.yaml from mirrored volume mount /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 5" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 4" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 1.1" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 1.2" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\",\"artifactory\":{\"url\":\"\"}}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 1.3" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\",\"artifactory\":{\"url\":\"empty-dir-9x4sm/empty-dir-9x4sm/site-cache.tgz\"}}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 3" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\",\"artifactory\":{\"url\":\"empty-dir-9x4sm/empty-dir-9x4sm/site-cache.tgz\"}}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 1" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\",\"artifactory\":{\"url\":\"empty-dir-9x4sm/empty-dir-9x4sm/site-cache.tgz\"}}"
time="2021-04-30T13:54:38.735Z" level=info msg="ISSUE #5733 - 2" art="{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\",\"artifactory\":{\"url\":\"empty-dir-9x4sm/empty-dir-9x4sm/site-cache.tgz\"}}"
time="2021-04-30T13:54:38.739Z" level=info msg="Get pods 200"
time="2021-04-30T13:54:38.739Z" level=info msg="Killing sidecars []"
time="2021-04-30T13:54:38.739Z" level=info msg="Alloc=6380 TotalAlloc=11414 Sys=73297 NumGC=3 Goroutines=9"
time="2021-04-30T13:54:38.741Z" level=fatal msg="executor panic: runtime error: invalid memory address or nil pointer dereference\ngoroutine 1 [running]:\nruntime/debug.Stack(0x2056245, 0x14, 0xc0000ec200)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9f\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).HandleError(0xc0003f02c0, 0x2372dc0, 0xc000056018)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:126 +0x1d6\npanic(0x1dae740, 0x3066f00)\n\t/usr/local/go/src/runtime/panic.go:975 +0x47a\ngithub.com/argoproj/argo-workflows/v3/workflow/artifacts.NewDriver(0x2372dc0, 0xc000056018, 0xc00069e180, 0x2334b40, 0xc0003f02c0, 0xc000271180, 0x0, 0xc00069e1b8, 0xc00069e180)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifacts.go:97 +0xb12\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).InitDriver(0xc0003f02c0, 0x2372dc0, 0xc000056018, 0xc00069e180, 0x0, 0xc0002710a0, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:594 +0x16e\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile(0xc0003f02c0, 0x2372dc0, 0xc000056018, 0xc00069e000, 0xc000155110, 0xe, 0xc000872450, 0x2a, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:320 +0x685\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifact(0xc0003f02c0, 0x2372dc0, 0xc000056018, 0x203c9e4, 0x4, 0xc00069e000, 0x0, 0x2056795)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:293 +0x485\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveArtifacts(0xc0003f02c0, 0x2372dc0, 0xc000056018, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:270 +0x374\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer(0x2372dc0, 0xc000056018, 0x0, 0x0)\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoe
+0x9ba\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1(0xc000707600, 0xc000453b80, 0x0, 0x2)\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18 +0x3d\ngithub.com/spf13/cobra.(*Command).execute(0xc000707600, 0xc000453b60, 0x2, 0x2, 0xc000707600, 0xc000453b60)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846 +0x2c2\ngithub.com/spf13/cobra.(*Command).ExecuteC(0xc000706dc0, 0xc000066778, 0xc00079ff78, 0x4062c5)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x375\ngithub.com/spf13/cobra.(*Command).Execute(...)\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main()\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:14 +0x2b\n"
bmayfi3ld commented 3 years ago

The key unsupported error I was getting above I think might be unrelated. I'm going to open a new ticket for it. https://github.com/argoproj/argo-workflows/issues/5782

alexec commented 3 years ago

@heyleke I think you're saying your issue is fixed?

heyleke commented 3 years ago

Yes, for me this solves the issue, still interested in being able to use the stack dump myself to debug such issues, any pointer for a newby with golang? 😁

bmayfi3ld commented 3 years ago

@heyleke Ardan labs is always a good resource https://www.ardanlabs.com/blog/2015/01/stack-traces-in-go.html

alexec commented 3 years ago

@heyleke I've created a polished version of my fix. Do you think you test it just in case I've un-fixed it? argoproj/argoexec:dev-5733

heyleke commented 3 years ago

Yes, I've double checked, you latest version is still fine:


time="2021-05-03T06:02:58.975Z" level=info msg="Creating a docker executor"
time="2021-05-03T06:02:58.975Z" level=info msg="Executor (version: untagged, build_date: 2021-05-02T23:26:34Z) initialized (pod: default/nested-wf-test-73877493) with template:\n{\"name\":\"sleepabit\",\"inputs\":{\"parameters\":[{\"name\":\"excluded_node\",\"value\":\"\"}]},\"outputs\":{\"parameters\":[{\"name\":\"nodename\",\"valueFrom\":{\"path\":\"nodename.txt\"},\"globalName\":\"nodename\"}]},\"affinity\":{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"kubernetes.io/hostname\",\"operator\":\"NotIn\",\"values\":[\"\"]}]}]}}},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"echo $NODE_NAME \\u003e nodename.txt \\u0026\\u0026 echo blablabla \\u0026\\u0026 sleep 5 \\u0026\\u0026 false\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}},\"retryStrategy\":{\"limit\":2,\"retryPolicy\":\"OnError\"}}"
time="2021-05-03T06:02:58.975Z" level=info msg="Starting annotations monitor"
time="2021-05-03T06:02:58.976Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-03T06:02:58.976Z" level=info msg="Starting deadline monitor"
time="2021-05-03T06:02:59.063Z" level=info msg="mapped container name \"main\" to container ID \"2fa15447eb03a172e20118564936f8ee34b1081aafe869ac2f0a84b2bf1af442\" (created at 2021-05-03 06:02:58 +0000 UTC, status Up)"
time="2021-05-03T06:02:59.063Z" level=info msg="mapped container name \"wait\" to container ID \"6a131d28ef34555d87a8c2261d284b8971d35a8abe5a4dc9c2a253cb9d2ef698\" (created at 2021-05-03 06:02:58 +0000 UTC, status Up)"
time="2021-05-03T06:02:59.976Z" level=info msg="docker wait 2fa15447eb03a172e20118564936f8ee34b1081aafe869ac2f0a84b2bf1af442"
time="2021-05-03T06:03:00.063Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-03T06:03:01.120Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-03T06:03:02.169Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-03T06:03:03.268Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-03T06:03:04.362Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-03T06:03:04.364Z" level=info msg="Main container completed"
time="2021-05-03T06:03:04.364Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-05-03T06:03:04.364Z" level=info msg="Saving logs"
time="2021-05-03T06:03:04.364Z" level=info msg="[docker logs 2fa15447eb03a172e20118564936f8ee34b1081aafe869ac2f0a84b2bf1af442]"
time="2021-05-03T06:03:05.241Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2021-05-03T06:03:05.241Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2021-05-03T06:03:05.241Z" level=info msg="Saving output parameters"
time="2021-05-03T06:03:05.241Z" level=info msg="Saving path output parameter: nodename"
time="2021-05-03T06:03:05.241Z" level=info msg="Copying nodename.txt from base image layer"
time="2021-05-03T06:03:05.242Z" level=info msg="sh -c docker cp -a 2fa15447eb03a172e20118564936f8ee34b1081aafe869ac2f0a84b2bf1af442:nodename.txt - | tar -ax -O"
time="2021-05-03T06:03:05.440Z" level=info msg="Successfully saved output parameter: nodename"
time="2021-05-03T06:03:05.440Z" level=info msg="No output artifacts"
time="2021-05-03T06:03:05.441Z" level=info msg="Annotating pod with output"
time="2021-05-03T06:03:05.453Z" level=info msg="Patch pods 200"
time="2021-05-03T06:03:05.461Z" level=info msg="Killing sidecars []"
time="2021-05-03T06:03:05.461Z" level=info msg="Alloc=6473 TotalAlloc=11275 Sys=73553 NumGC=3 Goroutines=10"
alexec commented 3 years ago

Can you please try both argoproj/workflow-controller:dev-5733 and argoproj/argoexec:dev-5733

bmayfi3ld commented 3 years ago

I'm now getting unsupported protocol scheme errors.

time="2021-05-03T19:21:02.615Z" level=info msg="Starting Workflow Executor" executorType=pns version=untagged
time="2021-05-03T19:21:02.622Z" level=info msg="Creating PNS executor (namespace: argo-wf-infrastructure, pod: empty-dir-rln8z, pid: 7)"
time="2021-05-03T19:21:02.622Z" level=info msg="Creating a K8sAPI executor"
time="2021-05-03T19:21:02.622Z" level=info msg="Executor (version: untagged, build_date: 2021-05-03T18:10:09Z) initialized (pod: argo-wf-infrastructure/empty-dir-rln8z) with template:\n{\"name\":\"pull-sites\",\"inputs\":{},\"outputs\":{\"artifacts\":[{\"name\":\"site-cache\",\"path\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"}]},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"infrastructure-docker-virtual.repo.org/go-ssl-domains:1.2.2\",\"args\":[\"pull\",\"sites\"],\"env\":[{\"name\":\"GSD_SITEFILENAME\",\"value\":\"/mnt/out/go_ssl_domains_domain_cache.yaml\"},{\"name\":\"GSD_CLIENTSECRET\",\"valueFrom\":{\"secretKeyRef\":{\"name\":\"argo-wf-infrastructure\",\"key\":\"akamai-client-secret\"}}}],\"resources\":{},\"volumeMounts\":[{\"name\":\"out\",\"mountPath\":\"/mnt/out\"}]},\"volumes\":[{\"name\":\"out\",\"emptyDir\":{}}],\"archiveLocation\":{\"artifactory\":{\"url\":\"empty-dir-rln8z/empty-dir-rln8z\",\"usernameSecret\":{\"name\":\"artifactory-secret\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-secret\",\"key\":\"password\"}}}}"
time="2021-05-03T19:21:02.622Z" level=info msg="Starting annotations monitor"
time="2021-05-03T19:21:02.623Z" level=info msg="Starting deadline monitor"
...pid stuff...
time="2021-05-03T19:21:07.632Z" level=info msg="Watch pods 200"
time="2021-05-03T19:21:07.642Z" level=info msg="Main container completed"
time="2021-05-03T19:21:07.642Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-05-03T19:21:07.642Z" level=info msg="No output parameters"
time="2021-05-03T19:21:07.642Z" level=info msg="Saving output artifacts"
time="2021-05-03T19:21:07.642Z" level=info msg="Staging artifact: site-cache"
time="2021-05-03T19:21:07.642Z" level=info msg="Staging /mnt/out/go_ssl_domains_domain_cache.yaml from mirrored volume mount /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-05-03T19:21:07.642Z" level=info msg="Taring /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-05-03T19:21:07.658Z" level=info msg="Successfully staged /mnt/out/go_ssl_domains_domain_cache.yaml from mirrored volume mount /mainctrfs/mnt/out/go_ssl_domains_domain_cache.yaml"
time="2021-05-03T19:21:07.659Z" level=error msg="executor error: Put \"empty-dir-rln8z/empty-dir-rln8z/site-cache.tgz\": unsupported protocol scheme \"\""
time="2021-05-03T19:21:07.659Z" level=info msg="Annotating pod with output"
time="2021-05-03T19:21:07.674Z" level=info msg="Patch pods 200"
time="2021-05-03T19:21:07.675Z" level=info msg="Killing sidecars [\"pid/25\"]"
time="2021-05-03T19:21:07.682Z" level=info msg="Get pods 200"
time="2021-05-03T19:21:07.682Z" level=info msg="Sending SIGTERM to pid 25"
time="2021-05-03T19:21:07.682Z" level=warning msg="Failed to SIGTERM pid 25: os: process already finished"
time="2021-05-03T19:21:08.682Z" level=info msg="PID 25 completed"
time="2021-05-03T19:21:08.683Z" level=info msg="Alloc=7362 TotalAlloc=15762 Sys=73553 NumGC=4 Goroutines=10"
time="2021-05-03T19:21:08.685Z" level=fatal msg="Put \"empty-dir-rln8z/empty-dir-rln8z/site-cache.tgz\": unsupported protocol scheme \"\""
heyleke commented 3 years ago

Can you please try both argoproj/workflow-controller:dev-5733 and argoproj/argoexec:dev-5733

with both I get following error:


time="2021-05-04T07:08:14.448Z" level=info msg="Creating a docker executor"
time="2021-05-04T07:08:14.448Z" level=info msg="Executor (version: untagged, build_date: 2021-05-03T20:53:44Z) initialized (pod: default/nested-wf-test-73877493) with template:\n{\"name\":\"sleepabit\",\"inputs\":{\"parameters\":[{\"name\":\"excluded_node\",\"value\":\"\"}]},\"outputs\":{\"parameters\":[{\"name\":\"nodename\",\"valueFrom\":{\"path\":\"nodename.txt\"},\"globalName\":\"nodename\"}]},\"affinity\":{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"kubernetes.io/hostname\",\"operator\":\"NotIn\",\"values\":[\"\"]}]}]}}},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"echo $NODE_NAME \\u003e nodename.txt \\u0026\\u0026 echo blablabla \\u0026\\u0026 sleep 5 \\u0026\\u0026 false\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}},\"retryStrategy\":{\"limit\":2,\"retryPolicy\":\"OnError\"}}"
time="2021-05-04T07:08:14.448Z" level=info msg="Starting annotations monitor"
time="2021-05-04T07:08:14.448Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T07:08:14.449Z" level=info msg="Starting deadline monitor"
time="2021-05-04T07:08:14.506Z" level=info msg="mapped container name \"main\" to container ID \"b1110d1a7815e58fb7485888aa521f15532a292a796528cbcdbed3dedc812515\" (created at 2021-05-04 07:08:14 +0000 UTC, status Created)"
time="2021-05-04T07:08:14.506Z" level=info msg="mapped container name \"wait\" to container ID \"5890bad96eb0685e47115e86b8620fd7383831c0849ac720a8cda55c24825d3f\" (created at 2021-05-04 07:08:14 +0000 UTC, status Up)"
time="2021-05-04T07:08:15.448Z" level=info msg="docker wait b1110d1a7815e58fb7485888aa521f15532a292a796528cbcdbed3dedc812515"
time="2021-05-04T07:08:15.507Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T07:08:16.537Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T07:08:17.569Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T07:08:18.601Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T07:08:19.634Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T07:08:19.655Z" level=info msg="Main container completed"
time="2021-05-04T07:08:19.655Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-05-04T07:08:19.655Z" level=info msg="Saving logs"
time="2021-05-04T07:08:19.655Z" level=info msg="[docker logs b1110d1a7815e58fb7485888aa521f15532a292a796528cbcdbed3dedc812515]"
time="2021-05-04T07:08:20.326Z" level=error msg="executor error: saving file /tmp/argo/outputs/logs/main.log to artifactory failed with reason:400 Bad Request\ngithub.com/argoproj/argo-workflows/v3/errors.New\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:49\ngithub.com/argoproj/argo-workflows/v3/errors.Errorf\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:55\ngithub.com/argoproj/argo-workflows/v3/errors.InternalErrorf\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:65\ngithub.com/argoproj/argo-workflows/v3/workflow/artifacts/artifactory.(*ArtifactDriver).Save\n\t/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifactory/artifactory.go:74\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:321\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveLogs\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:543\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:52\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:15\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:204\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374"
time="2021-05-04T07:08:20.326Z" level=info msg="Saving output parameters"
time="2021-05-04T07:08:20.326Z" level=info msg="Saving path output parameter: nodename"
time="2021-05-04T07:08:20.326Z" level=info msg="Copying nodename.txt from base image layer"
time="2021-05-04T07:08:20.327Z" level=info msg="sh -c docker cp -a b1110d1a7815e58fb7485888aa521f15532a292a796528cbcdbed3dedc812515:nodename.txt - | tar -ax -O"
time="2021-05-04T07:08:20.479Z" level=info msg="Successfully saved output parameter: nodename"
time="2021-05-04T07:08:20.479Z" level=info msg="No output artifacts"
time="2021-05-04T07:08:20.479Z" level=info msg="Annotating pod with output"
time="2021-05-04T07:08:20.518Z" level=info msg="Patch pods 200"
time="2021-05-04T07:08:20.530Z" level=info msg="Killing sidecars []"
alexec commented 3 years ago

Good. We're making progress. I've just added a line of debugging to the branch. Can you run again please?

heyleke commented 3 years ago

time="2021-05-04T15:10:44.259Z" level=info msg="Creating a docker executor"
time="2021-05-04T15:10:44.259Z" level=info msg="Executor (version: untagged, build_date: 2021-05-04T14:38:47Z) initialized (pod: default/nested-wf-test-73877493) with template:\n{\"name\":\"sleepabit\",\"inputs\":{\"parameters\":[{\"name\":\"excluded_node\",\"value\":\"\"}]},\"outputs\":{\"parameters\":[{\"name\":\"nodename\",\"valueFrom\":{\"path\":\"nodename.txt\"},\"globalName\":\"nodename\"}]},\"affinity\":{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"kubernetes.io/hostname\",\"operator\":\"NotIn\",\"values\":[\"\"]}]}]}}},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"echo $NODE_NAME \\u003e nodename.txt \\u0026\\u0026 echo blablabla \\u0026\\u0026 sleep 5 \\u0026\\u0026 false\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}},\"retryStrategy\":{\"limit\":2,\"retryPolicy\":\"OnError\"}}"
time="2021-05-04T15:10:44.259Z" level=info msg="Starting annotations monitor"
time="2021-05-04T15:10:44.260Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:44.261Z" level=info msg="Starting deadline monitor"
time="2021-05-04T15:10:44.427Z" level=info msg="mapped container name \"main\" to container ID \"c589680a53acc7e60a085232a8a7555c378625913465f02a451ec6fc129e3341\" (created at 2021-05-04 15:10:44 +0000 UTC, status Created)"
time="2021-05-04T15:10:44.427Z" level=info msg="mapped container name \"wait\" to container ID \"3db48eee49bf7722cbcd5af5ca1934be9518d7512d8e75b0f9a7f8550c02ed9b\" (created at 2021-05-04 15:10:43 +0000 UTC, status Up)"
time="2021-05-04T15:10:45.264Z" level=info msg="docker wait c589680a53acc7e60a085232a8a7555c378625913465f02a451ec6fc129e3341"
time="2021-05-04T15:10:45.428Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:46.623Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:47.665Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:48.709Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:49.732Z" level=info msg="Main container completed"
time="2021-05-04T15:10:49.732Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-05-04T15:10:49.732Z" level=info msg="Saving logs"
time="2021-05-04T15:10:49.733Z" level=info msg="[docker logs c589680a53acc7e60a085232a8a7555c378625913465f02a451ec6fc129e3341]"
time="2021-05-04T15:10:49.830Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:49.864Z" level=info msg="Issue #5733: Save" artifact="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-05-04T15:10:50.842Z" level=error msg="executor error: saving file /tmp/argo/outputs/logs/main.log to artifactory failed with reason:400 Bad Request\ngithub.com/argoproj/argo-workflows/v3/errors.New\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:49\ngithub.com/argoproj/argo-workflows/v3/errors.Errorf\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:55\ngithub.com/argoproj/argo-workflows/v3/errors.InternalErrorf\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:65\ngithub.com/argoproj/argo-workflows/v3/workflow/artifacts/artifactory.(*ArtifactDriver).Save\n\t/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifactory/artifactory.go:76\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:321\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveLogs\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:543\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:52\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:15\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:204\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374"
time="2021-05-04T15:10:50.842Z" level=info msg="Saving output parameters"
time="2021-05-04T15:10:50.842Z" level=info msg="Saving path output parameter: nodename"
time="2021-05-04T15:10:50.842Z" level=info msg="Copying nodename.txt from base image layer"
time="2021-05-04T15:10:50.843Z" level=info msg="sh -c docker cp -a c589680a53acc7e60a085232a8a7555c378625913465f02a451ec6fc129e3341:nodename.txt - | tar -ax -O"
time="2021-05-04T15:10:50.955Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:10:51.223Z" level=info msg="Successfully saved output parameter: nodename"
time="2021-05-04T15:10:51.223Z" level=info msg="No output artifacts"
time="2021-05-04T15:10:51.223Z" level=info msg="Annotating pod with output"
time="2021-05-04T15:10:51.243Z" level=info msg="Patch pods 200"
time="2021-05-04T15:10:51.257Z" level=info msg="Killing sidecars []"
time="2021-05-04T15:10:51.257Z" level=info msg="Alloc=6216 TotalAlloc=11367 Sys=73297 NumGC=3 Goroutines=10"
time="2021-05-04T15:10:51.262Z" level=fatal msg="saving file /tmp/argo/outputs/logs/main.log to artifactory failed with reason:400 Bad Request\ngithub.com/argoproj/argo-workflows/v3/errors.New\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:49\ngithub.com/argoproj/argo-workflows/v3/errors.Errorf\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:55\ngithub.com/argoproj/argo-workflows/v3/errors.InternalErrorf\n\t/go/src/github.com/argoproj/argo-workflows/errors/errors.go:65\ngithub.com/argoproj/argo-workflows/v3/workflow/artifacts/artifactory.(*ArtifactDriver).Save\n\t/go/src/github.com/argoproj/argo-workflows/workflow/artifacts/artifactory/artifactory.go:76\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).saveArtifactFromFile\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:321\ngithub.com/argoproj/argo-workflows/v3/workflow/executor.(*WorkflowExecutor).SaveLogs\n\t/go/src/github.com/argoproj/argo-workflows/workflow/executor/executor.go:543\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.waitContainer\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:52\ngithub.com/argoproj/argo-workflows/v3/cmd/argoexec/commands.NewWaitCommand.func1\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/commands/wait.go:18\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\nmain.main\n\t/go/src/github.com/argoproj/argo-workflows/cmd/argoexec/main.go:15\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:204\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374"
heyleke commented 3 years ago

note: the log that is now trying to be put on artfiactory is: http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log

while the only path that already exist is: http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/

where old logs are seen, with the same filename as the directory being created: image

alexec commented 3 years ago

It's a 400 error, not 403/401 so I think auth is correct. But the URL may be wrong.

Does it appear to be inserting nested-wf-test-73877493 into the URL?

heyleke commented 3 years ago

If I remove the old file (from previous runs), it works fine:

time="2021-05-04T15:16:35.522Z" level=info msg="Creating a docker executor"
time="2021-05-04T15:16:35.522Z" level=info msg="Executor (version: untagged, build_date: 2021-05-04T14:38:47Z) initialized (pod: default/nested-wf-test-73877493) with template:\n{\"name\":\"sleepabit\",\"inputs\":{\"parameters\":[{\"name\":\"excluded_node\",\"value\":\"\"}]},\"outputs\":{\"parameters\":[{\"name\":\"nodename\",\"valueFrom\":{\"path\":\"nodename.txt\"},\"globalName\":\"nodename\"}]},\"affinity\":{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"kubernetes.io/hostname\",\"operator\":\"NotIn\",\"values\":[\"\"]}]}]}}},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"sh\",\"-c\"],\"args\":[\"echo $NODE_NAME \\u003e nodename.txt \\u0026\\u0026 echo blablabla \\u0026\\u0026 sleep 5 \\u0026\\u0026 false\"],\"env\":[{\"name\":\"NODE_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"spec.nodeName\"}}}],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}},\"retryStrategy\":{\"limit\":2,\"retryPolicy\":\"OnError\"}}"
time="2021-05-04T15:16:35.522Z" level=info msg="Starting annotations monitor"
time="2021-05-04T15:16:35.523Z" level=info msg="Starting deadline monitor"
time="2021-05-04T15:16:35.523Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:16:35.636Z" level=info msg="mapped container name \"main\" to container ID \"c82f724116a9f120e1e59ba3a9eba0b794dd407ca8605e6d83303631dd893035\" (created at 2021-05-04 15:16:35 +0000 UTC, status Created)"
time="2021-05-04T15:16:35.636Z" level=info msg="mapped container name \"wait\" to container ID \"e14b7fd842de3601aa50c5b17b6642bf1e61c7e678e577e8e0066692fe6dd185\" (created at 2021-05-04 15:16:35 +0000 UTC, status Up)"
time="2021-05-04T15:16:35.636Z" level=info msg="ignoring container \"main\" created at 2021-05-04 15:10:44 +0000 UTC, too long before process started"
time="2021-05-04T15:16:35.636Z" level=info msg="ignoring container \"wait\" created at 2021-05-04 15:10:43 +0000 UTC, too long before process started"
time="2021-05-04T15:16:36.523Z" level=info msg="docker wait c82f724116a9f120e1e59ba3a9eba0b794dd407ca8605e6d83303631dd893035"
time="2021-05-04T15:16:36.637Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:16:36.731Z" level=info msg="ignoring container \"main\" created at 2021-05-04 15:10:44 +0000 UTC, too long before process started"
time="2021-05-04T15:16:36.731Z" level=info msg="ignoring container \"wait\" created at 2021-05-04 15:10:43 +0000 UTC, too long before process started"
time="2021-05-04T15:16:37.731Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:16:37.893Z" level=info msg="ignoring container \"main\" created at 2021-05-04 15:10:44 +0000 UTC, too long before process started"
time="2021-05-04T15:16:37.893Z" level=info msg="ignoring container \"wait\" created at 2021-05-04 15:10:43 +0000 UTC, too long before process started"
time="2021-05-04T15:16:38.893Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:16:39.023Z" level=info msg="ignoring container \"main\" created at 2021-05-04 15:10:44 +0000 UTC, too long before process started"
time="2021-05-04T15:16:39.023Z" level=info msg="ignoring container \"wait\" created at 2021-05-04 15:10:43 +0000 UTC, too long before process started"
time="2021-05-04T15:16:40.023Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:16:40.080Z" level=info msg="ignoring container \"main\" created at 2021-05-04 15:10:44 +0000 UTC, too long before process started"
time="2021-05-04T15:16:40.080Z" level=info msg="ignoring container \"wait\" created at 2021-05-04 15:10:43 +0000 UTC, too long before process started"
time="2021-05-04T15:16:40.900Z" level=info msg="Main container completed"
time="2021-05-04T15:16:40.901Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-05-04T15:16:40.901Z" level=info msg="Saving logs"
time="2021-05-04T15:16:40.901Z" level=info msg="[docker logs c82f724116a9f120e1e59ba3a9eba0b794dd407ca8605e6d83303631dd893035]"
time="2021-05-04T15:16:40.980Z" level=info msg="Issue #5733: Save" artifact="{\"name\":\"main-logs\",\"archiveLogs\":true,\"artifactory\":{\"url\":\"http://artifactory-espoo1.int.net.nokia.com/artifactory/fixedaccess-sw-rpm-local/nested-wf-test/nested-wf-test-73877493/main.log\",\"usernameSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"username\"},\"passwordSecret\":{\"name\":\"artifactory-sandbox\",\"key\":\"password\"}}}"
time="2021-05-04T15:16:41.012Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2021-05-04T15:16:41.012Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2021-05-04T15:16:41.012Z" level=info msg="Saving output parameters"
time="2021-05-04T15:16:41.012Z" level=info msg="Saving path output parameter: nodename"
time="2021-05-04T15:16:41.012Z" level=info msg="Copying nodename.txt from base image layer"
time="2021-05-04T15:16:41.012Z" level=info msg="sh -c docker cp -a c82f724116a9f120e1e59ba3a9eba0b794dd407ca8605e6d83303631dd893035:nodename.txt - | tar -ax -O"
time="2021-05-04T15:16:41.080Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=default --filter=label=io.kubernetes.pod.name=nested-wf-test-73877493"
time="2021-05-04T15:16:41.138Z" level=info msg="ignoring container \"main\" created at 2021-05-04 15:10:44 +0000 UTC, too long before process started"
time="2021-05-04T15:16:41.138Z" level=info msg="ignoring container \"wait\" created at 2021-05-04 15:10:43 +0000 UTC, too long before process started"
time="2021-05-04T15:16:41.226Z" level=info msg="Successfully saved output parameter: nodename"
time="2021-05-04T15:16:41.226Z" level=info msg="No output artifacts"
time="2021-05-04T15:16:41.226Z" level=info msg="Annotating pod with output"
time="2021-05-04T15:16:41.283Z" level=info msg="Patch pods 200"
time="2021-05-04T15:16:41.298Z" level=info msg="Killing sidecars []"
time="2021-05-04T15:16:41.298Z" level=info msg="Alloc=6155 TotalAlloc=11334 Sys=73553 NumGC=3 Goroutines=10"

so this collision in filename seems the reason for the 400

alexec commented 3 years ago

I'm not clear - you re-ran with the same workflow name and see we tried to upload the same file?

heyleke commented 3 years ago

The original dev-5733 image put logs in the root of that folder, while with your latest version, logs are put as 'main.log' (name of container) in a subfolder with the workflow name, that was the same name as the log that was there already and so gave a collision

alexec commented 3 years ago

Thank you @heyleke - I'm not clear - is this is a bug?

heyleke commented 3 years ago

No, just a side effect of the different implementations I guess

alexec commented 3 years ago

So it’s fixed?

heyleke commented 3 years ago

yes