argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Templates of the same name are not validated #13763

Open tczhao opened 1 month ago

tczhao commented 1 month ago


What happened? What did you expect to happen?

When a workflow is submitted, its templates are validated and stored in status.storedTemplates.

However, there is a bug: when templates from different template scopes share the same template name, the later template of that name is not validated. As the reproduction below shows, the templates it references are then missing from storedTemplates.

Our ultimate goal is to have every template validated and recorded in storedTemplates, so that changes to an existing template do not affect a workflow that is already running.

To reproduce the issue, load the example below into Kubernetes, then run:

./dist/argo submit --from workflowtemplate/workflow-template-dag --dry-run -o yaml

The following are present in storedTemplates

    cluster/workflow-template-print-message/print-message:
    cluster/workflow-template-print-message1/print-message:
    namespaced/workflow-template-dag/hello:

The following are missing in storedTemplates

    cluster/workflow-template-print-message1/print-message11:
    cluster/workflow-template-print-message1/print-message12:

If we rename the template cluster/workflow-template-print-message1/print-message to cluster/workflow-template-print-message1/print-message1, all templates appear in storedTemplates.
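The check described above can be scripted. The following is a minimal sketch, not part of Argo: the helper name missing_stored_templates is hypothetical, and it assumes the dry-run output has been obtained as JSON (for example with -o json instead of -o yaml) and parsed into a dictionary. The sample data only mirrors the keys listed in this report.

```python
import json

# Keys this report expects to find under status.storedTemplates.
EXPECTED = [
    "cluster/workflow-template-print-message/print-message",
    "cluster/workflow-template-print-message1/print-message",
    "cluster/workflow-template-print-message1/print-message11",
    "cluster/workflow-template-print-message1/print-message12",
    "namespaced/workflow-template-dag/hello",
]


def missing_stored_templates(workflow, expected_keys):
    """Return the expected storedTemplates keys absent from a workflow dict."""
    stored = (workflow.get("status") or {}).get("storedTemplates") or {}
    return sorted(key for key in expected_keys if key not in stored)


# Hand-written sample that mirrors the keys reported as present above;
# it is not real CLI output.
sample = json.loads("""
{
  "status": {
    "storedTemplates": {
      "cluster/workflow-template-print-message/print-message": {},
      "cluster/workflow-template-print-message1/print-message": {},
      "namespaced/workflow-template-dag/hello": {}
    }
  }
}
""")

for key in missing_stored_templates(sample, EXPECTED):
    print("missing:", key)
```

With the sample data, the two print-message11 and print-message12 keys are reported as missing, matching the list above.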

Version(s)

latest

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; do not enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: workflow-template-print-message1
spec:
  templates:
  - name: print-message
    dag:
      tasks:
      - name: print-message11
        template: print-message11
      - name: print-message12
        template: print-message12
        depends: "print-message11.Failed || print-message11.Errored"
  - name: print-message11
    container:
      image: busybox
      command: [sleep]
      args: ["15"]
  - name: print-message12
    container:
      image: busybox
      command: [sleep]
      args: ["15"]
---
apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: workflow-template-print-message
spec:
  templates:
  - name: print-message
    dag:
      tasks:
      - name: hello1
        templateRef:
          name: workflow-template-print-message1
          template: print-message
          clusterScope: true
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-dag
spec:
  entrypoint: hello
  templates:
  - name: hello
    retryStrategy:
      limit: 10
      retryPolicy: "Always"
    inputs:
      parameters:
        - name: foo
          value: foo
    dag:
      tasks:
      - name: hello1
        templateRef:
          name: workflow-template-print-message
          template: print-message
          clusterScope: true

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

tczhao commented 1 month ago

The bug is caused by these lines: https://github.com/argoproj/argo-workflows/blob/v3.6.0-rc2/workflow/validate/validate.go#L477-L482
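The linked Go code is not reproduced here. As an illustration only, the symptom in this report is consistent with a "have I already validated this?" check that is keyed by template name alone rather than by scope, holder, and name. The Python model below is a hypothetical sketch under that assumption; the TEMPLATES table and the collect_stored_templates function are invented for the example and do not correspond to functions in validate.go.

```python
# Hypothetical model of template resolution; NOT the code in validate.go.
# Each (scope, holder) maps template names to the templates they reference.
TEMPLATES = {
    ("namespaced", "workflow-template-dag"): {
        "hello": [("cluster", "workflow-template-print-message", "print-message")],
    },
    ("cluster", "workflow-template-print-message"): {
        "print-message": [
            ("cluster", "workflow-template-print-message1", "print-message"),
        ],
    },
    ("cluster", "workflow-template-print-message1"): {
        "print-message": [
            ("cluster", "workflow-template-print-message1", "print-message11"),
            ("cluster", "workflow-template-print-message1", "print-message12"),
        ],
        "print-message11": [],
        "print-message12": [],
    },
}


def collect_stored_templates(entry, key_by_full_path):
    """Walk references from entry and return the set of stored template keys."""
    stored = set()
    validated = set()

    def visit(scope, holder, name):
        full = f"{scope}/{holder}/{name}"
        stored.add(full)  # the referenced template itself is always stored
        dedup_key = full if key_by_full_path else name
        if dedup_key in validated:
            return  # assumed bug: a same-name template is treated as done
        validated.add(dedup_key)
        for child in TEMPLATES[(scope, holder)][name]:
            visit(*child)

    visit(*entry)
    return stored


ENTRY = ("namespaced", "workflow-template-dag", "hello")
by_name = collect_stored_templates(ENTRY, key_by_full_path=False)
by_path = collect_stored_templates(ENTRY, key_by_full_path=True)
print(sorted(by_path - by_name))
```

In this model, keying by name stores three templates and skips print-message11 and print-message12, which matches the dry-run output in the report; keying by the full scope/holder/name path stores all five.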

Joibel commented 1 month ago

Highlighting my key points on this: