tektoncd / pipeline

A cloud-native Pipeline resource.
https://tekton.dev
Apache License 2.0

nodeSelector value not propagated to child pipelines in pipeline-in-pipeline feature #7274

Open gigatesseract opened 10 months ago

gigatesseract commented 10 months ago

Expected Behavior

The node selector should be passed to child pipelines.

Actual Behavior

When running a nested pipeline configuration, the node selector value is not passed to child pipelines. As a result, pods created for the child pipeline have no nodeSelector set.

Steps to Reproduce the Problem

  1. Ensure pipeline-in-pipeline is enabled (feature flag set to True and pod running in the tekton-pipelines namespace).
  2. Create a custom node selector.
  3. Apply the following YAMLs in the same namespace:
    apiVersion: tekton.dev/v1beta1
    kind: Task
    metadata:
      name: hello-world
    spec:
      steps:
        - name: echo
          image: alpine
          script: |
            #!/bin/sh
            echo "Hello World"
    ---
    apiVersion: tekton.dev/v1beta1
    kind: Pipeline
    metadata:
      name: pipeline-in-pipeline-test
    spec:
      tasks:
        - name: custom-task-inside-pipeline-in-pipeline
          taskRef:
            kind: Task
            name: hello-world
          timeout: 48h10m0s
    ---
    apiVersion: tekton.dev/v1beta1
    kind: Pipeline
    metadata:
      name: test-node-selector
    spec:
      tasks:
        - name: spawned-directly
          taskRef:
            kind: Task
            name: hello-world
        - name: indirect-pipeline-in-pipeline-test
          taskRef:
            apiVersion: tekton.dev/v1beta1
            kind: Pipeline
            name: pipeline-in-pipeline-test
          timeout: 24h0m0s
  4. Create the run YAML:
    apiVersion: tekton.dev/v1beta1
    kind: PipelineRun
    metadata:
      generateName: test-node-selector-run-
    spec:
      pipelineRef:
        name: test-node-selector
      podTemplate:
        nodeSelector:
          kubernetes.io/role: <CUSTOM_NODE_SELECTOR>
      timeout: 24h0m0s
  5. Describe the pod created for the task that belongs to the nested pipeline. The output looks something like:
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s

Additional Info

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.11", GitCommit:"8cfcba0b15c343a8dc48567a74c29ec4844e0b9e", GitTreeState:"clean", BuildDate:"2023-06-14T09:49:38Z", GoVersion:"go1.19.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.25) exceeds the supported minor version skew of +/-1

- Tekton Pipeline version:

  **Output of `tkn version` or `kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'`**


I don't have access to list pods in the `tekton-pipelines` namespace, but we have the dashboard UI, which reports the version as `v0.47.2`.

I can confirm that this issue is not present in `v0.35.1`. We have clusters running both of these versions of Tekton Pipelines, and the issue occurs only on `v0.47.2`.

gigatesseract commented 10 months ago

The issue seems to be here: https://github.com/tektoncd/experimental/tree/main/pipelines-in-pipelines/pkg/reconciler/pip/piprun.go in the function ReconcileKind, where we check whether the CustomRun has started and, if not, start the run.

However, I expect the spec to be copied as is from the parent PipelineRun to the CustomRun. I am new to Go; could you point me to the place in pipelines-in-pipelines where the spec of the PipelineRun is copied as is to the CustomRun?
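
To make the expectation above concrete, here is a minimal Go sketch of the kind of copy I have in mind. It is not the code from piprun.go or from any Tekton package: `PodTemplate`, `RunSpec`, and `childSpecFromParent` are hypothetical, simplified stand-ins that I made up for illustration, and the real types carry many more fields.

```go
package main

import "fmt"

// PodTemplate is a hypothetical, reduced stand-in for a pod template;
// only the nodeSelector field relevant to this issue is modelled.
type PodTemplate struct {
	NodeSelector map[string]string
}

// RunSpec is a hypothetical, reduced view of a run spec (assumption:
// the parent run and the child PipelineRun both carry a pod template).
type RunSpec struct {
	PipelineName string
	PodTemplate  *PodTemplate
}

// childSpecFromParent builds the spec for the nested PipelineRun and
// copies the parent's nodeSelector so child pods inherit it.
func childSpecFromParent(parent RunSpec, childPipeline string) RunSpec {
	child := RunSpec{PipelineName: childPipeline}
	if parent.PodTemplate == nil {
		// Nothing to propagate; the child keeps the default scheduling.
		return child
	}
	// Copy the map entry by entry so that later changes to the child
	// do not leak back into the parent's spec.
	selector := make(map[string]string, len(parent.PodTemplate.NodeSelector))
	for k, v := range parent.PodTemplate.NodeSelector {
		selector[k] = v
	}
	child.PodTemplate = &PodTemplate{NodeSelector: selector}
	return child
}

func main() {
	parent := RunSpec{
		PipelineName: "test-node-selector",
		PodTemplate: &PodTemplate{
			NodeSelector: map[string]string{"kubernetes.io/role": "custom"},
		},
	}
	child := childSpecFromParent(parent, "pipeline-in-pipeline-test")
	fmt.Println(child.PodTemplate.NodeSelector["kubernetes.io/role"])
}
```

With a copy like this, the pod for the nested task would be expected to show the same Node-Selectors value as the pod for the task spawned directly by the parent pipeline.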

sallyyama commented 9 months ago

I opened a PR to address this issue: https://github.com/tektoncd/experimental/pull/966