blee-gl closed this issue 4 months ago
I am facing the same issue with kubernetes.set_image_pull_policy and kfp.kubernetes.set_image_pull_secrets.

### Environment
KFP version: 2.0.5
KFP SDK version: 2.7.0
All dependencies version:
kfp 2.7.0
kfp-kubernetes 1.2.0
kfp-pipeline-spec 0.3.0
kfp-server-api 2.0.3
components:
  comp-simple-task:
    executorLabel: exec-simple-task
deploymentSpec:
  executors:
    exec-simple-task:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - simple_task
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0'\
          \ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' && \"\
          $0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)


          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          _KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component_module_path "$program_path/ephemeral_component.py" "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef simple_task():\n print(\"hello-world\")\n\n"
        image: python:3.7
pipelineInfo:
  name: pipeline
root:
  dag:
    tasks:
      simple-task:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-simple-task
        taskInfo:
          name: simple-task
schemaVersion: 2.1.0
sdkVersion: kfp-2.7.0
---
platforms:
  kubernetes:
    deploymentSpec:
      executors:
        exec-simple-task:
          imagePullPolicy: Always
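
For anyone checking a compiled spec before uploading it, the sketch below lists executors whose platform config sets a field that the driver in this thread rejected. It is only an illustration: `REPORTED_UNSUPPORTED` and `flag_reported_fields` are hypothetical names, the set holds just the two fields named in this thread's logs (imagePullPolicy and podMetadata) and is not a complete list, and the input is a hand-written dict mirroring the platforms document rather than parsed YAML.

```python
from typing import Dict, List

# Assumption: only the fields reported in this thread as rejected by the
# KFP 2.0.5 driver. Older backends may reject other fields as well.
REPORTED_UNSUPPORTED = {"podMetadata", "imagePullPolicy"}


def flag_reported_fields(platform_spec: dict) -> Dict[str, List[str]]:
    """Map each executor name to the reported-problem fields it sets."""
    executors = (
        platform_spec.get("platforms", {})
        .get("kubernetes", {})
        .get("deploymentSpec", {})
        .get("executors", {})
    )
    flagged: Dict[str, List[str]] = {}
    for name, config in executors.items():
        hits = sorted(REPORTED_UNSUPPORTED & set(config))
        if hits:
            flagged[name] = hits
    return flagged


# Hand-written dict mirroring the platforms document of the compiled spec.
spec = {
    "platforms": {
        "kubernetes": {
            "deploymentSpec": {
                "executors": {
                    "exec-simple-task": {"imagePullPolicy": "Always"},
                }
            }
        }
    }
}

print(flag_reported_fields(spec))
```

An empty result only means none of the two reported fields is present; it does not guarantee the spec will run on an older backend.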
### Pod error
time="2024-06-06T07:30:23.403Z" level=info msg="capturing logs" argo=true
I0606 07:30:23.474031 19 main.go:105] input ComponentSpec:{ "executorLabel": "exec-simple-task" }
I0606 07:30:23.474488 19 main.go:112] input TaskSpec:{ "cachingOptions": { "enableCache": true }, "componentRef": { "name": "comp-simple-task" }, "taskInfo": { "name": "simple-task" } }
I0606 07:30:23.474637 19 main.go:118] input ContainerSpec:{ "args": [ "--executor_input", "{{$}}", "--function_to_execute", "simple_task" ], "command": [ "sh", "-c", "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0' '--no-deps' 'typing-extensions\u003e=3.7.4,\u003c5; python_version\u003c\"3.9\"' \u0026\u0026 \"$0\" \"$@\"\n", "sh", "-ec", "program_path=$(mktemp -d)\n\nprintf \"%s\" \"$0\" \u003e \"$program_path/ephemeral_component.py\"\n_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component_module_path \"$program_path/ephemeral_component.py\" \"$@\"\n", "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import *\n\ndef simple_task():\n print(\"hello-world\")\n\n" ], "image": "python:3.7" }
I0606 07:30:23.474760 19 main.go:133] input kubernetesConfig:{ "imagePullPolicy": "Always" }
F0606 07:30:23.474840 19 main.go:76] KFP driver: failed to unmarshal Kubernetes config, error: unknown field "imagePullPolicy" in kfp_kubernetes.KubernetesExecutorConfig KubernetesConfig: 0xc000457210
time="2024-06-06T07:30:24.406Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2024-06-06T07:30:24.407Z" level=error msg="cannot save parameter /tmp/outputs/pod-spec-patch" argo=true error="open /tmp/outputs/pod-spec-patch: no such file or directory"
time="2024-06-06T07:30:24.407Z" level=error msg="cannot save parameter /tmp/outputs/cached-decision" argo=true error="open /tmp/outputs/cached-decision: no such file or directory"
time="2024-06-06T07:30:24.407Z" level=error msg="cannot save parameter /tmp/outputs/condition" argo=true error="open /tmp/outputs/condition: no such file or directory"
Error: exit status 1
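
The line that matters in a log like this is the fatal `F...` entry; the "cannot save parameter" errors after it look like consequences of the driver exiting early. A minimal sketch for pulling the rejected field name out of such a log, assuming the message keeps the wording quoted in this thread (`find_unknown_field` is a hypothetical helper, not part of KFP):

```python
import re
from typing import Optional

# Pattern based on the fatal driver line quoted in this thread; the wording
# may differ in other KFP versions.
_UNKNOWN_FIELD = re.compile(
    r'failed to unmarshal Kubernetes config, error: unknown field "([^"]+)"'
)


def find_unknown_field(log_text: str) -> Optional[str]:
    """Return the first field name the driver rejected, or None if absent."""
    match = _UNKNOWN_FIELD.search(log_text)
    return match.group(1) if match else None


sample = (
    "F0606 07:30:23.474840 19 main.go:76] KFP driver: failed to unmarshal "
    'Kubernetes config, error: unknown field "imagePullPolicy" in '
    "kfp_kubernetes.KubernetesExecutorConfig"
)

print(find_unknown_field(sample))
```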
I've spoken with @rimolive on the #kubeflow-pipelines channel on CNCF Slack. The conclusion was that this is a compatibility issue between the kfp-kubernetes package version and the Kubeflow Pipelines version.
kfp-kubernetes v1.2.0 introduces kfp.kubernetes.add_pod_label(), kfp.kubernetes.add_pod_annotation(), kfp.kubernetes.set_image_pull_policy(), and kfp.kubernetes.set_image_pull_secrets(), alongside other functionality, and is released as part of the KFP v2.2.0 release.
I'm running Kubeflow v1.8 with KFP v2.0.5, which means that my version of KFP is not compatible with pipeline specs generated using kfp-kubernetes v1.2.0.
The solution is to upgrade KFP to v2.2.0. Kubeflow 1.9 is planned for release in July 2024 and is expected to ship with KFP v2.2.0.
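
As a rough pre-flight check, the version comparison can be sketched as below. The 2.2.0 threshold is taken from the conclusion in this thread rather than from an official compatibility matrix, and `parse_version` and `backend_is_new_enough` are hypothetical helpers that handle plain numeric versions only.

```python
from typing import Tuple

# Assumption from this thread: the first backend release that accepts specs
# generated by kfp-kubernetes 1.2.0.
MIN_BACKEND_FOR_KFP_KUBERNETES_1_2 = "2.2.0"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn 'v2.0.5' or '2.0.5' into (2, 0, 5). Numeric parts only."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def backend_is_new_enough(
    backend_version: str,
    minimum: str = MIN_BACKEND_FOR_KFP_KUBERNETES_1_2,
) -> bool:
    """Compare versions component by component instead of as strings."""
    return parse_version(backend_version) >= parse_version(minimum)


print(backend_is_new_enough("2.0.5"))  # the setup described in this thread
print(backend_is_new_enough("2.2.0"))
```

Comparing tuples of integers avoids the string-comparison trap where "2.10.0" would sort before "2.2.0".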
Marking this issue as closed.
CC @diankasileymane.
### Environment

### Steps to reproduce
Take the hello world v2 pipelines example script and use kubernetes.add_pod_label to add a label to the pod created for the hello_world step as defined in the kfp-kubernetes documentation; code shown below. Then compile the pipeline.

The pipeline spec YAML generated:

Upload the pipeline and execute a pipeline run, which results in a failure with an error stating "Resource failed to execute":
### Expected result
The pipeline should be able to successfully execute and the hello-world task pod should have the label "test-label" with value "test-value" attached to the pod.

### Materials and Reference
Looking into the failed pod logs, this error is given:
F0605 12:20:44.976828 20 main.go:76] KFP driver: failed to unmarshal Kubernetes config, error: unknown field "podMetadata" in kfp_kubernetes.KubernetesExecutorConfig KubernetesConfig: 0xc0002fb3f0
Full log stack below:
Note that both kubernetes.add_pod_label and kubernetes.add_pod_annotation use the podMetadata field, which means using either will result in the error above.

Impacted by this bug? Give it a 👍.