Closed Boes-man closed 7 months ago
Hello, I am trying to run this Kubeflow example pipeline (Build a more advanced ML pipeline). I modified it: I removed the endpoint and added the compiler. I can now upload the pipeline and run it, but it does not complete. I have found some error messages related to RBAC and "files not found", but I am not sure whether they are related or how to fix them. Thanks
iris-train-pl.py.txt iris-train-pl.yaml.txt
Error message from pods:
} │
│ I0208 00:43:52.024138 20 main.go:118] input ContainerSpec:{ │
│ "args": [ │
│ "--executor_input", │
│ "{{$}}", │
│ "--function_to_execute", │
│ "train_model" │
│ ], │
│ "command": [ │
│ "sh", │
│ "-c", │
│ "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip i │
│ "sh", │
│ "-ec", │
│ "program_path=$(mktemp -d)\n\nprintf \"%s\" \"$0\" \u003e \"$program_path/ephemeral_component.py\"\n_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component │
│ "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import \nfrom typing import \n\ndef train_model(\n normalized_iris_dataset: Input[Dataset],\n model: Output[Model],\n n_neighb │
│ ], │
│ "image": "python:3.7" │
│ } │
│ I0208 00:43:52.024455 20 cache.go:139] Cannot detect ml-pipeline in the same namespace, default to ml-pipeline.kubeflow:8887 as KFP endpoint. │
│ I0208 00:43:52.024469 20 cache.go:116] Connecting to cache endpoint ml-pipeline.kubeflow:8887 │
│ I0208 00:43:52.057353 20 client.go:251] Pipeline Context: id:18 name:"iris-training-pipeline" type_id:11 create_time_since_epoch:1707347981672 last_update_time_since_epoch:1707347981672 │
│ I0208 00:43:52.096094 20 client.go:259] Pipeline Run Context: id:22 name:"7142d5a6-0b61-4e1f-9c63-ac17d7c2ae67" type_id:12 custom_properties:{key:"namespace" value:{string_value:"kubefl │
│ I0208 00:43:52.257403 20 driver.go:241] parent DAG: id:75 type_id:13 last_known_state:RUNNING custom_properties:{key:"display_name" value:{string_value:"for-loop-1"}} custom_properties: │
│ I0208 00:43:52.258490 20 driver.go:771] parent DAG input parameters map[pipelinechannel--neighbors-loop-item:number_value:3] │
│ F0208 00:43:52.258574 20 main.go:76] KFP driver: driver.Container(pipelineName=iris-training-pipeline, runID=7142d5a6-0b61-4e1f-9c63-ac17d7c2ae67, task="train-model", component="comp-tr │
│ time="2024-02-08T00:43:52.960Z" level=info msg="sub-process exited" argo=true error="
List of images (not sure how to find the KF version?), as count followed by image:
44 docker.io/istio/proxyv2:1.17.5
1 docker.io/kubeflowkatib/katib-controller:v0.16.0-rc.1
1 docker.io/kubeflowkatib/katib-db-manager:v0.16.0-rc.1
1 docker.io/kubeflowkatib/katib-ui:v0.16.0-rc.1
1 docker.io/kubeflownotebookswg/centraldashboard:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/jupyter-web-app:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/kfam:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/notebook-controller:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/poddefaults-webhook:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/profile-controller:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/pvcviewer-controller:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/tensorboard-controller:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/tensorboards-web-app:v1.8.0-rc.0
1 docker.io/kubeflownotebookswg/volumes-web-app:v1.8.0-rc.0
1 docker.io/metacontrollerio/metacontroller:v2.0.4
2 gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
1 gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
1 gcr.io/ml-pipeline/api-server:2.0.1
1 gcr.io/ml-pipeline/cache-server:2.0.1
1 gcr.io/ml-pipeline/frontend:2.0.1
1 gcr.io/ml-pipeline/metadata-envoy:2.0.1
1 gcr.io/ml-pipeline/metadata-writer:2.0.1
1 gcr.io/ml-pipeline/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance
1 gcr.io/ml-pipeline/mysql:8.0.26
1 gcr.io/ml-pipeline/persistenceagent:2.0.1
1 gcr.io/ml-pipeline/scheduledworkflow:2.0.1
1 gcr.io/ml-pipeline/viewer-crd-controller:2.0.1
1 gcr.io/ml-pipeline/visualization-server:2.0.1
1 gcr.io/ml-pipeline/workflow-controller:v3.3.10-license-compliance
1 gcr.io/tfx-oss-public/ml_metadata_store_server:1.5.0
1 kserve/kserve-controller:v0.11.0
1 kserve/models-web-app:v0.10.0
1 kubeflow/training-operator:v1-855e096
1 mysql:8.0.29
1 python:3.7
@juliusvonkohout
What is the KFP version you deployed?
Hi @rimolive, thanks for checking in. I am not sure; that is a follow-on question I have :) I just git cloned the main branch and then used the manifest example install process. In the UI it shows "dev local" (I don't have a cluster up now, but it is something like that).
I did dump out the list of images, as per the last part of my original post. Hope that helps.
Thanks
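In case it helps anyone reading later, here is a rough sketch of how the image list could be used to guess the deployed KFP version. It assumes (this is not confirmed anywhere in this thread) that the tag of the `gcr.io/ml-pipeline/api-server` image tracks the KFP backend release. The helper names `image_tags` and `guess_kfp_version` are made up for illustration, and the sample input is a few lines copied from the list in the original post.

```python
# A few "<count> <image>" lines copied from the image list in the original post.
IMAGE_LIST = """\
44 docker.io/istio/proxyv2:1.17.5
1 gcr.io/ml-pipeline/api-server:2.0.1
1 gcr.io/ml-pipeline/frontend:2.0.1
1 gcr.io/ml-pipeline/workflow-controller:v3.3.10-license-compliance
1 python:3.7
"""


def image_tags(listing):
    """Map each image repository to its tag, ignoring the leading count."""
    tags = {}
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        # Split on the last colon so the registry path stays intact.
        repo, _, tag = parts[1].rpartition(":")
        if repo:
            tags[repo] = tag
    return tags


def guess_kfp_version(listing):
    """Hypothetical helper: assumes the api-server tag matches the KFP release."""
    return image_tags(listing).get("gcr.io/ml-pipeline/api-server")


if __name__ == "__main__":
    print("Likely KFP backend version:", guess_kfp_version(IMAGE_LIST))
```

This only reads tags off the pasted list, so it says nothing about whether the manifests were applied from a release branch or from main.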
I recommend following the installation documentation at https://www.kubeflow.org/docs/components/pipelines/v2/installation/quickstart/. Applying manifests from the main branch is meant for development only and is not recommended for production or testing.
Is there anything else you need about this issue?
Hi rimolive, I think we can close this for now. I have not been able to look at it further. Thanks
Sure, no worries!
/close
@rimolive: Closing this issue.
I don't have permissions to move the issue here.