kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[bug] ml-pipeline pod is not up and running on Kubeflow 1.9.0 (using tag 1.9.0); please see the logs below #11090

Open ravi-gundavaram opened 3 months ago

ravi-gundavaram commented 3 months ago

ubuntu@ip-10-11-1-60:/eio-mlops-dev/manifests$ kubectl logs ml-pipeline-9cd449fd6-gvhtr -n kubeflow
I0811 18:25:14.884098 7 client_manager.go:170] Initializing client manager
I0811 18:25:14.884220 7 config.go:57] Config DBConfig.MySQLConfig.ExtraParams not specified, skipping
[mysql] 2024/08/11 18:25:14 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:14 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:14 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:15 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:15 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:15 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:16 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:16 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:16 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:17 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:17 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:17 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:18 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:18 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:18 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:21 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:21 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:21 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:27 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:27 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:27 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:34 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:34 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:34 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:43 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:43 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:43 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:57 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:57 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:25:57 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:26:22 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:26:22 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:26:22 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:26:37 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:26:37 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:26:37 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:27:24 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:27:24 packets.go:37: unexpected EOF
[mysql] 2024/08/11 18:27:24 packets.go:37: unexpected EOF

ubuntu@ip-10-11-1-60:/eio-mlops-dev/manifests$ kubectl get pods -n kubeflow
NAME                                                     READY   STATUS             RESTARTS          AGE
admission-webhook-deployment-6dfbf7c8c6-pvmdd            1/1     Running            0                 36h
cache-server-58f5d8c7d5-gmmc6                            2/2     Running            169 (9m35s ago)   36h
centraldashboard-d6f49bb67-7tr82                         2/2     Running            0                 36h
jupyter-web-app-deployment-bc78b48f8-ksh4m               2/2     Running            0                 36h
katib-controller-754877f9f-62hj8                         1/1     Running            0                 36h
katib-db-manager-78cf5894bd-fgtls                        1/1     Running            0                 31h
katib-mysql-8fc5ccdd4-jr44b                              1/1     Running            0                 32h
katib-ui-858f447bfb-zxd5q                                2/2     Running            0                 36h
kserve-controller-manager-b96c76496-gkjmn                2/2     Running            0                 36h
kserve-models-web-app-5d7d5857df-fcncr                   2/2     Running            0                 36h
kubeflow-pipelines-profile-controller-7795c68cfd-mk64r   1/1     Running            0                 36h
metacontroller-0                                         1/1     Running            0                 36h
metadata-db-7fd57b5949-5pxrf                             1/1     Running            0                 30h
metadata-envoy-deployment-74d7589fbd-frfmf               1/1     Running            0                 31h
metadata-grpc-deployment-c8c7f488c-f7f6f                 2/2     Running            1 (31h ago)       31h
metadata-writer-747b476588-tnjfx                         2/2     Running            0                 30h
minio-64c6fd469d-xrmcb                                   2/2     Running            0                 102m
ml-pipeline-9cd449fd6-gvhtr                              1/2     CrashLoopBackOff   13 (50s ago)      49m
ml-pipeline-persistenceagent-5679776bbf-gt79l            2/2     Running            0                 36h
ml-pipeline-scheduledworkflow-b6658c7c-b8tk7             2/2     Running            0                 36h
ml-pipeline-ui-577d569cc8-kgzwp                          2/2     Running            0                 36h
ml-pipeline-viewer-crd-56d7584db6-f5wfs                  2/2     Running            1 (36h ago)       36h
ml-pipeline-visualizationserver-5f57d6b9cf-jkhcs         2/2     Running            0                 36h
model-registry-db-7d88d94586-pfjqp                       1/1     Running            0                 32h
model-registry-deployment-7fc8896c9c-4k4mt               3/3     Running            0                 31h
mysql-577cfcd74f-pzw97                                   2/2     Running            0                 102m
notebook-controller-deployment-5d664bc767-5t5rw          2/2     Running            1 (36h ago)       36h
profiles-deployment-5f69674595-5pmvh                     3/3     Running            1 (36h ago)       36h
pvcviewer-controller-manager-5b8487cb6c-vq8hc            3/3     Running            0                 36h
tensorboard-controller-deployment-59d6c5dc44-szrcx       3/3     Running            1 (36h ago)       36h
tensorboards-web-app-deployment-5585954f8f-cjjqf         2/2     Running            0                 36h
training-operator-78f4df6758-pgt7b                       1/1     Running            0                 36h
volumes-web-app-deployment-756c7fb9c7-7nzdc              2/2     Running            0                 36h
workflow-controller-7c9c86b578-757db                     2/2     Running            1 (36h ago)       36h
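
In a listing this long, the unhealthy pods are easy to miss. A minimal sketch of a filter, assuming the default column layout of `kubectl get pods` (NAME, READY, STATUS, RESTARTS, AGE): the function name `unhealthy_pods` and the restart threshold of 10 are hypothetical choices made for illustration, not part of kubectl or Kubeflow.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# pods that are not fully ready, not in the Running state, or have restarted
# more often than a threshold (first argument, default 10; an assumption).
unhealthy_pods() {
  awk -v max="${1:-10}" 'NR > 1 {
    split($2, ready, "/")              # READY column, e.g. "1/2"
    if (ready[1] != ready[2] || $3 != "Running" || ($4 + 0) > (max + 0))
      print $1, $2, $3, "restarts=" $4
  }'
}

# Usage against a live cluster (namespace taken from the listing above):
#   kubectl get pods -n kubeflow | unhealthy_pods
```

Applied to the listing above, this would flag `ml-pipeline-9cd449fd6-gvhtr` (1/2, CrashLoopBackOff) and also `cache-server-58f5d8c7d5-gmmc6`, which reports Running but has restarted 169 times. Pods of finished Jobs (status Completed) would be flagged as well, so the filter is only a rough first pass.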

Environment

Steps to reproduce

Expected result

Materials and reference


ravi-gundavaram commented 3 months ago

Hi @vanpelt @dakl @neerfri, could anyone please help with the issue above? I am blocked, and my job is pending because of it.

rimolive commented 3 months ago

@ravi-gundavaram The ml-pipeline-9cd449fd6-gvhtr pod is in the CrashLoopBackOff state. What do you see in the logs?
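
For anyone in the same state, a first pass at collecting more than a single container's log might look like the sketch below. It assumes `kubectl` is already pointed at the affected cluster; the function name `pod_diagnostics` is hypothetical, while the kubectl subcommands and flags are standard ones.

```shell
# Hypothetical helper: gathers first-pass diagnostics for one pod.
# Arguments: namespace, pod name.
pod_diagnostics() {
  local ns="$1" pod="$2"
  # Container states, last termination reason and exit code, recent events.
  kubectl describe pod "$pod" -n "$ns"
  # Recent log lines from every container in the pod, sidecars included.
  kubectl logs "$pod" -n "$ns" --all-containers=true --tail=100
}

# Usage with pod names from this report:
#   pod_diagnostics kubeflow ml-pipeline-9cd449fd6-gvhtr
#   pod_diagnostics kubeflow mysql-577cfcd74f-pzw97
```

The second usage line is there because the repeated `[mysql] ... unexpected EOF` lines appear to come from the MySQL client driver inside the API server, which suggests (but does not prove) that the database connection is being closed early. Checking the `mysql` pod's own output is therefore a reasonable next step.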

ravi-gundavaram commented 3 months ago

Hi @ricardo

Could we connect once on a call, over Teams or Google Meet? I need the Kubeflow central dashboard admin UI.


github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

tppalani commented 1 month ago

Hi @ravi-gundavaram

Were you able to resolve the issue? If so, please share the RCA so others can use it.