kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[frontend] Error on Kubeflow Pipelines Dashboard: 'No Healthy Upstream', '503 UC upstream_reset_before_response_started{connection_termination}' #11260

Open ChaeSeungJi opened 3 days ago

ChaeSeungJi commented 3 days ago

Environment

How did you deploy Kubeflow Pipelines (KFP)?

KFP version: 1.19.1

Kubernetes version: [screenshot, not preserved]

Steps to reproduce

Expected result

[Two screenshots, not preserved]

Materials and Reference

/server/node_modules/node-fetch/lib/index.js:1491
    reject(new FetchError(`request to ${request.url} failed, reason: ${err.message}`, 'system', err));
    ^
FetchError: request to http://metadata/computeMetadata/v1/project/project-id failed, reason: getaddrinfo ENOTFOUND metadata
    at ClientRequest.<anonymous> (/server/node_modules/node-fetch/lib/index.js:1491:11)
    at ClientRequest.emit (node:events:517:28)
    at Socket.socketErrorListener (node:_http_client:501:9)
    at Socket.emit (node:events:517:28)
    at emitErrorNT (node:internal/streams/destroy:151:8)
    at emitErrorCloseNT (node:internal/streams/destroy:116:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  type: 'system',
  errno: 'ENOTFOUND',
  code: 'ENOTFOUND'
}

Node.js v18.18.2
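
The `getaddrinfo ENOTFOUND metadata` in the trace means the short hostname `metadata` did not resolve from inside the pod. The `computeMetadata/v1` path is the Google Compute Engine metadata API, so the lookup is expected to fail on a cluster that is not running on GCP. A minimal sketch for confirming the DNS side from a pod in the same namespace, assuming Python is available there (the helper name `resolves` is hypothetical and not part of KFP):

```python
import socket


def resolves(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False on a DNS lookup failure.

    `resolver` is injectable so the check can be exercised without a network.
    """
    try:
        resolver(host, 80)
        return True
    except socket.gaierror:
        # Same failure class that Node.js reports as ENOTFOUND.
        return False


# Example usage (run inside the cluster):
#   resolves("metadata")                                  # GCE metadata short name
#   resolves("ml-pipeline.kubeflow.svc.cluster.local")    # API service seen in the proxy log
```

If `metadata` does not resolve while the in-cluster service name does, cluster DNS is probably fine and the failing lookup is specific to the GCP metadata call.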


- ml-pipeline-ui(istio-proxy) logs

[2024-10-01T10:36:08.021Z] "GET /pipeline/static/css/main.e10b3034.css HTTP/1.1" 200 - via_upstream - "-" 0 14493 5 4 "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "58461348-0b7c-4a85-8fcf-72e0e2c78072" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:44173 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.015Z] "GET /pipeline/static/js/main.b980985e.js HTTP/1.1" 200 - via_upstream - "-" 0 4120994 26 8 "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "85419e0a-4f50-96c3-b4e9-8060a636d949" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:44035 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.327Z] "GET /apis/v1beta1/healthz HTTP/1.1" 200 - via_upstream - "-" 0 96 2 2 "-" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" "6cbd9bfd-af63-478d-8ddb-b252884570b1" "10.106.183.67:8888" "192.168.52.177:8888" outbound|8888||ml-pipeline.kubeflow.svc.cluster.local 192.168.217.73:37664 10.106.183.67:8888 192.168.217.73:55642 - default
[2024-10-01T10:36:08.312Z] "GET /pipeline/apis/v1beta1/healthz HTTP/1.1" 200 - via_upstream - "-" 0 291 17 17 "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "95641d97-c788-4b82-80ff-dec317fb7d8a" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:44035 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.327Z] "GET /pipeline/apis/v2beta1/pipelines?page_token=&page_size=10&sort_by=created_at%20desc&filter= HTTP/1.1" 200 - via_upstream - "-" 0 883 6 6 "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "fa1cc37a-6504-4044-be57-f4fc7283abba" "10.106.183.67:8888" "192.168.52.177:8888" outbound|8888||ml-pipeline.kubeflow.svc.cluster.local 192.168.217.73:47786 10.106.183.67:8888 192.168.64.246:0 - default
[2024-10-01T10:36:08.313Z] "GET /pipeline/apis/v2beta1/pipelines?page_token=&page_size=10&sort_by=created_at%20desc&filter= HTTP/1.1" 200 - via_upstream - "-" 0 883 21 20 "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "fa1cc37a-6504-4044-be57-f4fc7283abba" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:44173 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.363Z] "GET /pipeline/apis/v2beta1/pipelines?namespace=kubeflow-user-example-com&page_token=&page_size=10&sort_by=created_at%20desc&filter= HTTP/1.1" 200 - via_upstream - "-" 0 2 9 9 "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "1469974e-9638-4c8c-b485-a25948020125" "10.106.183.67:8888" "192.168.52.177:8888" outbound|8888||ml-pipeline.kubeflow.svc.cluster.local 192.168.217.73:47786 10.106.183.67:8888 192.168.64.246:0 - default
[2024-10-01T10:36:08.312Z] "GET /pipeline/system/project-id HTTP/1.1" 503 UC upstream_reset_before_response_started{connection_termination} - "-" 0 95 69 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "5215f428-d320-4cf6-b07f-5156818e6cd6" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:45291 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.313Z] "GET /pipeline/system/cluster-name HTTP/1.1" 503 UC upstream_reset_before_response_started{connection_termination} - "-" 0 95 68 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "551d34d9-3a65-456a-92a2-18331db6e9c2" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:49221 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.360Z] "GET /pipeline/apis/v2beta1/pipelines?namespace=kubeflow-user-example-com&page_token=&page_size=10&sort_by=created_at%20desc&filter= HTTP/1.1" 503 UC upstream_reset_before_response_started{connection_termination} - "-" 0 95 21 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "1469974e-9638-4c8c-b485-a25948020125" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| 127.0.0.6:44035 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.384Z] "GET /pipeline/system/project-id HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - "delayed_connect_error:_111" 0 152 0 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "5215f428-d320-4cf6-b07f-5156818e6cd6" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| - 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.390Z] "GET /pipeline/system/project-id HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - "delayed_connect_error:_111" 0 152 0 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "5215f428-d320-4cf6-b07f-5156818e6cd6" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| - 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.395Z] "GET /pipeline/system/cluster-name HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - "delayed_connect_error:_111" 0 152 0 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "551d34d9-3a65-456a-92a2-18331db6e9c2" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| - 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.402Z] "GET /pipeline/apis/v2beta1/pipelines?namespace=kubeflow-user-example-com&page_token=&page_size=10&sort_by=created_at%20desc&filter= HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - "delayed_connect_error:_111" 0 152 0 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "1469974e-9638-4c8c-b485-a25948020125" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| - 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.433Z] "GET /pipeline/system/cluster-name HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - "delayed_connect_error:_111" 0 152 0 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "551d34d9-3a65-456a-92a2-18331db6e9c2" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| - 192.168.217.73:3000 192.168.64.246:0 - default
[2024-10-01T10:36:08.441Z] "GET /pipeline/apis/v2beta1/pipelines?namespace=kubeflow-user-example-com&page_token=&page_size=10&sort_by=created_at%20desc&filter= HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - "delayed_connect_error:_111" 0 152 0 - "192.168.64.246" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" "1469974e-9638-4c8c-b485-a25948020125" "inl.test:32692" "192.168.217.73:3000" inbound|3000|| - 192.168.217.73:3000 192.168.64.246:0 - default
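
In Envoy access logs the response flag `UC` marks an upstream connection termination and `UF` an upstream connection failure; the `111` in `delayed_connect_error:_111` is the Linux errno for connection refused. The sequence of `UC` followed by `UF` on `inbound|3000` would be consistent with the UI container exiting mid-request and then refusing connections while it restarts, although the proxy log alone does not prove that. To see the pattern at a glance, here is a rough sketch that tallies entries by path, status, and flag. The regular expression only assumes the leading fields visible in the log above, and `tally` and `SAMPLE` are hypothetical names; the sample lines are trimmed copies of entries from this issue.

```python
import re
from collections import Counter

# Matches the leading fields of each entry:
#   [timestamp] "METHOD path PROTOCOL" status response_flags ...
ENTRY = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]+" '
    r'(?P<code>\d{3}) (?P<flags>\S+)'
)


def tally(log_text):
    """Count entries per (path without query string, status code, response flag)."""
    counts = Counter()
    for m in ENTRY.finditer(log_text):
        path = m.group("path").split("?", 1)[0]
        counts[(path, int(m.group("code")), m.group("flags"))] += 1
    return counts


# Trimmed entries from the log above, joined on one line as in the pasted output.
SAMPLE = (
    '[2024-10-01T10:36:08.363Z] "GET /pipeline/apis/v2beta1/pipelines?namespace=kubeflow-user-example-com&page_token= HTTP/1.1" 200 - via_upstream - '
    '[2024-10-01T10:36:08.312Z] "GET /pipeline/system/project-id HTTP/1.1" 503 UC upstream_reset_before_response_started{connection_termination} - '
    '[2024-10-01T10:36:08.384Z] "GET /pipeline/system/project-id HTTP/1.1" 503 UF upstream_reset_before_response_started{remote_connection_failure,delayed_connect_error:_111} - '
)

if __name__ == "__main__":
    for (path, code, flag), n in sorted(tally(SAMPLE).items()):
        print(f"{code} {flag:>2} x{n} {path}")
```

Feeding the full sidecar log into `tally` should make it easy to check whether every 503 lands on the inbound port 3000 listener, that is, on the UI container rather than on the `ml-pipeline` API service.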




---

### What I've Tried
- https://github.com/kubeflow/kubeflow/issues/5561
- https://github.com/kubeflow/pipelines/issues/4469
- https://github.com/kubeflow/kubeflow/issues/5271

Impacted by this bug? Give it a 👍. 
rimolive commented 1 day ago

@ChaeSeungJi We have just released https://github.com/kubeflow/manifests/releases/tag/v1.9.1-rc.2. Can you take a look and see if this release fixes your issue?