GoogleCloudPlatform / kubeflow-distribution

Blueprints for Deploying Kubeflow on Google Cloud Platform and Anthos
Apache License 2.0

ASM upgrade fails #379

Closed pablofiumara closed 2 years ago

pablofiumara commented 2 years ago

Environment: GCP
Kubeflow version: 1.5.0

Starting from 1.5.0 files (https://github.com/kubeflow/gcp-blueprints/tree/v1.5.0), I updated them using the following two PRs:

https://github.com/kubeflow/gcp-blueprints/pull/369
https://github.com/kubeflow/gcp-blueprints/pull/374

The only change I made was to set the desired ASM version to 1.11.8-asm.4+config1 because of the following email:

image

Instead I got istio-1.14.3-asm.0, a newer version, and a 502 error. This error seems to be different from what happened here: https://github.com/kubeflow/gcp-blueprints/issues/371

If I install Kubeflow on GCP 1.5.1, I get the same error, but with the version defined in the asm Makefile (asm-1132-5).

What should I try? Thanks in advance

gkcalat commented 2 years ago

Hi @pablofiumara!

Thank you for reaching out. Could you please try deploying from the master branch on a new GKE cluster? To do so, please follow these instructions, but use the master branch:

git clone https://github.com/kubeflow/gcp-blueprints.git 
cd gcp-blueprints
git checkout master   # instead of tags/v1.5.1 -b v1.5.1

If the error persists, please check the troubleshooting guide and share your setup of the load balancer. In addition, please share your deployment logs (printed in the terminal during deployment) and list your workloads and services on the cluster.

pablofiumara commented 2 years ago

Hi @gkcalat !

Thanks for your answer.

Deploying from master branch on a new GKE cluster works. However, my main goal is to upgrade an existing Kubeflow v1.5 cluster.

If I clone the master branch, fill in the details of my existing cluster and execute make apply, I get the following error:

resource mapping not found for name: "myClusterName-kfp" namespace: "myProjectId" from "build/sql.cnrm.cloud.google.com_v1beta1_sqlinstancemyClusterName-kfp.yaml": no matches for kind "SQLInstance" in version "sql.cnrm.cloud.google.com/v1beta1"
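
For reference, this message means the API server in the current kubectl context has no registered resource for that kind and version. A minimal sketch for deriving the CRD name to look up, assuming the plural is simply the lowercase kind plus "s" (a naming assumption, not confirmed in this thread); crd_name is a hypothetical helper:

```shell
# Hypothetical helper: build "<plural>.<group>" from a kind and an API group.
# Assumption: the plural is the lowercase kind with a trailing "s".
crd_name() {
  kind_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf '%s.%s\n' "${kind_lower}s" "$2"
}

# Kind and group taken from the error message above.
crd_name SQLInstance sql.cnrm.cloud.google.com
```

The printed name can then be passed to kubectl get crd against the cluster that make apply targets; if the CRD is not found there, resources of that kind cannot be created in that context.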

Below is my setup. What should I do? Thanks in advance

NAMESPACE  NAME  READY  STATUS  RESTARTS  AGE
asm-system  pod/canonical-service-controller-manager-7f7888d9b6-9s7zl  2/2  Running  0  18h
cert-manager  pod/cert-manager-76b7c557d5-hlzl9  1/1  Running  1  20h
cert-manager  pod/cert-manager-cainjector-655d695d74-89wbc  1/1  Running  5  20h
cert-manager  pod/cert-manager-webhook-7955b9bb97-jhgb4  1/1  Running  0  20h
gke-connect  pod/gke-connect-agent-20220812-00-00-74cbccfdb5-twfbd  1/1  Running  0  18h
istio-system  pod/backend-updater-0  1/1  Running  0  6m2s
istio-system  pod/iap-enabler-79c5d47bd5-t89qx  1/1  Running  0  6m3s
istio-system  pod/istio-ingressgateway-7c7dc89796-dg75z  1/1  Running  0  18h
istio-system  pod/istio-ingressgateway-7c7dc89796-wgq2j  1/1  Running  0  18h
istio-system  pod/istiod-asm-1104-6-7bd95dfb49-6f4ff  1/1  Running  0  20h
istio-system  pod/istiod-asm-1104-6-7bd95dfb49-gp9jw  1/1  Running  1  20h
istio-system  pod/istiod-asm-1143-0-fc4484f6-6tk7q  1/1  Running  0  18h
istio-system  pod/istiod-asm-1143-0-fc4484f6-8tfxd  1/1  Running  0  18h
istio-system  pod/whoami-app-845c7fdc4-wkqjz  1/1  Running  0  20h
knative-serving  pod/activator-7f8dbbddcc-c5gz9  2/2  Running  0  3m42s
knative-serving  pod/autoscaler-857c95cc87-gkghm  2/2  Running  0  3m42s
knative-serving  pod/controller-55b99cfccb-hqxgw  2/2  Running  0  3m41s
knative-serving  pod/domain-mapping-7574b467-x486z  2/2  Running  0  3m41s
knative-serving  pod/domainmapping-webhook-596987656d-wpztr  2/2  Running  0  3m40s
knative-serving  pod/istio-webhook-5f876d5c85-ksfs5  2/2  Running  0  20h
knative-serving  pod/net-istio-controller-86b67bc8-t2l47  1/1  Running  0  3m40s
knative-serving  pod/net-istio-webhook-65fb676674-jxcfn  2/2  Running  0  3m39s
knative-serving  pod/networking-istio-6bbc6b9664-fnmfk  1/1  Running  0  20h
knative-serving  pod/webhook-667899979f-4cf25  2/2  Running  0  3m39s
kube-system  pod/event-exporter-gke-5479fd58c8-7bdft  2/2  Running  0  20h
kube-system  pod/fluentbit-gke-g55sw  2/2  Running  0  20h
kube-system  pod/fluentbit-gke-l7s5v  2/2  Running  0  20h
kube-system  pod/gke-metadata-server-d9qnj  1/1  Running  0  20h
kube-system  pod/gke-metadata-server-nh4sd  1/1  Running  0  20h
kube-system  pod/gke-metrics-agent-nb586  1/1  Running  0  20h
kube-system  pod/gke-metrics-agent-wx8lw  1/1  Running  0  20h
kube-system  pod/konnectivity-agent-5894655fb4-5srbk  1/1  Running  0  20h
kube-system  pod/konnectivity-agent-5894655fb4-nnkvv  1/1  Running  0  20h
kube-system  pod/konnectivity-agent-autoscaler-6b86f667c9-hjlvn  1/1  Running  0  20h
kube-system  pod/kube-dns-697dc8fc8b-m6w8p  4/4  Running  0  20h
kube-system  pod/kube-dns-697dc8fc8b-n4bql  4/4  Running  0  20h
kube-system  pod/kube-dns-autoscaler-844c9d9448-s5d8j  1/1  Running  0  20h
kube-system  pod/kube-proxy-gke-testkube15-default-pool-3c94fb57-0gc6  1/1  Running  0  20h
kube-system  pod/kube-proxy-gke-testkube15-default-pool-3c94fb57-hqz3  1/1  Running  0  20h
kube-system  pod/l7-default-backend-69fb9fd9f9-2r2mb  1/1  Running  0  20h
kube-system  pod/metrics-server-v0.4.5-fb4c49dd6-nbltw  2/2  Running  0  20h
kube-system  pod/netd-d4qqs  1/1  Running  0  20h
kube-system  pod/netd-jr9d5  1/1  Running  0  20h
kube-system  pod/pdcsi-node-mrqmp  2/2  Running  0  20h
kube-system  pod/pdcsi-node-ssfvm  2/2  Running  0  20h
kubeflow  pod/admission-webhook-deployment-74bbdfd87f-n282j  1/1  Running  0  40m
kubeflow  pod/cache-deployer-deployment-54d977b449-blxhf  2/2  Running  2  39m
kubeflow  pod/cache-server-787cf54ddf-hbgkw  2/2  Running  0  40m
kubeflow  pod/centraldashboard-7449b76fc7-7gkjq  2/2  Running  0  40m
kubeflow  pod/cloud-endpoints-controller-665f44d5d6-zn9n7  1/1  Running  0  40m
kubeflow  pod/cloudsqlproxy-66b6f98fc-89nsf  2/2  Running  1  40m
kubeflow  pod/jupyter-web-app-deployment-555bff79c-hn5tz  1/1  Running  0  40m
kubeflow  pod/katib-controller-887877bfb-hlzdn  1/1  Running  0  40m
kubeflow  pod/katib-db-manager-59bb8589f5-sdzmv  1/1  Running  0  40m
kubeflow  pod/katib-mysql-6677d78f6-xc29l  1/1  Running  0  39m
kubeflow  pod/katib-ui-7767f98488-f4q27  1/1  Running  0  40m
kubeflow  pod/kserve-controller-manager-0  2/2  Running  1  20h
kubeflow  pod/kserve-models-web-app-598655747f-tx6pr  2/2  Running  0  40m
kubeflow  pod/kubeflow-pipelines-profile-controller-75c4ffb445-gmfl8  1/1  Running  0  40m
kubeflow  pod/metacontroller-0  1/1  Running  0  20h
kubeflow  pod/metadata-envoy-deployment-55cf867444-5t9cq  1/1  Running  0  40m
kubeflow  pod/metadata-grpc-deployment-65bfdcf5df-kfzkn  2/2  Running  2  40m
kubeflow  pod/metadata-writer-86fc7c484-rslwq  2/2  Running  0  40m
kubeflow  pod/minio-6fc74885d7-slkgd  2/2  Running  0  39m
kubeflow  pod/ml-pipeline-5c97d959b4-9vwq6  2/2  Running  2  40m
kubeflow  pod/ml-pipeline-persistenceagent-54fbb6797c-5pcjh  2/2  Running  0  40m
kubeflow  pod/ml-pipeline-scheduledworkflow-7f87d84b4b-rxsrw  2/2  Running  0  40m
kubeflow  pod/ml-pipeline-ui-67db999c46-8lmsl  2/2  Running  0  40m
kubeflow  pod/ml-pipeline-viewer-crd-75c677c87c-j872t  2/2  Running  1  40m
kubeflow  pod/ml-pipeline-visualizationserver-8c9cf74d6-7t6fn  2/2  Running  0  40m
kubeflow  pod/notebook-controller-deployment-54fc8cb45c-kwkvz  2/2  Running  1  40m
kubeflow  pod/profiles-deployment-594cdc9679-8bmpq  3/3  Running  2  40m
kubeflow  pod/tensorboard-controller-controller-manager-d65bb67b7-l9g6w  3/3  Running  2  40m
kubeflow  pod/tensorboards-web-app-deployment-c5bfbfbdb-jsm8f  1/1  Running  0  40m
kubeflow  pod/training-operator-66d78b99c8-fdlfn  1/1  Running  0  40m
kubeflow  pod/volumes-web-app-deployment-f44fbd8cf-gbkrs  1/1  Running  0  40m
kubeflow  pod/workflow-controller-66b6447f66-7gwtj  2/2  Running  2  40m
main  pod/conditional-execution-pipeline-with-exit-handler-nggq2-2341636377  0/2  Completed  0  54m
main  pod/conditional-execution-pipeline-with-exit-handler-nggq2-2411149408  0/2  Completed  0  54m
main  pod/conditional-execution-pipeline-with-exit-handler-nggq2-3648953784  0/2  Completed  0  55m
main  pod/conditional-execution-pipeline-with-exit-handler-nggq2-3746918827  0/2  Completed  0  54m
main  pod/conditional-execution-pipeline-with-exit-handler-nggq2-456702869  0/2  Error  0  54m
main  pod/ml-pipeline-ui-artifact-76c487b7c5-xk95v  2/2  Running  0  49m
main  pod/ml-pipeline-visualizationserver-755f46cd7c-p8ksf  2/2  Running  0  49m

NAMESPACE  NAME  TYPE  CLUSTER-IP  EXTERNAL-IP  PORT(S)  AGE
asm-system  service/canonical-service-controller-manager-metrics-service  ClusterIP  10.76.13.148  <none>  8443/TCP  20h
cert-manager  service/cert-manager  ClusterIP  10.76.4.50  <none>  9402/TCP  20h
cert-manager  service/cert-manager-webhook  ClusterIP  10.76.4.168  <none>  443/TCP  20h
default  service/kubernetes  ClusterIP  10.76.0.1  <none>  443/TCP  20h
gke-connect  service/gke-connect-monitoring  ClusterIP  10.76.8.89  <none>  8080/TCP  18h
istio-system  service/istio-ingressgateway  NodePort  10.76.12.252  <none>  15021:31223/TCP,80:31224/TCP,443:30620/TCP,15012:32691/TCP,15443:31407/TCP  20h
istio-system  service/istiod  ClusterIP  10.76.12.173  <none>  15010/TCP,15012/TCP,443/TCP,15014/TCP  20h
istio-system  service/istiod-asm-1104-6  ClusterIP  10.76.1.239  <none>  15010/TCP,15012/TCP,443/TCP,15014/TCP  20h
istio-system  service/istiod-asm-1143-0  ClusterIP  10.76.7.55  <none>  15010/TCP,15012/TCP,443/TCP,15014/TCP  18h
istio-system  service/knative-local-gateway  ClusterIP  10.76.7.88  <none>  80/TCP  20h
istio-system  service/whoami-app  ClusterIP  10.76.8.114  <none>  80/TCP  20h
knative-serving  service/activator-service  ClusterIP  10.76.9.157  <none>  9090/TCP,8008/TCP,80/TCP,81/TCP  20h
knative-serving  service/autoscaler  ClusterIP  10.76.7.186  <none>  9090/TCP,8008/TCP,8080/TCP  20h
knative-serving  service/autoscaler-bucket-00-of-01  ClusterIP  10.76.6.63  <none>  8080/TCP  20h
knative-serving  service/controller  ClusterIP  10.76.2.1  <none>  9090/TCP,8008/TCP  20h
knative-serving  service/domainmapping-webhook  ClusterIP  10.76.1.161  <none>  9090/TCP,8008/TCP,443/TCP  3m27s
knative-serving  service/istio-webhook  ClusterIP  10.76.6.243  <none>  9090/TCP,8008/TCP,443/TCP  20h
knative-serving  service/net-istio-webhook  ClusterIP  10.76.2.213  <none>  9090/TCP,8008/TCP,443/TCP  3m26s
knative-serving  service/webhook  ClusterIP  10.76.11.222  <none>  9090/TCP,8008/TCP,443/TCP  20h
kube-system  service/default-http-backend  NodePort  10.76.12.81  <none>  80:31168/TCP  20h
kube-system  service/kube-dns  ClusterIP  10.76.0.10  <none>  53/UDP,53/TCP  20h
kube-system  service/metrics-server  ClusterIP  10.76.8.124  <none>  443/TCP  20h
kubeflow  service/admission-webhook-service  ClusterIP  10.76.8.126  <none>  443/TCP  20h
kubeflow  service/cache-server  ClusterIP  10.76.9.185  <none>  443/TCP  20h
kubeflow  service/centraldashboard  ClusterIP  10.76.9.200  <none>  80/TCP  20h
kubeflow  service/cloud-endpoints-controller  ClusterIP  10.76.8.27  <none>  80/TCP  20h
kubeflow  service/jupyter-web-app-service  ClusterIP  10.76.1.34  <none>  80/TCP  20h
kubeflow  service/katib-controller  ClusterIP  10.76.7.106  <none>  443/TCP,8080/TCP  20h
kubeflow  service/katib-db-manager  ClusterIP  10.76.0.53  <none>  6789/TCP  20h
kubeflow  service/katib-mysql  ClusterIP  10.76.7.4  <none>  3306/TCP  20h
kubeflow  service/katib-ui  ClusterIP  10.76.6.207  <none>  80/TCP  20h
kubeflow  service/kserve-controller-manager-metrics-service  ClusterIP  10.76.13.234  <none>  8443/TCP  20h
kubeflow  service/kserve-controller-manager-service  ClusterIP  10.76.5.226  <none>  443/TCP  20h
kubeflow  service/kserve-models-web-app  ClusterIP  10.76.2.140  <none>  80/TCP  20h
kubeflow  service/kserve-webhook-server-service  ClusterIP  10.76.9.168  <none>  443/TCP  20h
kubeflow  service/kubeflow-pipelines-profile-controller  ClusterIP  10.76.4.17  <none>  80/TCP  20h
kubeflow  service/metadata-envoy-service  ClusterIP  10.76.10.108  <none>  9090/TCP  20h
kubeflow  service/metadata-grpc-service  ClusterIP  10.76.6.233  <none>  8080/TCP  20h
kubeflow  service/minio-service  ClusterIP  10.76.15.21  <none>  9000/TCP  20h
kubeflow  service/ml-pipeline  ClusterIP  10.76.13.140  <none>  8888/TCP,8887/TCP  20h
kubeflow  service/ml-pipeline-ui  ClusterIP  10.76.10.183  <none>  80/TCP  20h
kubeflow  service/ml-pipeline-visualizationserver  ClusterIP  10.76.0.199  <none>  8888/TCP  20h
kubeflow  service/mysql  ClusterIP  10.76.5.144  <none>  3306/TCP  20h
kubeflow  service/notebook-controller-service  ClusterIP  10.76.0.66  <none>  443/TCP  20h
kubeflow  service/profiles-kfam  ClusterIP  10.76.0.95  <none>  8081/TCP  20h
kubeflow  service/tensorboard-controller-controller-manager-metrics-service  ClusterIP  10.76.5.151  <none>  8443/TCP  20h
kubeflow  service/tensorboards-web-app-service  ClusterIP  10.76.6.143  <none>  80/TCP  20h
kubeflow  service/training-operator  ClusterIP  10.76.6.122  <none>  8080/TCP  20h
kubeflow  service/volumes-web-app-service  ClusterIP  10.76.13.74  <none>  80/TCP  20h
kubeflow  service/workflow-controller-metrics  ClusterIP  10.76.9.170  <none>  9090/TCP  20h
main  service/ml-pipeline-ui-artifact  ClusterIP  10.76.7.144  <none>  80/TCP  19h
main  service/ml-pipeline-visualizationserver  ClusterIP  10.76.0.170  <none>  8888/TCP  19h

NAMESPACE  NAME  DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
kube-system  daemonset.apps/fluentbit-gke  2  2  2  2  2  kubernetes.io/os=linux  20h
kube-system  daemonset.apps/gke-metadata-server  2  2  2  2  2  beta.kubernetes.io/os=linux,iam.gke.io/gke-metadata-server-enabled=true  20h
kube-system  daemonset.apps/gke-metrics-agent  2  2  2  2  2  <none>  20h
kube-system  daemonset.apps/gke-metrics-agent-scaling-10  0  0  0  0  0  <none>  20h
kube-system  daemonset.apps/gke-metrics-agent-scaling-20  0  0  0  0  0  <none>  20h
kube-system  daemonset.apps/gke-metrics-agent-windows  0  0  0  0  0  kubernetes.io/os=windows  20h
kube-system  daemonset.apps/kube-proxy  0  0  0  0  0  kubernetes.io/os=linux,node.kubernetes.io/kube-proxy-ds-ready=true  20h
kube-system  daemonset.apps/metadata-proxy-v0.1  0  0  0  0  0  cloud.google.com/metadata-proxy-ready=true,kubernetes.io/os=linux  20h
kube-system  daemonset.apps/netd  2  2  2  2  2  cloud.google.com/gke-netd-ready=true,kubernetes.io/os=linux  20h
kube-system  daemonset.apps/nvidia-gpu-device-plugin  0  0  0  0  0  <none>  20h
kube-system  daemonset.apps/pdcsi-node  2  2  2  2  2  kubernetes.io/os=linux  20h
kube-system  daemonset.apps/pdcsi-node-windows  0  0  0  0  0  kubernetes.io/os=windows  20h

NAMESPACE  NAME  READY  UP-TO-DATE  AVAILABLE  AGE
asm-system  deployment.apps/canonical-service-controller-manager  1/1  1  1  20h
cert-manager  deployment.apps/cert-manager  1/1  1  1  20h
cert-manager  deployment.apps/cert-manager-cainjector  1/1  1  1  20h
cert-manager  deployment.apps/cert-manager-webhook  1/1  1  1  20h
gke-connect  deployment.apps/gke-connect-agent-20220812-00-00  1/1  1  1  18h
istio-system  deployment.apps/iap-enabler  1/1  1  1  6m4s
istio-system  deployment.apps/istio-ingressgateway  2/2  2  2  20h
istio-system  deployment.apps/istiod-asm-1104-6  2/2  2  2  20h
istio-system  deployment.apps/istiod-asm-1143-0  2/2  2  2  18h
istio-system  deployment.apps/whoami-app  1/1  1  1  20h
knative-serving  deployment.apps/activator  1/1  1  1  20h
knative-serving  deployment.apps/autoscaler  1/1  1  1  20h
knative-serving  deployment.apps/controller  1/1  1  1  20h
knative-serving  deployment.apps/domain-mapping  1/1  1  1  3m42s
knative-serving  deployment.apps/domainmapping-webhook  1/1  1  1  3m41s
knative-serving  deployment.apps/istio-webhook  1/1  1  1  20h
knative-serving  deployment.apps/net-istio-controller  1/1  1  1  3m41s
knative-serving  deployment.apps/net-istio-webhook  1/1  1  1  3m40s
knative-serving  deployment.apps/networking-istio  1/1  1  1  20h
knative-serving  deployment.apps/webhook  1/1  1  1  20h
kube-system  deployment.apps/event-exporter-gke  1/1  1  1  20h
kube-system  deployment.apps/konnectivity-agent  2/2  2  2  20h
kube-system  deployment.apps/konnectivity-agent-autoscaler  1/1  1  1  20h
kube-system  deployment.apps/kube-dns  2/2  2  2  20h
kube-system  deployment.apps/kube-dns-autoscaler  1/1  1  1  20h
kube-system  deployment.apps/l7-default-backend  1/1  1  1  20h
kube-system  deployment.apps/metrics-server-v0.4.5  1/1  1  1  20h
kubeflow  deployment.apps/admission-webhook-deployment  1/1  1  1  20h
kubeflow  deployment.apps/cache-deployer-deployment  1/1  1  1  20h
kubeflow  deployment.apps/cache-server  1/1  1  1  20h
kubeflow  deployment.apps/centraldashboard  1/1  1  1  20h
kubeflow  deployment.apps/cloud-endpoints-controller  1/1  1  1  20h
kubeflow  deployment.apps/cloudsqlproxy  1/1  1  1  20h
kubeflow  deployment.apps/jupyter-web-app-deployment  1/1  1  1  20h
kubeflow  deployment.apps/katib-controller  1/1  1  1  20h
kubeflow  deployment.apps/katib-db-manager  1/1  1  1  20h
kubeflow  deployment.apps/katib-mysql  1/1  1  1  20h
kubeflow  deployment.apps/katib-ui  1/1  1  1  20h
kubeflow  deployment.apps/kserve-models-web-app  1/1  1  1  20h
kubeflow  deployment.apps/kubeflow-pipelines-profile-controller  1/1  1  1  20h
kubeflow  deployment.apps/metadata-envoy-deployment  1/1  1  1  20h
kubeflow  deployment.apps/metadata-grpc-deployment  1/1  1  1  20h
kubeflow  deployment.apps/metadata-writer  1/1  1  1  20h
kubeflow  deployment.apps/minio  1/1  1  1  20h
kubeflow  deployment.apps/ml-pipeline  1/1  1  1  20h
kubeflow  deployment.apps/ml-pipeline-persistenceagent  1/1  1  1  20h
kubeflow  deployment.apps/ml-pipeline-scheduledworkflow  1/1  1  1  20h
kubeflow  deployment.apps/ml-pipeline-ui  1/1  1  1  20h
kubeflow  deployment.apps/ml-pipeline-viewer-crd  1/1  1  1  20h
kubeflow  deployment.apps/ml-pipeline-visualizationserver  1/1  1  1  20h
kubeflow  deployment.apps/notebook-controller-deployment  1/1  1  1  20h
kubeflow  deployment.apps/profiles-deployment  1/1  1  1  20h
kubeflow  deployment.apps/tensorboard-controller-controller-manager  1/1  1  1  20h
kubeflow  deployment.apps/tensorboards-web-app-deployment  1/1  1  1  20h
kubeflow  deployment.apps/training-operator  1/1  1  1  20h
kubeflow  deployment.apps/volumes-web-app-deployment  1/1  1  1  20h
kubeflow  deployment.apps/workflow-controller  1/1  1  1  20h
main  deployment.apps/ml-pipeline-ui-artifact  1/1  1  1  19h
main  deployment.apps/ml-pipeline-visualizationserver  1/1  1  1  19h

NAMESPACE  NAME  DESIRED  CURRENT  READY  AGE
asm-system  replicaset.apps/canonical-service-controller-manager-67c8f5fff5  0  0  0  20h
asm-system  replicaset.apps/canonical-service-controller-manager-7f7888d9b6  1  1  1  18h
cert-manager  replicaset.apps/cert-manager-76b7c557d5  1  1  1  20h
cert-manager  replicaset.apps/cert-manager-cainjector-655d695d74  1  1  1  20h
cert-manager  replicaset.apps/cert-manager-webhook-7955b9bb97  1  1  1  20h
gke-connect  replicaset.apps/gke-connect-agent-20220812-00-00-74cbccfdb5  1  1  1  18h
istio-system  replicaset.apps/iap-enabler-79c5d47bd5  1  1  1  6m4s
istio-system  replicaset.apps/istio-ingressgateway-7c7dc89796  2  2  2  18h
istio-system  replicaset.apps/istio-ingressgateway-c975c7f47  0  0  0  20h
istio-system  replicaset.apps/istiod-asm-1104-6-7bd95dfb49  2  2  2  20h
istio-system  replicaset.apps/istiod-asm-1143-0-fc4484f6  2  2  2  18h
istio-system  replicaset.apps/whoami-app-845c7fdc4  1  1  1  20h
knative-serving  replicaset.apps/activator-5754c5ff55  0  0  0  20h
knative-serving  replicaset.apps/activator-7f8dbbddcc  1  1  1  3m43s
knative-serving  replicaset.apps/autoscaler-58fc8d57d5  0  0  0  20h
knative-serving  replicaset.apps/autoscaler-857c95cc87  1  1  1  3m43s
knative-serving  replicaset.apps/controller-55b99cfccb  1  1  1  3m42s
knative-serving  replicaset.apps/controller-7bf7955dbf  0  0  0  20h
knative-serving  replicaset.apps/domain-mapping-7574b467  1  1  1  3m42s
knative-serving  replicaset.apps/domainmapping-webhook-596987656d  1  1  1  3m41s
knative-serving  replicaset.apps/istio-webhook-5f876d5c85  1  1  1  20h
knative-serving  replicaset.apps/net-istio-controller-86b67bc8  1  1  1  3m41s
knative-serving  replicaset.apps/net-istio-webhook-65fb676674  1  1  1  3m40s
knative-serving  replicaset.apps/networking-istio-6bbc6b9664  1  1  1  20h
knative-serving  replicaset.apps/webhook-667899979f  1  1  1  3m40s
knative-serving  replicaset.apps/webhook-6946b99875  0  0  0  20h
kube-system  replicaset.apps/event-exporter-gke-5479fd58c8  1  1  1  20h
kube-system  replicaset.apps/konnectivity-agent-5894655fb4  2  2  2  20h
kube-system  replicaset.apps/konnectivity-agent-autoscaler-6b86f667c9  1  1  1  20h
kube-system  replicaset.apps/kube-dns-697dc8fc8b  2  2  2  20h
kube-system  replicaset.apps/kube-dns-autoscaler-844c9d9448  1  1  1  20h
kube-system  replicaset.apps/l7-default-backend-69fb9fd9f9  1  1  1  20h
kube-system  replicaset.apps/metrics-server-v0.4.5-f449bcfcd  0  0  0  20h
kube-system  replicaset.apps/metrics-server-v0.4.5-fb4c49dd6  1  1  1  20h
kubeflow  replicaset.apps/admission-webhook-deployment-74bbdfd87f  1  1  1  40m
kubeflow  replicaset.apps/admission-webhook-deployment-7df7558c67  0  0  0  20h
kubeflow  replicaset.apps/cache-deployer-deployment-54d85b57d  0  0  0  20h
kubeflow  replicaset.apps/cache-deployer-deployment-54d977b449  1  1  1  39m
kubeflow  replicaset.apps/cache-server-787cf54ddf  1  1  1  40m
kubeflow  replicaset.apps/cache-server-7c6bdf66d  0  0  0  20h
kubeflow  replicaset.apps/centraldashboard-649484bbf7  0  0  0  20h
kubeflow  replicaset.apps/centraldashboard-7449b76fc7  1  1  1  40m
kubeflow  replicaset.apps/cloud-endpoints-controller-665f44d5d6  1  1  1  40m
kubeflow  replicaset.apps/cloud-endpoints-controller-778598bb4b  0  0  0  20h
kubeflow  replicaset.apps/cloudsqlproxy-66b6f98fc  1  1  1  40m
kubeflow  replicaset.apps/cloudsqlproxy-dfc75bcf8  0  0  0  20h
kubeflow  replicaset.apps/jupyter-web-app-deployment-555bff79c  1  1  1  40m
kubeflow  replicaset.apps/jupyter-web-app-deployment-f5c5b7785  0  0  0  20h
kubeflow  replicaset.apps/katib-controller-58ddb4b856  0  0  0  20h
kubeflow  replicaset.apps/katib-controller-887877bfb  1  1  1  40m
kubeflow  replicaset.apps/katib-db-manager-59bb8589f5  1  1  1  40m
kubeflow  replicaset.apps/katib-db-manager-d77c6757f  0  0  0  20h
kubeflow  replicaset.apps/katib-mysql-6677d78f6  1  1  1  39m
kubeflow  replicaset.apps/katib-mysql-7894994f88  0  0  0  20h
kubeflow  replicaset.apps/katib-ui-7767f98488  1  1  1  40m
kubeflow  replicaset.apps/katib-ui-f787b9d88  0  0  0  20h
kubeflow  replicaset.apps/kserve-models-web-app-598655747f  1  1  1  40m
kubeflow  replicaset.apps/kserve-models-web-app-5cf4f7bbbc  0  0  0  20h
kubeflow  replicaset.apps/kubeflow-pipelines-profile-controller-75c4ffb445  1  1  1  40m
kubeflow  replicaset.apps/kubeflow-pipelines-profile-controller-7cd769d5c7  0  0  0  20h
kubeflow  replicaset.apps/metadata-envoy-deployment-55cf867444  1  1  1  40m
kubeflow  replicaset.apps/metadata-envoy-deployment-688dbc54b8  0  0  0  20h
kubeflow  replicaset.apps/metadata-grpc-deployment-65bfdcf5df  1  1  1  40m
kubeflow  replicaset.apps/metadata-grpc-deployment-7875bcdd58  0  0  0  20h
kubeflow  replicaset.apps/metadata-writer-7d4f598847  0  0  0  20h
kubeflow  replicaset.apps/metadata-writer-86fc7c484  1  1  1  40m
kubeflow  replicaset.apps/minio-6fc74885d7  1  1  1  39m
kubeflow  replicaset.apps/minio-8bf5bf9fb  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-5c97d959b4  1  1  1  40m
kubeflow  replicaset.apps/ml-pipeline-667795bf7b  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-persistenceagent-54fbb6797c  1  1  1  40m
kubeflow  replicaset.apps/ml-pipeline-persistenceagent-dc75c5885  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-scheduledworkflow-695c8ff7c8  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-scheduledworkflow-7f87d84b4b  1  1  1  40m
kubeflow  replicaset.apps/ml-pipeline-ui-67db999c46  1  1  1  40m
kubeflow  replicaset.apps/ml-pipeline-ui-d9b7c55c4  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-viewer-crd-685978bb49  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-viewer-crd-75c677c87c  1  1  1  40m
kubeflow  replicaset.apps/ml-pipeline-visualizationserver-59b6d459f4  0  0  0  20h
kubeflow  replicaset.apps/ml-pipeline-visualizationserver-8c9cf74d6  1  1  1  40m
kubeflow  replicaset.apps/notebook-controller-deployment-54fc8cb45c  1  1  1  40m
kubeflow  replicaset.apps/notebook-controller-deployment-644b9476b4  0  0  0  20h
kubeflow  replicaset.apps/profiles-deployment-594cdc9679  1  1  1  40m
kubeflow  replicaset.apps/profiles-deployment-5b75d6cd55  0  0  0  20h
kubeflow  replicaset.apps/profiles-deployment-9d4d8dc55  0  0  0  79m
kubeflow  replicaset.apps/tensorboard-controller-controller-manager-6848cb6846  0  0  0  20h
kubeflow  replicaset.apps/tensorboard-controller-controller-manager-d65bb67b7  1  1  1  40m
kubeflow  replicaset.apps/tensorboards-web-app-deployment-6d9b97fcf8  0  0  0  20h
kubeflow  replicaset.apps/tensorboards-web-app-deployment-c5bfbfbdb  1  1  1  40m
kubeflow  replicaset.apps/training-operator-66d78b99c8  1  1  1  40m
kubeflow  replicaset.apps/training-operator-6bfc7b8d86  0  0  0  20h
kubeflow  replicaset.apps/volumes-web-app-deployment-597887dd67  0  0  0  20h
kubeflow  replicaset.apps/volumes-web-app-deployment-f44fbd8cf  1  1  1  40m
kubeflow  replicaset.apps/workflow-controller-66b6447f66  1  1  1  40m
kubeflow  replicaset.apps/workflow-controller-68cdd68b77  0  0  0  20h
main  replicaset.apps/ml-pipeline-ui-artifact-76c487b7c5  1  1  1  49m
main  replicaset.apps/ml-pipeline-ui-artifact-d57bd98d7  0  0  0  19h
main  replicaset.apps/ml-pipeline-visualizationserver-65f5bfb4bf  0  0  0  19h
main  replicaset.apps/ml-pipeline-visualizationserver-755f46cd7c  1  1  1  49m

NAMESPACE  NAME  READY  AGE
istio-system  statefulset.apps/backend-updater  1/1  6m3s
kubeflow  statefulset.apps/kserve-controller-manager  1/1  20h
kubeflow  statefulset.apps/metacontroller  1/1  20h

NAMESPACE  NAME  REFERENCE  TARGETS  MINPODS  MAXPODS  REPLICAS  AGE
istio-system  horizontalpodautoscaler.autoscaling/istio-ingressgateway  Deployment/istio-ingressgateway  6%/80%  2  5  2  20h
istio-system  horizontalpodautoscaler.autoscaling/istiod-asm-1104-6  Deployment/istiod-asm-1104-6  1%/80%  2  5  2  20h
istio-system  horizontalpodautoscaler.autoscaling/istiod-asm-1143-0  Deployment/istiod-asm-1143-0  1%/80%  2  5  2  18h
knative-serving  horizontalpodautoscaler.autoscaling/activator  Deployment/activator  1%/100%  1  20  1  20h
knative-serving  horizontalpodautoscaler.autoscaling/webhook  Deployment/webhook  13%/100%  1  5  1  20h

pablofiumara commented 2 years ago

More information

If I execute:

kubectl --namespace=istio-system get ingress envoy-ingress -o 'jsonpath={.metadata.annotations.ingress\.kubernetes\.io/backends}'

I get:

{"k8s-be-31224--3b6f3f13cf074301":"UNHEALTHY"}
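
The annotation value is a small JSON object mapping each load balancer backend to its health state. A minimal sketch for listing only the unhealthy backends from such a value, assuming the compact one-line format shown above; unhealthy_backends is a hypothetical helper, and the sample input is copied from the output above:

```shell
# Hypothetical helper: read the annotation JSON on stdin and print the
# names of backends whose state is UNHEALTHY, one per line.
unhealthy_backends() {
  grep -o '"[^"]*":"UNHEALTHY"' | cut -d'"' -f2
}

# Sample value copied from the annotation output in this thread.
printf '%s\n' '{"k8s-be-31224--3b6f3f13cf074301":"UNHEALTHY"}' | unhealthy_backends
```

In the backend name, 31224 matches the NodePort that the istio-ingressgateway service listing in this thread shows for port 80.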

pablofiumara commented 2 years ago

More information

If I execute

kubectl describe service/istio-ingressgateway -n istio-system

I get:

Name:                     istio-ingressgateway
Namespace:                istio-system
Labels:                   app=istio-ingressgateway
                          install.operator.istio.io/owning-resource=unknown
                          install.operator.istio.io/owning-resource-namespace=istio-system
                          istio=ingressgateway
                          istio.io/rev=asm-1143-0
                          operator.istio.io/component=IngressGateways
                          operator.istio.io/managed=Reconcile
                          operator.istio.io/version=1.14.3-asm.0
                          release=istio
Annotations:              backendlock:
                          cloud.google.com/backend-config: {"ports": {"http2":"iap-backendconfig"}}
                          cloud.google.com/neg: {"ingress":false}
Selector:                 app=istio-ingressgateway,istio=ingressgateway,service.istio.io/canonical-revision=asm-1104-6
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.76.12.252
IPs:                      10.76.12.252
Port:                     status-port  15021/TCP
TargetPort:               15021/TCP
NodePort:                 status-port  31223/TCP
Endpoints:                <none>
Port:                     http2  80/TCP
TargetPort:               8080/TCP
NodePort:                 http2  31224/TCP
Endpoints:                <none>
Port:                     https  443/TCP
TargetPort:               8443/TCP
NodePort:                 https  30620/TCP
Endpoints:                <none>
Port:                     tcp-istiod  15012/TCP
TargetPort:               15012/TCP
NodePort:                 tcp-istiod  32691/TCP
Endpoints:                <none>
Port:                     tls  15443/TCP
TargetPort:               15443/TCP
NodePort:                 tls  31407/TCP
Endpoints:                <none>
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

It is weird to see the old version there (canonical-revision=asm-1104-6)

pablofiumara commented 2 years ago

Although I executed the following command, the old ASM version is still there:

kubectl patch service -n istio-system istio-ingressgateway --type='json' -p='[{"op": "replace", "path": "/spec/selector/service.istio.io~1canonical-revision", "value": "REVISION"}]'

pablofiumara commented 2 years ago

I see a difference: canonical-revision vs ~1canonical-revision
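
For context, the two spellings refer to the same key. In a JSON Patch path, segments are separated by "/", so a "/" inside a key name is written as "~1" (and a literal "~" as "~0"), following the JSON Pointer rules in RFC 6901. A small sketch of that escaping; json_pointer_escape is a hypothetical helper name:

```shell
# Hypothetical helper: escape one key for use as a JSON Pointer segment.
# "~" must be escaped first, otherwise the "~" in "~1" would be escaped again.
json_pointer_escape() {
  printf '%s\n' "$1" | sed -e 's/~/~0/g' -e 's#/#~1#g'
}

# Selector key from the service shown earlier in this thread.
json_pointer_escape 'service.istio.io/canonical-revision'
# prints: service.istio.io~1canonical-revision
```

So the path /spec/selector/service.istio.io~1canonical-revision addresses the selector key service.istio.io/canonical-revision.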

gkcalat commented 2 years ago

Thank you for the details. It seems that the Load Balancer's Health Check marks your backend as UNHEALTHY. You can try the following:

kubectl patch backendconfig iap-backendconfig -n istio-system --type json -p '[{"op": "replace", "path": "/spec/healthCheck/port", "value": 31223}]'
kubectl patch backendconfig iap-backendconfig -n istio-system --type json -p '[{"op": "replace", "path": "/spec/healthCheck/requestPath", "value": "/healthz/ready"}]'
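
The port 31223 used here matches the NodePort that the istio-ingressgateway service listing earlier in this thread shows for status-port 15021 (15021:31223/TCP). A sketch for reading that mapping out of a PORT(S) string instead of hard-coding it; node_port_for is a hypothetical helper, and the sample string is copied from the thread:

```shell
# Hypothetical helper: given a PORT(S) value such as
# "15021:31223/TCP,80:31224/TCP" and a service port, print its NodePort.
node_port_for() {
  printf '%s\n' "$1" | tr ',' '\n' | awk -F'[:/]' -v p="$2" '$1 == p { print $2 }'
}

ports='15021:31223/TCP,80:31224/TCP,443:30620/TCP,15012:32691/TCP,15443:31407/TCP'
node_port_for "$ports" 15021
# prints: 31223
```

NodePorts are assigned per cluster, so on another cluster the value for 15021 will likely differ from 31223.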

If that does not resolve the 502 error, please follow these instructions.

Regarding the ASM upgrade: we have guidelines here.

  1. First, you will need to change the upstream folder for the istio component in pull-upstream.sh. For example, you can copy it from the master branch.
  2. Then, run make clean-build && bash pull-upstream.sh from the kubeflow directory to clean your build folders and pull the new upstream for ASM.
  3. Then follow the guidelines I mentioned above.

Please pay attention to how you deploy and redeploy workloads on your cluster, and take a look at the official ASM guidelines on how to upgrade it.

pablofiumara commented 2 years ago

@gkcalat Thank you for your answer. I executed those two commands and it helped. However, after I execute the following command, I get a 502 error again:

kubectl label namespace myNamespaceName istio.io/rev=asm-1143-0 istio-injection- --overwrite && kubectl rollout restart deployment -n myNamespaceName
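
The revision label used here appears to follow from the ASM version string: in this thread, version 1.14.3-asm.0 goes with revision asm-1143-0. A sketch of that mapping, assuming the pattern holds in general (it is inferred from the names in this thread, not taken from ASM documentation); asm_revision_label is a hypothetical helper:

```shell
# Hypothetical helper: turn "MAJOR.MINOR.PATCH-asm.N" into "asm-MAJORMINORPATCH-N".
# Anything after N (for example "+config1") is dropped.
asm_revision_label() {
  printf '%s\n' "$1" | sed -E 's/^([0-9]+)\.([0-9]+)\.([0-9]+)-asm\.([0-9]+).*$/asm-\1\2\3-\4/'
}

asm_revision_label 1.14.3-asm.0
# prints: asm-1143-0
```

The label actually in use on a cluster can be confirmed from the istio.io/rev label on the istiod deployments rather than derived.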

If I execute

kubectl describe pod ml-pipeline-ui-artifact-786b4cdfcf-mz9vm -n main

I get this:

Warning Unhealthy 16m kubelet Readiness probe failed: Get "http://10.80.1.51:15021/healthz/ready": dial tcp 10.80.1.51:15021: connect: connection refused

What should I do? Thanks in advance

pablofiumara commented 2 years ago

I had to execute the following command again

kubectl patch backendconfig iap-backendconfig -n istio-system --type json -p '[{"op": "replace", "path": "/spec/healthCheck/requestPath", "value": "/healthz/ready"}]'

Then I executed

kubectl delete pods --all -n myNamespaceName

And that error went away. However, I am still getting a 502. If I execute:

kubectl --namespace=istio-system describe ingress envoy-ingress

I get:

Name:             envoy-ingress
Labels:           kustomize.component=iap-ingress
Namespace:        istio-system
Address:          publicIPOfMyCluster
Ingress Class:    <none>
Default backend:  istio-ingressgateway:80 (<none>)
Rules:
  Host                                             Path  Backends
  ----                                             ----  --------
 myClusterName.endpoints.myProject.cloud.goog  
                                                   /*   istio-ingressgateway:80 (<none>)
Annotations:                                       ingress.gcp.kubernetes.io/pre-shared-cert: mcrt-6e542d1f-1587-4776-9093-e86d6f320bac
                                                   ingress.kubernetes.io/backends: {"k8s-be-31224--0acfa44a2a499c94":"UNHEALTHY"}
                                                   ingress.kubernetes.io/https-forwarding-rule: k8s2-fs-j6jr5lif-istio-system-envoy-ingress-y9pvup9s
                                                   ingress.kubernetes.io/https-target-proxy: k8s2-ts-j6jr5lif-istio-system-envoy-ingress-y9pvup9s
                                                   ingress.kubernetes.io/ssl-cert: mcrt-6e542d1f-1587-4776-9093-e86d6f320bac
                                                   ingress.kubernetes.io/url-map: k8s2-um-j6jr5lif-istio-system-envoy-ingress-y9pvup9s
                                                   kubernetes.io/ingress.allow-http: false
                                                   kubernetes.io/ingress.global-static-ip-name: testkubev15-ip
                                                   networking.gke.io/managed-certificates: gke-certificate
Events:
  Type    Reason  Age                    From                     Message
  ----    ------  ----                   ----                     -------
  Normal  Sync    7m43s (x22 over 128m)  loadbalancer-controller  Scheduled for sync

Backend seems unhealthy again

If I execute

kubectl describe backendconfig -A

I get

Name:         iap-backendconfig
Namespace:    istio-system
Labels:       kustomize.component=iap-ingress
Annotations:  <none>
API Version:  cloud.google.com/v1
Kind:         BackendConfig
Metadata:
  Creation Timestamp:  2022-08-17T19:23:52Z
  Generation:          5
  Managed Fields:
    API Version:  cloud.google.com/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:labels:
          .:
          f:kustomize.component:
      f:spec:
        .:
        f:healthCheck:
          .:
          f:checkIntervalSec:
          f:healthyThreshold:
          f:port:
          f:timeoutSec:
          f:type:
          f:unhealthyThreshold:
        f:iap:
          .:
          f:enabled:
          f:oauthclientCredentials:
            .:
            f:secretName:
        f:timeoutSec:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2022-08-17T19:23:52Z
    API Version:  cloud.google.com/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:healthCheck:
          f:requestPath:
    Manager:         kubectl-patch
    Operation:       Update
    Time:            2022-08-17T21:28:05Z
  Resource Version:  141928
  UID:               88e486e3-c9ff-4640-8360-a5a75617acec
Spec:
  Health Check:
    Check Interval Sec:   2
    Healthy Threshold:    1
    Port:                 31223
    Request Path:         /healthz/ready
    Timeout Sec:          1
    Type:                 HTTP
    Unhealthy Threshold:  10
  Iap:
    Enabled:  true
    Oauthclient Credentials:
      Secret Name:  kubeflow-oauth
  Timeout Sec:      3600
Events:             <none>

What am I missing?
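
One thing that may be worth checking is whether the health check in the BackendConfig points at the port the ingress gateway Service exposes on the nodes. The following is a minimal sketch of that comparison, not an official tool. It assumes the health check is meant to target the NodePort of the gateway's status port. The function name health_check_matches is made up for illustration, and the kubectl lines in the comments are only one possible way to read the values from a live cluster.

```shell
# Hypothetical helper: compare BackendConfig health check settings with
# the values expected for the ingress gateway.
# Arguments: <BackendConfig port> <Service NodePort> <BackendConfig path>
health_check_matches() {
  bc_port="$1"
  node_port="$2"
  bc_path="$3"
  if [ "$bc_port" != "$node_port" ]; then
    echo "port mismatch: BackendConfig=$bc_port NodePort=$node_port"
    return 1
  fi
  if [ "$bc_path" != "/healthz/ready" ]; then
    echo "unexpected path: $bc_path"
    return 1
  fi
  echo "health check matches"
}

# Possible way to read the inputs from a live cluster (not executed here):
#   bc_port=$(kubectl get backendconfig iap-backendconfig -n istio-system \
#     -o jsonpath='{.spec.healthCheck.port}')
#   bc_path=$(kubectl get backendconfig iap-backendconfig -n istio-system \
#     -o jsonpath='{.spec.healthCheck.requestPath}')
#   node_port=$(kubectl get service istio-ingressgateway -n istio-system \
#     -o jsonpath='{.spec.ports[?(@.name=="status-port")].nodePort}')

# BackendConfig port and path are the values from the output above;
# the NodePort value is assumed to be the same for this example.
health_check_matches 31223 31223 /healthz/ready
```

If the function reports a mismatch, the BackendConfig is the first place to look; if it reports a match, the 502 probably has another cause.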

pablofiumara commented 2 years ago

I executed kubectl rollout restart deployment -n kubeflow, but it still fails. I also made the following changes and executed make apply, but it still fails: https://github.com/kubeflow/gcp-blueprints/pull/374

gkcalat commented 2 years ago

@pablofiumara

I suggest that you test the following procedure on a new GKE cluster before running it on your production cluster.

  1. Deploy Kubeflow v1.5.0 with an older ASM on a new cluster by following these instructions. Set the necessary environment variables as shown in kubeflow/env.sh. Pay attention to KF_PROJECT, KF_PROJECT_NUMBER, MGMT_NAME, KF_NAME, and ADMIN_EMAIL. Before running on your production cluster, test this on a new cluster by setting KF_NAME to a new value (e.g. testkf15asm114). After testing, you can delete this cluster.

  2. Run the following in terminal:

    mkdir testingasm114 && cd testingasm114
    git clone git@github.com:kubeflow/gcp-blueprints kf15asm114
    cd kf15asm114/kubeflow
    git checkout v1.5.0
    git pull origin v1.5.0
    git cherry-pick f09f44a0e2b07ff81e8992c6e1757ee07e084b23  # creates kserve subfolder
    git cherry-pick 698eca8f1221eba006919a59e0bf11dcf804d54d  # adjusts RequestAuthentication policy creation
    git cherry-pick bdfb0d5406f7bd675636946fbf6ba7d97d015573  # upgrades ASM to 1.13
    git cherry-pick dcc4051d916890d17994e75ebacd33a831786231  # fixes HealthCheck issue
    export ASM_LABEL=asm-1143-0

    This will create a new folder with Kubeflow v1.5.0 and apply the necessary changes to upgrade ASM to v1.13.

  3. Change line 13 and line 14 of kubeflow/asm/Makefile as follows:

    ASM_PACKAGE_VERSION=1.14.3-asm.0+config1
    ASMCLI_SCRIPT_VERSION=asmcli_1.14.3-asm.0-config1

    This will set ASM to v1.14.3. You can try setting to an alternative version.

  4. Run the following inside kubeflow directory:

    bash pull-upstream.sh
    bash kpt-set.sh
    make apply-kcc && make apply

    This will re-deploy Kubeflow with the upgraded ASM to the new cluster defined in KF_NAME and set all components to use the new ASM resource.

  5. Then run the following to see if the new ASM is working properly. Also test your Kubeflow cluster (e.g. run a test pipeline).

    kubectl get pod -n istio-system -L istio.io/rev

  6. (Optional) If you believe that everything is OK, you can delete the old ASM (assuming you had asm-1104-6):

    kubectl delete deploy -l app=istio-ingressgateway,istio.io/rev=1104-6 -n istio-system --ignore-not-found=true
    kubectl delete Service,Deployment,HorizontalPodAutoscaler,PodDisruptionBudget istiod-asm-1104-6 -n istio-system --ignore-not-found=true
    kubectl delete IstioOperator installed-state-1104-6 -n istio-system

    Note that some of these may return not-found errors or warnings.

  7. Verify and test the Kubeflow cluster again.

  8. Once tested, you can delete the new cluster.

  9. If you believe that this will not break anything in your production cluster, go through steps 2-7 on your production cluster. For this, change KF_NAME in the kubeflow/env.sh and run source env.sh inside kubeflow directory before step 2.
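
The ASM_LABEL exported in step 2 and the ASM_PACKAGE_VERSION set in step 3 need to describe the same release. Judging only from the pairs that appear in this thread (1.14.3-asm.0 with asm-1143-0, and asm-1104-6 for the older install), the label looks like the version with the dots removed. The sketch below encodes that observed pattern. The function name asm_label_from_version is hypothetical and the rule is an inference from these examples, not a documented API.

```shell
# Hypothetical helper: derive a revision label such as "asm-1143-0"
# from a package version such as "1.14.3-asm.0+config1".
# The naming rule is inferred from the examples in this thread.
asm_label_from_version() {
  v="${1%%+*}"            # drop an optional "+configN" suffix
  semver="${v%%-asm.*}"   # "1.14.3"
  build="${v##*-asm.}"    # "0"
  printf 'asm-%s-%s\n' "$(printf '%s' "$semver" | tr -d '.')" "$build"
}

# Compare the derived label with the one exported in step 2.
ASM_PACKAGE_VERSION="1.14.3-asm.0+config1"
ASM_LABEL="asm-1143-0"
if [ "$(asm_label_from_version "$ASM_PACKAGE_VERSION")" = "$ASM_LABEL" ]; then
  echo "label and package version agree"
else
  echo "label and package version differ"
fi
```

Running a check like this before make apply is a cheap way to catch a label that still points at a previous ASM version.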

I hope this helps :)

pablofiumara commented 2 years ago

@gkcalat Thank you very much for your answer. I am getting the same error. What else can we try? Every pod is running, health check path and port are correct but I am getting error 502. Thanks in advance

gkcalat commented 2 years ago

@pablofiumara

As I mentioned earlier, you should check these instructions.

You can also try deleting backend-updater and iap-enabler. It might be easier in the GCP console: choose Kubernetes Engine -> Workloads, filter the workloads by cluster name, select backend-updater and iap-enabler, and click the delete button. Then go through steps 2-4 from my previous comment. If everything is OK with your cluster, you can try removing the old ASM (it seems that you have istiod-asm-1104-6 running).

pablofiumara commented 2 years ago

@gkcalat Thanks again. There's something I don't understand about those instructions. It says:

Check that the path is /healthz and corresponds to the path of the readiness probe on the envoy pods

If I execute

kubectl describe pod ml-pipeline-ui-artifact -n main

I get

Warning Unhealthy 35m (x2 over 35m) kubelet Readiness probe failed: Get "http://oneIp:15021/healthz/ready": dial tcp oneIp:15021: connect: connection refused

Should it be http://oneIp:15021/healthz? If so, can you explain how to change it, please? This does not work: https://github.com/kubeflow/gcp-blueprints/issues/379#issuecomment-1218556145

Thanks in advance

pablofiumara commented 2 years ago

If I execute

kubectl logs ml-pipeline-ui-artifact- -n main

I get


{
  argo: {
    archiveArtifactory: 'minio',
    archiveBucketName: 'mlpipeline',
    archiveLogs: false,
    archivePrefix: 'logs'
  },
  artifacts: 'Artifacts config contains credentials, so it is omitted',
  metadata: { envoyService: { host: 'localhost', port: '9090' } },
  pipeline: { host: 'localhost', port: '3001' },
  server: {
    apiVersionPrefix: 'apis/v1beta1',
    basePath: '/pipeline',
    deployment: 'NOT_SPECIFIED',
    hideSideNav: false,
    port: 3000,
    staticDir: '/client'
  },
  viewer: {
    tensorboard: {
      podTemplateSpec: undefined,
      tfImageName: 'tensorflow/tensorflow'
    }
  },
  visualizations: { allowCustomVisualizations: false },
  gkeMetadata: { disabled: false },
  auth: {
    enabled: false,
    kubeflowUserIdHeader: 'x-goog-authenticated-user-email',
    kubeflowUserIdPrefix: 'accounts.google.com:'
  }
}
[HPM] Proxy created: /  ->  http://localhost:9090
[HPM] Proxy created: /  ->  http://127.0.0.1
[HPM] Subscribed to http-proxy events:  [ 'error', 'close' ]
[HPM] Proxy created: /  ->  http://127.0.0.1
[HPM] Subscribed to http-proxy events:  [ 'error', 'close' ]
[HPM] Proxy created: /  ->  http://localhost:3001
[HPM] Subscribed to http-proxy events:  [ 'proxyReq', 'error', 'close' ]
[HPM] Proxy created: /  ->  http://localhost:3001
[HPM] Subscribed to http-proxy events:  [ 'proxyReq', 'error', 'close' ]
(node:1) Warning: Accessing non-existent property 'cat' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
(node:1) Warning: Accessing non-existent property 'cd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'chmod' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'cp' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'dirs' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'pushd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'popd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'echo' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'tempdir' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'pwd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'exec' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'ls' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'find' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'grep' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'head' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'ln' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'mkdir' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'rm' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'mv' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'sed' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'set' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'sort' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'tail' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'test' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'to' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'toEnd' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'touch' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'uniq' of module exports inside circular dependency
(node:1) Warning: Accessing non-existent property 'which' of module exports inside circular dependency
Server listening at http://localhost:3000

I think this might be the cause of the 502 error. Why do you think this is happening?

pablofiumara commented 2 years ago

If I execute

kubectl describe service istio-ingressgateway -n istio-system

I get

Name:                     istio-ingressgateway
Namespace:                istio-system
Labels:                   app=istio-ingressgateway
                          install.operator.istio.io/owning-resource=unknown
                          install.operator.istio.io/owning-resource-namespace=istio-system
                          istio=ingressgateway
                          istio.io/rev=asm-1143-0
                          operator.istio.io/component=IngressGateways
                          operator.istio.io/managed=Reconcile
                          operator.istio.io/version=1.14.3-asm.0
                          release=istio
Annotations:              backendlock: 
                          cloud.google.com/backend-config: {"ports": {"http2":"iap-backendconfig"}}
                          cloud.google.com/neg: {"ingress":false}
Selector:                 app=istio-ingressgateway,istio=ingressgateway,service.istio.io/canonical-revision=asm-1143-0
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       oneIp
IPs:                      oneIp
Port:                     status-port  15021/TCP
TargetPort:               15021/TCP
NodePort:                 status-port  31223/TCP
Endpoints:                <none>
Port:                     http2  80/TCP
TargetPort:               8080/TCP
NodePort:                 http2  31224/TCP
Endpoints:                <none>
Port:                     https  443/TCP
TargetPort:               8443/TCP
NodePort:                 https  31980/TCP
Endpoints:                <none>
Port:                     tcp-istiod  15012/TCP
TargetPort:               15012/TCP
NodePort:                 tcp-istiod  31277/TCP
Endpoints:                <none>
Port:                     tls  15443/TCP
TargetPort:               15443/TCP
NodePort:                 tls  30616/TCP
Endpoints:                <none>
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
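
The output above shows Endpoints: <none> for every port, which normally means that no ready pod carries all the labels listed in the Service's Selector. As a rough illustration of how that matching works, here is a small sketch in plain shell. The function name selector_matches and the example pod labels are hypothetical; on a real cluster the labels could be read with kubectl get pods -n istio-system --show-labels.

```shell
# Hypothetical helper: succeed only if every key=value pair of a Service
# selector is present in a pod's labels (both given as comma-separated lists).
selector_matches() {
  selector="$1"
  labels=",$2,"
  old_ifs="$IFS"
  IFS=','
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;   # this selector term is satisfied
      *)
        IFS="$old_ifs"
        echo "missing: $pair"
        return 1
        ;;
    esac
  done
  IFS="$old_ifs"
  echo "match"
}

# Selector copied from the Service output above.
selector="app=istio-ingressgateway,istio=ingressgateway,service.istio.io/canonical-revision=asm-1143-0"
# Assumed labels of a gateway pod that still belongs to an older revision.
pod_labels="app=istio-ingressgateway,istio=ingressgateway,service.istio.io/canonical-revision=asm-1104-6"

if ! selector_matches "$selector" "$pod_labels"; then
  echo "this pod would not be listed as an endpoint"
fi
```

If the gateway pods do not carry the canonical-revision label the selector asks for, the load balancer has no backend to reach, which would be consistent with a 502.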

gkcalat commented 2 years ago

> @gkcalat Thanks again. There's something I don't understand about those instructions. It says:
>
> Check that the path is /healthz and corresponds to the path of the readiness probe on the envoy pods
>
> If I execute
>
> kubectl describe pod ml-pipeline-ui-artifact -n main
>
> I get
>
> Warning Unhealthy 35m (x2 over 35m) kubelet Readiness probe failed: Get "http://oneIp:15021/healthz/ready": dial tcp oneIp:15021: connect: connection refused
>
> Should it be http://oneIp:15021/healthz? If so, can you explain how to change it, please? This does not work: #379 (comment)
>
> Thanks in advance

Let's take a step back. You mentioned earlier that setting port and path for the Health Check helped to resolve the issue. That means that the problem was in the Health Check.

I see that you have ASM 1.14 running on the cluster. Have you removed the older ASM? If not, delete it with:

kubectl delete Service,Deployment,HorizontalPodAutoscaler,PodDisruptionBudget istiod-asm-1104-6 -n istio-system --ignore-not-found=true

Then delete the backend-updater and iap-enabler workloads in the GCP console. These two are supposed to be deleted at the end of make apply, but somehow they are still running on your cluster. Note that they may come with a suffix, e.g. backend-updater-0.
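
If you prefer the terminal to the console, a small filter can show whether anything with those names is still around. This is only a sketch: the function name leftover_workloads is made up, and it simply matches the two names mentioned above in whatever listing is piped into it.

```shell
# Hypothetical helper: keep only lines that mention backend-updater or
# iap-enabler (with or without a suffix). Prints nothing if none are found.
leftover_workloads() {
  grep -E 'backend-updater|iap-enabler' || true
}

# Example with a made-up listing; on a live cluster the input could be:
#   kubectl get pods --all-namespaces | leftover_workloads
printf '%s\n' \
  'istio-system   istio-ingressgateway-86b848cfdf-9pcdf   1/1   Running' \
  'istio-system   backend-updater-0                       1/1   Running' \
  | leftover_workloads
```

An empty result suggests both workloads are already gone.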

Finally, update the backend config:

kubectl patch backendconfig iap-backendconfig -n istio-system --type json -p '[{"op": "replace", "path": "/spec/healthCheck/port", "value": 31223}]'
kubectl patch backendconfig iap-backendconfig -n istio-system --type json -p '[{"op": "replace", "path": "/spec/healthCheck/requestPath", "value": "/healthz/ready"}]'

Once done, give it some time and try opening your endpoint through https (not http).

pablofiumara commented 2 years ago

@gkcalat Thank you very much for your new answer. I have just executed:

kubectl delete Service,Deployment,HorizontalPodAutoscaler,PodDisruptionBudget istiod-asm-1104-6 -n istio-system --ignore-not-found=true

Then I made sure backend-updater and iap-enabler are deleted.

Then I updated the backend config by executing those two commands again (I got the message backendconfig.cloud.google.com/iap-backendconfig patched (no change)).

I am still getting the same error when opening my endpoint through https. I started getting this error (https://github.com/kubeflow/gcp-blueprints/issues/379#issuecomment-1220058759) again after I executed kubectl patch service -n istio-system istio-ingressgateway --type='json' -p='[{"op": "replace", "path": "/spec/selector/service.istio.io~1canonical-revision", "value": "asm-1143-0"}]', as the official documentation says (https://cloud.google.com/service-mesh/docs/unified-install/upgrade#deploying_and_redeploying_workloads).
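
A side note on the patch path used above: in a JSON patch, the path is a JSON Pointer, so a "/" inside a key has to be written as "~1" (and a literal "~" as "~0"). That is why the label key service.istio.io/canonical-revision appears as service.istio.io~1canonical-revision. The sketch below shows the escaping on its own; the function name json_pointer_escape is hypothetical.

```shell
# Hypothetical helper: escape a key for use as one JSON Pointer segment.
# "~" must be replaced first, otherwise the "~" in "~1" would be escaped again.
json_pointer_escape() {
  printf '%s' "$1" | sed -e 's/~/~0/g' -e 's#/#~1#g'
}

key="service.istio.io/canonical-revision"
echo "/spec/selector/$(json_pointer_escape "$key")"
# prints /spec/selector/service.istio.io~1canonical-revision
```

The escaping only affects how the key is addressed; the label itself keeps its "/" on the cluster.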

What am I missing?

This is my current setup:

NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE
asm-system   pod/canonical-service-controller-manager-7f7888d9b6-cztst   2/2   Running   0   14h
cert-manager   pod/cert-manager-76b7c557d5-x27bx   1/1   Running   0   14h
cert-manager   pod/cert-manager-cainjector-655d695d74-qkkr2   1/1   Running   0   14h
cert-manager   pod/cert-manager-webhook-7955b9bb97-s4c8f   1/1   Running   0   14h
gke-connect   pod/gke-connect-agent-20220812-00-00-79dd5945b9-v6bzt   1/1   Running   0   14h
istio-system   pod/istio-ingressgateway-86b848cfdf-9pcdf   1/1   Running   0   8m9s
istio-system   pod/istio-ingressgateway-86b848cfdf-x75dv   1/1   Running   0   8m9s
istio-system   pod/istiod-asm-1143-0-8c89475d8-6nsd4   1/1   Running   0   8m8s
istio-system   pod/istiod-asm-1143-0-8c89475d8-h9jd4   1/1   Running   0   8m8s
istio-system   pod/whoami-app-7f6f84d4fd-qjfr6   1/1   Running   0   8m8s
knative-serving   pod/activator-6b5c86d468-h9hdv   2/2   Running   0   8m15s
knative-serving   pod/autoscaler-7f67bb5d9d-gnxg4   2/2   Running   0   8m14s
knative-serving   pod/controller-7c6996c98b-xjz48   2/2   Running   0   8m14s
knative-serving   pod/istio-webhook-7cdd88b69d-nfj4r   2/2   Running   0   8m14s
knative-serving   pod/networking-istio-7bf99bf67f-wgwz8   1/1   Running   0   8m14s
knative-serving   pod/webhook-7995d94ff4-r469m   2/2   Running   0   8m14s
kube-system   pod/event-exporter-gke-5479fd58c8-qxkzg   2/2   Running   0   14h
kube-system   pod/fluentbit-gke-495zt   2/2   Running   0   109m
kube-system   pod/fluentbit-gke-cg9rn   2/2   Running   0   109m
kube-system   pod/gke-metadata-server-4c6x8   1/1   Running   0   109m
kube-system   pod/gke-metadata-server-nxjjq   1/1   Running   0   109m
kube-system   pod/gke-metrics-agent-5nkwg   1/1   Running   0   109m
kube-system   pod/gke-metrics-agent-w9srm   1/1   Running   0   109m
kube-system   pod/konnectivity-agent-65d98fb675-5kw49   1/1   Running   0   14h
kube-system   pod/konnectivity-agent-65d98fb675-pmf6r   1/1   Running   0   14h
kube-system   pod/konnectivity-agent-autoscaler-6b86f667c9-msp4v   1/1   Running   0   14h
kube-system   pod/kube-dns-697dc8fc8b-5pdn2   4/4   Running   0   14h
kube-system   pod/kube-dns-697dc8fc8b-lpb4z   4/4   Running   0   14h
kube-system   pod/kube-dns-autoscaler-844c9d9448-thn2r   1/1   Running   0   14h
kube-system   pod/kube-proxy-gke-testkube15core-default-pool-262e531b-3k12   1/1   Running   0   109m
kube-system   pod/kube-proxy-gke-testkube15core-default-pool-262e531b-pbht   1/1   Running   0   108m
kube-system   pod/l7-default-backend-69fb9fd9f9-w2jh6   1/1   Running   0   14h
kube-system   pod/metrics-server-v0.4.5-fb4c49dd6-zlhnj   2/2   Running   0   14h
kube-system   pod/netd-svxs8   1/1   Running   0   109m
kube-system   pod/netd-w9l45   1/1   Running   0   109m
kube-system   pod/pdcsi-node-2d2kh   2/2   Running   0   109m
kube-system   pod/pdcsi-node-kl2nt   2/2   Running   0   109m
kubeflow   pod/admission-webhook-deployment-5546f5c4d8-mgmdn   1/1   Running   0   8m30s
kubeflow   pod/cache-deployer-deployment-56c9b7c4df-knfk5   2/2   Running   2   7m48s
kubeflow   pod/cache-server-6556bc85c5-snmxf   2/2   Running   0   8m30s
kubeflow   pod/centraldashboard-8c6c44664-5ggl7   2/2   Running   0   8m30s
kubeflow   pod/cloud-endpoints-controller-66fd7b74f-mdn2g   1/1   Running   0   8m30s
kubeflow   pod/cloudsqlproxy-59487dbf4-9qkdg   2/2   Running   1   8m29s
kubeflow   pod/jupyter-web-app-deployment-979bbdc74-h8qrj   1/1   Running   0   8m29s
kubeflow   pod/katib-controller-88d7d6947-4nzzb   1/1   Running   0   8m29s
kubeflow   pod/katib-db-manager-596f4c456d-kphhq   1/1   Running   0   8m29s
kubeflow   pod/katib-mysql-6bd9c6f66f-prqml   1/1   Running   0   8m14s
kubeflow   pod/katib-ui-575dc68dcc-2mnls   1/1   Running   0   8m29s
kubeflow   pod/kserve-controller-manager-0   2/2   Running   0   96m
kubeflow   pod/kserve-models-web-app-677459969d-sfdc9   2/2   Running   0   8m28s
kubeflow   pod/kubeflow-pipelines-profile-controller-7b997798f4-nsdnq   1/1   Running   0   8m28s
kubeflow   pod/metacontroller-0   1/1   Running   0   96m
kubeflow   pod/metadata-envoy-deployment-59bd74556b-wjz58   1/1   Running   0   8m28s
kubeflow   pod/metadata-grpc-deployment-5cbf4744fb-mpftb   2/2   Running   2   8m28s
kubeflow   pod/metadata-writer-6bd6475c7d-bbq7v   2/2   Running   0   8m27s
kubeflow   pod/minio-58b88c9986-wzhcv   2/2   Running   0   8m16s
kubeflow   pod/ml-pipeline-5fd5667684-8wkz9   2/2   Running   2   8m27s
kubeflow   pod/ml-pipeline-persistenceagent-69485b9fdf-h6m45   2/2   Running   0   8m27s
kubeflow   pod/ml-pipeline-scheduledworkflow-947b7c7d6-dxskc   2/2   Running   0   8m27s
kubeflow   pod/ml-pipeline-ui-bdd4d5c64-4jv52   2/2   Running   0   8m26s
kubeflow   pod/ml-pipeline-viewer-crd-76f5b9875f-clpg8   2/2   Running   2   8m26s
kubeflow   pod/ml-pipeline-visualizationserver-78b8ff7b8-bgv6b   2/2   Running   0   8m26s
kubeflow   pod/notebook-controller-deployment-cf5cf5d6b-nrfzj   2/2   Running   2   8m26s
kubeflow   pod/profiles-deployment-54df459685-lm6qq   3/3   Running   2   8m26s
kubeflow   pod/tensorboard-controller-controller-manager-548766d976-zpfct   3/3   Running   2   8m25s
kubeflow   pod/tensorboards-web-app-deployment-5bf48cbc5f-4ztk2   1/1   Running   0   8m25s
kubeflow   pod/training-operator-6bdd889477-qzqxc   1/1   Running   0   8m25s
kubeflow   pod/volumes-web-app-deployment-7d66d69cb4-hqhbd   1/1   Running   0   8m25s
kubeflow   pod/workflow-controller-6d45bb6c59-qwmjj   2/2   Running   2   8m24s
main   pod/ml-pipeline-ui-artifact-7b65cb88fd-6z4xd   2/2   Running   0   10m
main   pod/ml-pipeline-visualizationserver-794f8cdfc6-rltk9   2/2   Running   0   10m

NAMESPACE   NAME   TYPE   CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
asm-system   service/canonical-service-controller-manager-metrics-service   ClusterIP   10.64.9.106   <none>   8443/TCP   18h
cert-manager   service/cert-manager   ClusterIP   10.64.1.250   <none>   9402/TCP   18h
cert-manager   service/cert-manager-webhook   ClusterIP   10.64.3.173   <none>   443/TCP   18h
default   service/kubernetes   ClusterIP   10.64.0.1   <none>   443/TCP   18h
gke-connect   service/gke-connect-monitoring   ClusterIP   10.64.15.140   <none>   8080/TCP   18h
istio-system   service/istio-ingressgateway   NodePort   10.64.14.211   <none>   15021:31223/TCP,80:31224/TCP,443:31980/TCP,15012:31277/TCP,15443:30616/TCP   18h
istio-system   service/istiod   ClusterIP   10.64.1.243   <none>   15010/TCP,15012/TCP,443/TCP,15014/TCP   18h
istio-system   service/istiod-asm-1143-0   ClusterIP   10.64.3.229   <none>   15010/TCP,15012/TCP,443/TCP,15014/TCP   18h
istio-system   service/knative-local-gateway   ClusterIP   10.64.11.188   <none>   80/TCP   18h
istio-system   service/whoami-app   ClusterIP   10.64.1.86   <none>   80/TCP   18h
knative-serving   service/activator-service   ClusterIP   10.64.15.61   <none>   9090/TCP,8008/TCP,80/TCP,81/TCP   18h
knative-serving   service/autoscaler   ClusterIP   10.64.4.209   <none>   9090/TCP,8008/TCP,8080/TCP   18h
knative-serving   service/autoscaler-bucket-00-of-01   ClusterIP   10.64.12.158   <none>   8080/TCP   18h
knative-serving   service/controller   ClusterIP   10.64.14.79   <none>   9090/TCP,8008/TCP   18h
knative-serving   service/istio-webhook   ClusterIP   10.64.11.81   <none>   9090/TCP,8008/TCP,443/TCP   18h
knative-serving   service/webhook   ClusterIP   10.64.9.210   <none>   9090/TCP,8008/TCP,443/TCP   18h
kube-system   service/default-http-backend   NodePort   10.64.13.124   <none>   80:30615/TCP   18h
kube-system   service/kube-dns   ClusterIP   10.64.0.10   <none>   53/UDP,53/TCP   18h
kube-system   service/metrics-server   ClusterIP   10.64.3.102   <none>   443/TCP   18h
kubeflow   service/admission-webhook-service   ClusterIP   10.64.0.108   <none>   443/TCP   18h
kubeflow   service/cache-server   ClusterIP   10.64.5.166   <none>   443/TCP   18h
kubeflow   service/centraldashboard   ClusterIP   10.64.4.19   <none>   80/TCP   18h
kubeflow   service/cloud-endpoints-controller   ClusterIP   10.64.7.52   <none>   80/TCP   18h
kubeflow   service/jupyter-web-app-service   ClusterIP   10.64.7.41   <none>   80/TCP   18h
kubeflow   service/katib-controller   ClusterIP   10.64.9.19   <none>   443/TCP,8080/TCP   18h
kubeflow   service/katib-db-manager   ClusterIP   10.64.7.114   <none>   6789/TCP   18h
kubeflow   service/katib-mysql   ClusterIP   10.64.13.27   <none>   3306/TCP   18h
kubeflow   service/katib-ui   ClusterIP   10.64.9.200   <none>   80/TCP   18h
kubeflow   service/kserve-controller-manager-metrics-service   ClusterIP   10.64.1.201   <none>   8443/TCP   18h
kubeflow   service/kserve-controller-manager-service   ClusterIP   10.64.9.44   <none>   443/TCP   18h
kubeflow   service/kserve-models-web-app   ClusterIP   10.64.8.30   <none>   80/TCP   18h
kubeflow   service/kserve-webhook-server-service   ClusterIP   10.64.2.229   <none>   443/TCP   18h
kubeflow   service/kubeflow-pipelines-profile-controller   ClusterIP   10.64.14.119   <none>   80/TCP   18h
kubeflow   service/metadata-envoy-service   ClusterIP   10.64.3.247   <none>   9090/TCP   18h
kubeflow   service/metadata-grpc-service   ClusterIP   10.64.13.81   <none>   8080/TCP   18h
kubeflow   service/minio-service   ClusterIP   10.64.10.174   <none>   9000/TCP   18h
kubeflow   service/ml-pipeline   ClusterIP   10.64.4.31   <none>   8888/TCP,8887/TCP   18h
kubeflow   service/ml-pipeline-ui   ClusterIP   10.64.5.30   <none>   80/TCP   18h
kubeflow   service/ml-pipeline-visualizationserver   ClusterIP   10.64.10.38   <none>   8888/TCP   18h
kubeflow   service/mysql   ClusterIP   10.64.14.12   <none>   3306/TCP   18h
kubeflow   service/notebook-controller-service   ClusterIP   10.64.12.46   <none>   443/TCP   18h
kubeflow   service/profiles-kfam   ClusterIP   10.64.9.136   <none>   8081/TCP   18h
kubeflow   service/tensorboard-controller-controller-manager-metrics-service   ClusterIP   10.64.1.173   <none>   8443/TCP   18h
kubeflow   service/tensorboards-web-app-service   ClusterIP   10.64.6.31   <none>   80/TCP   18h
kubeflow   service/training-operator   ClusterIP   10.64.4.29   <none>   8080/TCP   18h
kubeflow   service/volumes-web-app-service   ClusterIP   10.64.4.114   <none>   80/TCP   18h
kubeflow   service/workflow-controller-metrics   ClusterIP   10.64.2.50   <none>   9090/TCP   18h
main   service/ml-pipeline-ui-artifact   ClusterIP   10.64.12.30   <none>   80/TCP   18h
main   service/ml-pipeline-visualizationserver   ClusterIP   10.64.13.49   <none>   8888/TCP   18h

NAMESPACE   NAME   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-system   daemonset.apps/fluentbit-gke   2   2   2   2   2   kubernetes.io/os=linux   18h
kube-system   daemonset.apps/gke-metadata-server   2   2   2   2   2   beta.kubernetes.io/os=linux,iam.gke.io/gke-metadata-server-enabled=true   18h
kube-system   daemonset.apps/gke-metrics-agent   2   2   2   2   2   <none>   18h
kube-system   daemonset.apps/gke-metrics-agent-scaling-10   0   0   0   0   0   <none>   18h
kube-system   daemonset.apps/gke-metrics-agent-scaling-20   0   0   0   0   0   <none>   18h
kube-system   daemonset.apps/gke-metrics-agent-windows   0   0   0   0   0   kubernetes.io/os=windows   18h
kube-system   daemonset.apps/kube-proxy   0   0   0   0   0   kubernetes.io/os=linux,node.kubernetes.io/kube-proxy-ds-ready=true   18h
kube-system   daemonset.apps/metadata-proxy-v0.1   0   0   0   0   0   cloud.google.com/metadata-proxy-ready=true,kubernetes.io/os=linux   18h
kube-system   daemonset.apps/netd   2   2   2   2   2   cloud.google.com/gke-netd-ready=true,kubernetes.io/os=linux   18h
kube-system   daemonset.apps/nvidia-gpu-device-plugin   0   0   0   0   0   <none>   18h
kube-system   daemonset.apps/pdcsi-node   2   2   2   2   2   kubernetes.io/os=linux   18h
kube-system   daemonset.apps/pdcsi-node-windows   0   0   0   0   0   kubernetes.io/os=windows   18h

NAMESPACE   NAME   READY   UP-TO-DATE   AVAILABLE   AGE
asm-system   deployment.apps/canonical-service-controller-manager   1/1   1   1   18h
cert-manager   deployment.apps/cert-manager   1/1   1   1   18h
cert-manager   deployment.apps/cert-manager-cainjector   1/1   1   1   18h
cert-manager   deployment.apps/cert-manager-webhook   1/1   1   1   18h
gke-connect   deployment.apps/gke-connect-agent-20220812-00-00   1/1   1   1   18h
istio-system   deployment.apps/istio-ingressgateway   2/2   2   2   18h
istio-system   deployment.apps/istiod-asm-1143-0   2/2   2   2   18h
istio-system   deployment.apps/whoami-app   1/1   1   1   18h
knative-serving   deployment.apps/activator   1/1   1   1   18h
knative-serving   deployment.apps/autoscaler   1/1   1   1   18h
knative-serving   deployment.apps/controller   1/1   1   1   18h
knative-serving   deployment.apps/istio-webhook   1/1   1   1   18h
knative-serving   deployment.apps/networking-istio   1/1   1   1   18h
knative-serving   deployment.apps/webhook   1/1   1   1   18h
kube-system   deployment.apps/event-exporter-gke   1/1   1   1   18h
kube-system   deployment.apps/konnectivity-agent   2/2   2   2   18h
kube-system   deployment.apps/konnectivity-agent-autoscaler   1/1   1   1   18h
kube-system   deployment.apps/kube-dns   2/2   2   2   18h
kube-system   deployment.apps/kube-dns-autoscaler   1/1   1   1   18h
kube-system   deployment.apps/l7-default-backend   1/1   1   1   18h
kube-system   deployment.apps/metrics-server-v0.4.5   1/1   1   1   18h
kubeflow   deployment.apps/admission-webhook-deployment   1/1   1   1   18h
kubeflow   deployment.apps/cache-deployer-deployment   1/1   1   1   18h
kubeflow   deployment.apps/cache-server   1/1   1   1   18h
kubeflow   deployment.apps/centraldashboard   1/1   1   1   18h
kubeflow   deployment.apps/cloud-endpoints-controller   1/1   1   1   18h
kubeflow   deployment.apps/cloudsqlproxy   1/1   1   1   18h
kubeflow   deployment.apps/jupyter-web-app-deployment   1/1   1   1   18h
kubeflow   deployment.apps/katib-controller   1/1   1   1   18h
kubeflow   deployment.apps/katib-db-manager   1/1   1   1   18h
kubeflow   deployment.apps/katib-mysql   1/1   1   1   18h
kubeflow   deployment.apps/katib-ui   1/1   1   1   18h
kubeflow   deployment.apps/kserve-models-web-app   1/1   1   1   18h
kubeflow   deployment.apps/kubeflow-pipelines-profile-controller   1/1   1   1   18h
kubeflow   deployment.apps/metadata-envoy-deployment   1/1   1   1   18h
kubeflow   deployment.apps/metadata-grpc-deployment   1/1   1   1   18h
kubeflow   deployment.apps/metadata-writer   1/1   1   1   18h
kubeflow   deployment.apps/minio   1/1   1   1   18h
kubeflow   deployment.apps/ml-pipeline   1/1   1   1   18h
kubeflow   deployment.apps/ml-pipeline-persistenceagent   1/1   1   1   18h
kubeflow   deployment.apps/ml-pipeline-scheduledworkflow   1/1   1   1   18h
kubeflow   deployment.apps/ml-pipeline-ui   1/1   1   1   18h
kubeflow   deployment.apps/ml-pipeline-viewer-crd   1/1   1   1   18h
kubeflow   deployment.apps/ml-pipeline-visualizationserver   1/1   1   1   18h
kubeflow   deployment.apps/notebook-controller-deployment   1/1   1   1   18h
kubeflow   deployment.apps/profiles-deployment   1/1   1   1   18h
kubeflow   deployment.apps/tensorboard-controller-controller-manager   1/1   1   1   18h
kubeflow   deployment.apps/tensorboards-web-app-deployment   1/1   1   1   18h
kubeflow   deployment.apps/training-operator   1/1   1   1   18h
kubeflow   deployment.apps/volumes-web-app-deployment   1/1   1   1   18h
kubeflow   deployment.apps/workflow-controller   1/1   1   1   18h
main   deployment.apps/ml-pipeline-ui-artifact   1/1   1   1   18h
main   deployment.apps/ml-pipeline-visualizationserver   1/1   1   1   18h

NAMESPACE NAME DESIRED CURRENT READY AGE asm-system replicaset.apps/canonical-service-controller-manager-67c8f5fff5 0 0 0 18h asm-system replicaset.apps/canonical-service-controller-manager-7f7888d9b6 1 1 1 18h cert-manager replicaset.apps/cert-manager-76b7c557d5 1 1 1 18h cert-manager replicaset.apps/cert-manager-cainjector-655d695d74 1 1 1 18h cert-manager replicaset.apps/cert-manager-webhook-7955b9bb97 1 1 1 18h gke-connect replicaset.apps/gke-connect-agent-20220812-00-00-79dd5945b9 1 1 1 18h istio-system replicaset.apps/istio-ingressgateway-6f97cb7d8b 0 0 0 101m istio-system replicaset.apps/istio-ingressgateway-7bdb677664 0 0 0 18h istio-system replicaset.apps/istio-ingressgateway-7c7dc89796 0 0 0 18h istio-system replicaset.apps/istio-ingressgateway-7f9667b55f 0 0 0 15h istio-system replicaset.apps/istio-ingressgateway-86b848cfdf 2 2 2 8m10s istio-system replicaset.apps/istiod-asm-1143-0-5659fc9bcc 0 0 0 101m istio-system replicaset.apps/istiod-asm-1143-0-6b87c9dc99 0 0 0 18h istio-system replicaset.apps/istiod-asm-1143-0-6b8d4998f8 0 0 0 15h istio-system replicaset.apps/istiod-asm-1143-0-8c89475d8 2 2 2 8m9s istio-system replicaset.apps/whoami-app-5458468844 0 0 0 15h istio-system replicaset.apps/whoami-app-7f6f84d4fd 1 1 1 8m9s istio-system replicaset.apps/whoami-app-845c7fdc4 0 0 0 18h istio-system replicaset.apps/whoami-app-96bc6547 0 0 0 101m knative-serving replicaset.apps/activator-5545fc58f5 0 0 0 18h knative-serving replicaset.apps/activator-5754c5ff55 0 0 0 18h knative-serving replicaset.apps/activator-59f44c8c96 0 0 0 17h knative-serving replicaset.apps/activator-5b9685c6f5 0 0 0 101m knative-serving replicaset.apps/activator-6b5c86d468 1 1 1 8m16s knative-serving replicaset.apps/autoscaler-58fc8d57d5 0 0 0 18h knative-serving replicaset.apps/autoscaler-68954b89b5 0 0 0 18h knative-serving replicaset.apps/autoscaler-6c6cb57797 0 0 0 101m knative-serving replicaset.apps/autoscaler-7666b785d 0 0 0 17h knative-serving 
replicaset.apps/autoscaler-7f67bb5d9d 1 1 1 8m16s knative-serving replicaset.apps/controller-5c8754f49c 0 0 0 101m knative-serving replicaset.apps/controller-669cb6bdf7 0 0 0 18h knative-serving replicaset.apps/controller-67d59f5b74 0 0 0 17h knative-serving replicaset.apps/controller-7bf7955dbf 0 0 0 18h knative-serving replicaset.apps/controller-7c6996c98b 1 1 1 8m15s knative-serving replicaset.apps/istio-webhook-5f876d5c85 0 0 0 18h knative-serving replicaset.apps/istio-webhook-64499db84f 0 0 0 101m knative-serving replicaset.apps/istio-webhook-7cdd88b69d 1 1 1 8m15s knative-serving replicaset.apps/istio-webhook-857cfbb6b5 0 0 0 18h knative-serving replicaset.apps/istio-webhook-868d87b8b 0 0 0 17h knative-serving replicaset.apps/networking-istio-55bcdd9748 0 0 0 101m knative-serving replicaset.apps/networking-istio-6bbc6b9664 0 0 0 18h knative-serving replicaset.apps/networking-istio-6db46fc6d 0 0 0 18h knative-serving replicaset.apps/networking-istio-7bf99bf67f 1 1 1 8m15s knative-serving replicaset.apps/networking-istio-dfddbd67f 0 0 0 17h knative-serving replicaset.apps/webhook-6946b99875 0 0 0 18h knative-serving replicaset.apps/webhook-6cdb5f76bc 0 0 0 18h knative-serving replicaset.apps/webhook-6dfb5bb978 0 0 0 17h knative-serving replicaset.apps/webhook-7995d94ff4 1 1 1 8m15s knative-serving replicaset.apps/webhook-79fc7579cf 0 0 0 101m kube-system replicaset.apps/event-exporter-gke-5479fd58c8 1 1 1 18h kube-system replicaset.apps/konnectivity-agent-65d98fb675 2 2 2 18h kube-system replicaset.apps/konnectivity-agent-autoscaler-6b86f667c9 1 1 1 18h kube-system replicaset.apps/kube-dns-697dc8fc8b 2 2 2 18h kube-system replicaset.apps/kube-dns-autoscaler-844c9d9448 1 1 1 18h kube-system replicaset.apps/l7-default-backend-69fb9fd9f9 1 1 1 18h kube-system replicaset.apps/metrics-server-v0.4.5-f449bcfcd 0 0 0 18h kube-system replicaset.apps/metrics-server-v0.4.5-fb4c49dd6 1 1 1 18h kubeflow replicaset.apps/admission-webhook-deployment-5546f5c4d8 1 1 1 8m31s 
kubeflow replicaset.apps/admission-webhook-deployment-5f8b94895c 0 0 0 106m
kubeflow replicaset.apps/admission-webhook-deployment-65fd7fbb7f 0 0 0 17h
kubeflow replicaset.apps/admission-webhook-deployment-75cd49bc46 0 0 0 17h
kubeflow replicaset.apps/admission-webhook-deployment-7df7558c67 0 0 0 18h
kubeflow replicaset.apps/cache-deployer-deployment-54d85b57d 0 0 0 18h
kubeflow replicaset.apps/cache-deployer-deployment-55df4899c 0 0 0 17h
kubeflow replicaset.apps/cache-deployer-deployment-56c9b7c4df 1 1 1 7m49s
kubeflow replicaset.apps/cache-deployer-deployment-69d7766cb6 0 0 0 17h
kubeflow replicaset.apps/cache-deployer-deployment-74ffb799d 0 0 0 106m
kubeflow replicaset.apps/cache-deployer-deployment-7598788cc5 0 0 0 16h
kubeflow replicaset.apps/cache-server-5456d9b457 0 0 0 16h
kubeflow replicaset.apps/cache-server-54b5779886 0 0 0 17h
kubeflow replicaset.apps/cache-server-597dfdddf6 0 0 0 106m
kubeflow replicaset.apps/cache-server-6556bc85c5 1 1 1 8m31s
kubeflow replicaset.apps/cache-server-6c8d85d8df 0 0 0 17h
kubeflow replicaset.apps/cache-server-7c6bdf66d 0 0 0 18h
kubeflow replicaset.apps/centraldashboard-649484bbf7 0 0 0 18h
kubeflow replicaset.apps/centraldashboard-694fdfc45c 0 0 0 17h
kubeflow replicaset.apps/centraldashboard-7fd575999 0 0 0 106m
kubeflow replicaset.apps/centraldashboard-8c6c44664 1 1 1 8m31s
kubeflow replicaset.apps/centraldashboard-f88c48446 0 0 0 17h
kubeflow replicaset.apps/cloud-endpoints-controller-66fd7b74f 1 1 1 8m31s
kubeflow replicaset.apps/cloud-endpoints-controller-76995d7795 0 0 0 17h
kubeflow replicaset.apps/cloud-endpoints-controller-778598bb4b 0 0 0 18h
kubeflow replicaset.apps/cloud-endpoints-controller-7b96974dd6 0 0 0 17h
kubeflow replicaset.apps/cloud-endpoints-controller-b88b84bfb 0 0 0 106m
kubeflow replicaset.apps/cloudsqlproxy-59487dbf4 1 1 1 8m30s
kubeflow replicaset.apps/cloudsqlproxy-7668fd76b9 0 0 0 17h
kubeflow replicaset.apps/cloudsqlproxy-8996c6464 0 0 0 17h
kubeflow replicaset.apps/cloudsqlproxy-9d658867f 0 0 0 106m
kubeflow replicaset.apps/cloudsqlproxy-dfc75bcf8 0 0 0 18h
kubeflow replicaset.apps/jupyter-web-app-deployment-68bfddf8b9 0 0 0 17h
kubeflow replicaset.apps/jupyter-web-app-deployment-74b97c58b7 0 0 0 106m
kubeflow replicaset.apps/jupyter-web-app-deployment-8585965464 0 0 0 17h
kubeflow replicaset.apps/jupyter-web-app-deployment-979bbdc74 1 1 1 8m30s
kubeflow replicaset.apps/jupyter-web-app-deployment-f5c5b7785 0 0 0 18h
kubeflow replicaset.apps/katib-controller-58ddb4b856 0 0 0 18h
kubeflow replicaset.apps/katib-controller-7cccd58948 0 0 0 17h
kubeflow replicaset.apps/katib-controller-84b965666f 0 0 0 17h
kubeflow replicaset.apps/katib-controller-868bbc7774 0 0 0 106m
kubeflow replicaset.apps/katib-controller-88d7d6947 1 1 1 8m30s
kubeflow replicaset.apps/katib-db-manager-596f4c456d 1 1 1 8m30s
kubeflow replicaset.apps/katib-db-manager-74bc76d569 0 0 0 106m
kubeflow replicaset.apps/katib-db-manager-75977c6878 0 0 0 17h
kubeflow replicaset.apps/katib-db-manager-c75cd89ff 0 0 0 17h
kubeflow replicaset.apps/katib-db-manager-d77c6757f 0 0 0 18h
kubeflow replicaset.apps/katib-mysql-55bc9d78dc 0 0 0 17h
kubeflow replicaset.apps/katib-mysql-6578c74d86 0 0 0 106m
kubeflow replicaset.apps/katib-mysql-6bd9c6f66f 1 1 1 8m15s
kubeflow replicaset.apps/katib-mysql-7894994f88 0 0 0 18h
kubeflow replicaset.apps/katib-mysql-7996578464 0 0 0 17h
kubeflow replicaset.apps/katib-ui-575dc68dcc 1 1 1 8m30s
kubeflow replicaset.apps/katib-ui-5946cdff7d 0 0 0 106m
kubeflow replicaset.apps/katib-ui-5c589dd8f5 0 0 0 17h
kubeflow replicaset.apps/katib-ui-9dd99d56d 0 0 0 17h
kubeflow replicaset.apps/katib-ui-f787b9d88 0 0 0 18h
kubeflow replicaset.apps/kserve-models-web-app-5cf4f7bbbc 0 0 0 18h
kubeflow replicaset.apps/kserve-models-web-app-6585b45f57 0 0 0 106m
kubeflow replicaset.apps/kserve-models-web-app-677459969d 1 1 1 8m29s
kubeflow replicaset.apps/kserve-models-web-app-6867995467 0 0 0 17h
kubeflow replicaset.apps/kserve-models-web-app-698f6c484 0 0 0 17h
kubeflow replicaset.apps/kubeflow-pipelines-profile-controller-54954d4d99 0 0 0 106m
kubeflow replicaset.apps/kubeflow-pipelines-profile-controller-549866d789 0 0 0 17h
kubeflow replicaset.apps/kubeflow-pipelines-profile-controller-59ff4c9db4 0 0 0 17h
kubeflow replicaset.apps/kubeflow-pipelines-profile-controller-7b997798f4 1 1 1 8m29s
kubeflow replicaset.apps/kubeflow-pipelines-profile-controller-7cd769d5c7 0 0 0 18h
kubeflow replicaset.apps/metadata-envoy-deployment-59bd74556b 1 1 1 8m29s
kubeflow replicaset.apps/metadata-envoy-deployment-5d6876d479 0 0 0 17h
kubeflow replicaset.apps/metadata-envoy-deployment-5d98466d4d 0 0 0 16h
kubeflow replicaset.apps/metadata-envoy-deployment-688dbc54b8 0 0 0 18h
kubeflow replicaset.apps/metadata-envoy-deployment-7b8887b568 0 0 0 17h
kubeflow replicaset.apps/metadata-envoy-deployment-8495895577 0 0 0 106m
kubeflow replicaset.apps/metadata-grpc-deployment-5cbf4744fb 1 1 1 8m29s
kubeflow replicaset.apps/metadata-grpc-deployment-5f8bc99f79 0 0 0 106m
kubeflow replicaset.apps/metadata-grpc-deployment-657c7584c7 0 0 0 17h
kubeflow replicaset.apps/metadata-grpc-deployment-748d6576d6 0 0 0 17h
kubeflow replicaset.apps/metadata-grpc-deployment-7875bcdd58 0 0 0 18h
kubeflow replicaset.apps/metadata-writer-5b4cc4cf9b 0 0 0 16h
kubeflow replicaset.apps/metadata-writer-64486b867 0 0 0 17h
kubeflow replicaset.apps/metadata-writer-6bd6475c7d 1 1 1 8m29s
kubeflow replicaset.apps/metadata-writer-75664f5f45 0 0 0 106m
kubeflow replicaset.apps/metadata-writer-77bcfbc888 0 0 0 17h
kubeflow replicaset.apps/metadata-writer-7d4f598847 0 0 0 18h
kubeflow replicaset.apps/minio-58b88c9986 1 1 1 8m17s
kubeflow replicaset.apps/minio-6cb79b6f96 0 0 0 17h
kubeflow replicaset.apps/minio-6f4b695759 0 0 0 17h
kubeflow replicaset.apps/minio-7dd7b89877 0 0 0 106m
kubeflow replicaset.apps/minio-8bf5bf9fb 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-545b685845 0 0 0 106m
kubeflow replicaset.apps/ml-pipeline-5d88988849 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-5fd5667684 1 1 1 8m28s
kubeflow replicaset.apps/ml-pipeline-6469cb44f8 0 0 0 16h
kubeflow replicaset.apps/ml-pipeline-667795bf7b 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-ffcd698d 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-persistenceagent-67dc5777c5 0 0 0 16h
kubeflow replicaset.apps/ml-pipeline-persistenceagent-69485b9fdf 1 1 1 8m28s
kubeflow replicaset.apps/ml-pipeline-persistenceagent-6cfdd6cdcb 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-persistenceagent-76d9798d6d 0 0 0 106m
kubeflow replicaset.apps/ml-pipeline-persistenceagent-7f5c774cdf 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-persistenceagent-dc75c5885 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-scheduledworkflow-557f8db876 0 0 0 16h
kubeflow replicaset.apps/ml-pipeline-scheduledworkflow-5947dbbbb7 0 0 0 106m
kubeflow replicaset.apps/ml-pipeline-scheduledworkflow-695c8ff7c8 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-scheduledworkflow-6d88cc9b9d 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-scheduledworkflow-7c47fbb695 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-scheduledworkflow-947b7c7d6 1 1 1 8m28s
kubeflow replicaset.apps/ml-pipeline-ui-7654fcc79c 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-ui-7d6bcdc666 0 0 0 16h
kubeflow replicaset.apps/ml-pipeline-ui-9c749c84c 0 0 0 106m
kubeflow replicaset.apps/ml-pipeline-ui-b8f769474 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-ui-bdd4d5c64 1 1 1 8m27s
kubeflow replicaset.apps/ml-pipeline-ui-d9b7c55c4 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-viewer-crd-596bf7fffd 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-viewer-crd-644846784f 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-viewer-crd-685978bb49 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-viewer-crd-6f8fc78c44 0 0 0 106m
kubeflow replicaset.apps/ml-pipeline-viewer-crd-76f5b9875f 1 1 1 8m27s
kubeflow replicaset.apps/ml-pipeline-viewer-crd-bcbf785bf 0 0 0 16h
kubeflow replicaset.apps/ml-pipeline-visualizationserver-5688958645 0 0 0 17h
kubeflow replicaset.apps/ml-pipeline-visualizationserver-59b6d459f4 0 0 0 18h
kubeflow replicaset.apps/ml-pipeline-visualizationserver-5cb965b759 0 0 0 106m
kubeflow replicaset.apps/ml-pipeline-visualizationserver-78b8ff7b8 1 1 1 8m27s
kubeflow replicaset.apps/ml-pipeline-visualizationserver-85ddfb48fd 0 0 0 16h
kubeflow replicaset.apps/ml-pipeline-visualizationserver-c5c94b766 0 0 0 17h
kubeflow replicaset.apps/notebook-controller-deployment-644b9476b4 0 0 0 18h
kubeflow replicaset.apps/notebook-controller-deployment-6bb4778b97 0 0 0 17h
kubeflow replicaset.apps/notebook-controller-deployment-7cf58c8fd8 0 0 0 17h
kubeflow replicaset.apps/notebook-controller-deployment-bbfb45bc5 0 0 0 106m
kubeflow replicaset.apps/notebook-controller-deployment-cf5cf5d6b 1 1 1 8m27s
kubeflow replicaset.apps/profiles-deployment-54df459685 1 1 1 8m27s
kubeflow replicaset.apps/profiles-deployment-56dc78496b 0 0 0 17h
kubeflow replicaset.apps/profiles-deployment-66bd55784d 0 0 0 106m
kubeflow replicaset.apps/profiles-deployment-678f85fb94 0 0 0 18h
kubeflow replicaset.apps/profiles-deployment-6c5c57875d 0 0 0 18h
kubeflow replicaset.apps/profiles-deployment-79fd88cf5 0 0 0 17h
kubeflow replicaset.apps/tensorboard-controller-controller-manager-548766d976 1 1 1 8m26s
kubeflow replicaset.apps/tensorboard-controller-controller-manager-58dcf6966b 0 0 0 17h
kubeflow replicaset.apps/tensorboard-controller-controller-manager-6699d9ff77 0 0 0 106m
kubeflow replicaset.apps/tensorboard-controller-controller-manager-6848cb6846 0 0 0 18h
kubeflow replicaset.apps/tensorboard-controller-controller-manager-6fd5d98bc 0 0 0 17h
kubeflow replicaset.apps/tensorboards-web-app-deployment-5795789fbd 0 0 0 106m
kubeflow replicaset.apps/tensorboards-web-app-deployment-5bf48cbc5f 1 1 1 8m26s
kubeflow replicaset.apps/tensorboards-web-app-deployment-5fd9dbfc86 0 0 0 17h
kubeflow replicaset.apps/tensorboards-web-app-deployment-6d9b97fcf8 0 0 0 18h
kubeflow replicaset.apps/tensorboards-web-app-deployment-845586df69 0 0 0 17h
kubeflow replicaset.apps/training-operator-5c5d5767bb 0 0 0 17h
kubeflow replicaset.apps/training-operator-645d5fb87f 0 0 0 17h
kubeflow replicaset.apps/training-operator-6bdd889477 1 1 1 8m26s
kubeflow replicaset.apps/training-operator-6bfc7b8d86 0 0 0 18h
kubeflow replicaset.apps/training-operator-77dd7ccb4d 0 0 0 106m
kubeflow replicaset.apps/volumes-web-app-deployment-56f77c598f 0 0 0 106m
kubeflow replicaset.apps/volumes-web-app-deployment-597887dd67 0 0 0 18h
kubeflow replicaset.apps/volumes-web-app-deployment-77fb8b69dc 0 0 0 17h
kubeflow replicaset.apps/volumes-web-app-deployment-7d66d69cb4 1 1 1 8m26s
kubeflow replicaset.apps/volumes-web-app-deployment-cfffcd786 0 0 0 17h
kubeflow replicaset.apps/workflow-controller-5dc4865c68 0 0 0 106m
kubeflow replicaset.apps/workflow-controller-677cb6b55d 0 0 0 17h
kubeflow replicaset.apps/workflow-controller-68cdd68b77 0 0 0 18h
kubeflow replicaset.apps/workflow-controller-6d45bb6c59 1 1 1 8m25s
kubeflow replicaset.apps/workflow-controller-7f8d665fb9 0 0 0 17h
main replicaset.apps/ml-pipeline-ui-artifact-575d66b4f8 0 0 0 96m
main replicaset.apps/ml-pipeline-ui-artifact-5c684495 0 0 0 103m
main replicaset.apps/ml-pipeline-ui-artifact-77f8dbc58c 0 0 0 17h
main replicaset.apps/ml-pipeline-ui-artifact-785478fb95 0 0 0 17h
main replicaset.apps/ml-pipeline-ui-artifact-7b65cb88fd 1 1 1 10m
main replicaset.apps/ml-pipeline-ui-artifact-7bb7bfc45b 0 0 0 17h
main replicaset.apps/ml-pipeline-ui-artifact-8669c884c8 0 0 0 15h
main replicaset.apps/ml-pipeline-ui-artifact-d57bd98d7 0 0 0 18h
main replicaset.apps/ml-pipeline-visualizationserver-58dbc6fcc8 0 0 0 17h
main replicaset.apps/ml-pipeline-visualizationserver-65f5bfb4bf 0 0 0 18h
main replicaset.apps/ml-pipeline-visualizationserver-664fc77cb9 0 0 0 17h
main replicaset.apps/ml-pipeline-visualizationserver-6c74997489 0 0 0 17h
main replicaset.apps/ml-pipeline-visualizationserver-794f8cdfc6 1 1 1 10m
main replicaset.apps/ml-pipeline-visualizationserver-7cfdbfbccb 0 0 0 96m
main replicaset.apps/ml-pipeline-visualizationserver-7f769ddb97 0 0 0 103m
main replicaset.apps/ml-pipeline-visualizationserver-d74b66984 0 0 0 15h

NAMESPACE NAME READY AGE
kubeflow statefulset.apps/kserve-controller-manager 1/1 18h
kubeflow statefulset.apps/metacontroller 1/1 18h

NAMESPACE NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
istio-system horizontalpodautoscaler.autoscaling/istio-ingressgateway Deployment/istio-ingressgateway 6%/80% 2 5 2 18h
istio-system horizontalpodautoscaler.autoscaling/istiod-asm-1143-0 Deployment/istiod-asm-1143-0 0%/80% 2 5 2 18h
knative-serving horizontalpodautoscaler.autoscaling/activator Deployment/activator 1%/100% 1 20 1 18h
knative-serving horizontalpodautoscaler.autoscaling/webhook Deployment/webhook 7%/100% 1 5 1 18h

pablofiumara commented 2 years ago

I have just executed

kubectl patch service -n istio-system istio-ingressgateway --type='json' -p='[{"op": "remove", "path": "/spec/selector/service.istio.io~1canonical-revision", "value": "asm-1143-0"}]'

and now I am able to access the Kubeflow dashboard. It seems that command does not apply to Kubeflow clusters.
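For context on the `~1` in the patch path above: JSON Patch paths are JSON Pointers (RFC 6901), in which a `/` inside a key is written as `~1` and a `~` as `~0`. Below is a minimal sketch of building that path in a shell. The helper `json_pointer_escape` and the variable names `key` and `path` are illustrative only and do not come from the thread or from any kubectl tooling.

```shell
#!/bin/sh
# Escape a single JSON Pointer reference token (RFC 6901).
# Order matters: "~" must be escaped before "/" so that the "~1"
# produced for "/" is not escaped a second time.
# json_pointer_escape is a hypothetical helper name.
json_pointer_escape() {
  printf '%s' "$1" | sed -e 's/~/~0/g' -e 's#/#~1#g'
}

# Selector label key taken from the patch command in this thread.
key='service.istio.io/canonical-revision'
path="/spec/selector/$(json_pointer_escape "$key")"

echo "$path"
```

The resulting string is what goes in the `path` member of the patch. Per RFC 6902, members that are not defined for an operation are ignored, so a `remove` operation should not need the `value` member.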

@gkcalat Apart from a cluster with Kubeflow 1.5 installed, I need to upgrade ASM for two clusters with Kubeflow 1.3 installed. How would you test that on GCP? If I try to create a new Kubeflow 1.3 cluster on GCP, GCP does not allow me to do that because Kubeflow 1.3 was made for an older Kubernetes version.

gkcalat commented 2 years ago

I am glad we sorted out this issue. I am closing this for now.

As for the KF 1.3 upgrade, you can read this page to get a better understanding of the ASM upgrade process and its limitations. Next, you can try deploying KF 1.5 on GKE 1.21, but with the ASM version adjusted to match KF 1.3. We used ASM 1.9.3-asm.2+config2 in KF 1.3. Note that the ASM installation process differs across these versions (install_asm vs. asmcli).

As a side note, I would highly recommend upgrading Kubeflow.

pablofiumara commented 2 years ago

@gkcalat Thank you very much. While upgrading from Kubeflow 1.3 to 1.5, I am getting the following errors: https://github.com/kubeflow/gcp-blueprints/issues/382