GoogleCloudPlatform / llm-pipeline-examples

Apache License 2.0

GKE cluster creation fails #43

Closed abdallag closed 1 year ago

abdallag commented 1 year ago

Errors:

rovisioning cluster...
2023-06-14 15:32:09.488 PDT Fetching cluster endpoint and auth data.
2023-06-14 15:32:09.733 PDT kubeconfig entry generated for gke-inference-cluster-gke.
2023-06-14 15:32:09.952 PDT Deploying predict image to cluster
2023-06-14 15:32:13.579 PDT E0614 22:32:13.577621 789 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2023-06-14 15:32:13.637 PDT E0614 22:32:13.635635 789 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2023-06-14 15:32:13.874 PDT deployment.apps/flan-t5-base-deployment configured
2023-06-14 15:32:13.906 PDT service/flan-t5-base unchanged
2023-06-14 15:32:14.034 PDT E0614 22:32:14.032452 817 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2023-06-14 15:32:14.043 PDT E0614 22:32:14.041722 817 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2023-06-14 15:32:14.048 PDT E0614 22:32:14.046914 817 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2023-06-14 15:32:14.053 PDT E0614 22:32:14.051580 817 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2023-06-14 15:32:14.097 PDT error: no matching resources found
2023-06-14 15:32:14.112 PDT Container called exit(1).

Chris113113 commented 1 year ago

The "couldn't get resource list for metrics.k8s.io/v1beta1" errors are a red herring caused by a firewall being present. The cluster itself spun up successfully and accepted the deployment; it is only the kubernetes/watch API that is failing to track progress.
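To make it easier to see what is left once the red-herring lines are set aside, here is a minimal Python sketch that splits a log into the metrics.k8s.io resource-list errors and everything else. The function name `split_log` and the `NOISE` pattern are hypothetical helpers written for this thread, not part of llm-pipeline-examples or kubectl, and the pattern only assumes the line shape seen in the log above.

```python
import re

# Assumed shape of the noisy lines, taken from the log in this issue:
# "... memcache.go:<line>] couldn't get resource list for metrics.k8s.io/..."
NOISE = re.compile(
    r"memcache\.go:\d+\] couldn't get resource list for metrics\.k8s\.io/"
)


def split_log(lines):
    """Return (noise, remaining): noise holds the resource-list errors."""
    noise, remaining = [], []
    for line in lines:
        if NOISE.search(line):
            noise.append(line)
        else:
            remaining.append(line)
    return noise, remaining


# A few lines copied from the log in this issue.
sample = [
    "2023-06-14 15:32:13.579 PDT E0614 22:32:13.577621 789 memcache.go:287] "
    "couldn't get resource list for metrics.k8s.io/v1beta1: "
    "the server is currently unable to handle the request",
    "2023-06-14 15:32:13.874 PDT deployment.apps/flan-t5-base-deployment configured",
    "2023-06-14 15:32:13.906 PDT service/flan-t5-base unchanged",
    "2023-06-14 15:32:14.043 PDT E0614 22:32:14.041722 817 memcache.go:121] "
    "couldn't get resource list for metrics.k8s.io/v1beta1: "
    "the server is currently unable to handle the request",
    "2023-06-14 15:32:14.097 PDT error: no matching resources found",
]

noise, remaining = split_log(sample)
for line in remaining:
    print(line)
```

With the noise filtered out, the remaining lines show the deployment and service being applied, followed by the "no matching resources found" error, which is the line that precedes the container exit.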