google / knative-gcp

GCP event implementations to use with Knative Eventing.
https://github.com/knative/eventing
Apache License 2.0
159 stars 75 forks source link

knative fails on private cluster because http://169.254.169.254/computeMetadata times out #2178

Closed Miles-Ahead-Digital closed 3 years ago

Miles-Ahead-Digital commented 3 years ago

Describe the bug knative fails on private cluster with workload identity because http://169.254.169.254/computeMetadata times out.

On a fresh installation knative works, but if you resize the node-pool to 0 and then back to 3 you cant access the Metadata anymore.

Expected behavior knative works

To Reproduce On the pod you can reproduce it: curl -H 'Metadata-Flavor: Google' 'http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token' curl: (7) Failed to connect to 169.254.169.254 port 80: Connection timed out

Knative-GCP release version 0.21

Additional context I noticed the prometheus-to-sd container in the kube-dns pod crashed with the same error (the same also for the event-exporter-gke pod - both in the kube system namespace).

Miles-Ahead-Digital commented 3 years ago

I rebuild the cluster and tried to reproduce the error be scaling the cluster to 0 and back to 3 nodes. But right now it still works.. The only difference is that I use Kubernetes 1.18.15-gke.1102 (instead of 1.20 before).

I close this message, but I would be happy if someone adds a procedure how to analyze the error because I have the feeling that it will occur again..

Stefan