GoogleCloudPlatform / kubernetes-engine-samples

Sample applications for Google Kubernetes Engine (GKE)
https://cloud.google.com/kubernetes-engine/docs/tutorials/
Apache License 2.0

custom-metrics-autoscaling/direct-to-sd/custom-metrics-sd.yaml deployment failed #385

Closed rick-c-goog closed 2 years ago

rick-c-goog commented 2 years ago

I tried to run the direct-to-sd custom-metric HPA example. Out of the box, it failed with: unable to start container process: exec: "./sd-dummy-exporter": stat ./sd-dummy-exporter: no such file or directory: unknown

I then built the image myself, pushed it to Artifact Registry, and updated custom-metrics-sd.yaml to use my own image, but I still got the same error.

NimJay commented 2 years ago

@rick-c-goog Thank you for reporting this issue. Just to clarify, is this the tutorial you followed: cloud.google.com's "Optimize Pod autoscaling based on metrics"?

rick-c-goog commented 2 years ago

Yes. I believe I have resolved the issue now. The fix is in this file: https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/main/custom-metrics-autoscaling/direct-to-sd/custom-metrics-sd.yaml

The line - command: ["./sd-dummy-exporter"] should be changed to:

With the change above, the tutorial works on GKE Standard. It still fails on GKE Autopilot, though, even after I followed the Workload Identity instructions from https://github.com/GoogleCloudPlatform/k8s-stackdriver/tree/master/custom-metrics-stackdriver-adapter

The adapter's own logs show no errors, but "kubectl logs po custom-metric-sd-7cb9c87cf9-p448f" shows a service account permission error, even though the GSA already has all the needed permissions. Error: Failed to write time series data for new resource model: googleapi: Error 403: Permission monitoring.timeSeries.create denied (or the resource may not exist)., forbidden

Any idea why the same example won't work on GKE Autopilot?

bourgeoisor commented 2 years ago

The original issue was fixed here: https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/pull/401

Let us know if you still run into an issue with GKE Autopilot!
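
For anyone still seeing the 403 on Autopilot, the following is a rough sketch of checks that might help narrow it down. It rests on an assumption that is not confirmed in this thread: that the exporter Pod runs under a Kubernetes service account (KSA) that is different from the adapter's, and that this KSA is not bound through Workload Identity to a Google service account (GSA) that holds a role containing monitoring.timeSeries.create, such as roles/monitoring.metricWriter. The function names and all arguments (project ID, namespace, KSA name, GSA email) are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: every argument passed to these helpers is a placeholder you supply.

# Build the IAM member string that Workload Identity uses for a KSA.
wi_member() {
  local project="$1" namespace="$2" ksa="$3"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$project" "$namespace" "$ksa"
}

# Print the GSA email the KSA is annotated with (empty output: no annotation).
ksa_annotation() {
  local namespace="$1" ksa="$2"
  kubectl get serviceaccount "$ksa" --namespace "$namespace" \
    --output 'jsonpath={.metadata.annotations.iam\.gke\.io/gcp-service-account}'
}

# Show the IAM policy on the GSA itself; look for roles/iam.workloadIdentityUser
# granted to the member string printed by wi_member.
gsa_policy() {
  local gsa_email="$1"
  gcloud iam service-accounts get-iam-policy "$gsa_email" --format=json
}

# List the project-level roles granted to the GSA; a role that includes
# monitoring.timeSeries.create (e.g. roles/monitoring.metricWriter) should appear.
gsa_project_roles() {
  local project="$1" gsa_email="$2"
  gcloud projects get-iam-policy "$project" \
    --flatten='bindings[].members' \
    --filter="bindings.members:serviceAccount:${gsa_email}" \
    --format='value(bindings.role)'
}
```

A possible way to use it: run ksa_annotation for the namespace and KSA that the custom-metric-sd Pod actually uses, compare the result with the GSA you granted permissions to, then confirm that gsa_policy lists the string printed by wi_member for that same namespace and KSA.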