philips closed 5 years ago
On a hunch I enabled the GKE Workload Identity service on my cluster, and now I am getting:
level=info ts=2019-08-04T04:51:33.271865429Z caller=main.go:296 msg="Starting Stackdriver Prometheus sidecar" version="(version=HEAD, branch=master, revision=453838cff46ee8a17f7675696a97256475bb39e7)"
level=info ts=2019-08-04T04:51:33.272237734Z caller=main.go:297 build_context="(go=go1.12, user=kbuilder@kokoro-gcp-ubuntu-prod-1535194210, date=20190520-14:47:15)"
level=info ts=2019-08-04T04:51:33.272354147Z caller=main.go:298 host_details="(Linux 4.14.127+ #1 SMP Tue Jun 18 23:08:40 PDT 2019 x86_64 prometheus-prometheus-0 (none))"
level=info ts=2019-08-04T04:51:33.272482549Z caller=main.go:299 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-08-04T04:51:33.285847047Z caller=main.go:564 msg="Web server started"
level=info ts=2019-08-04T04:51:33.286903601Z caller=main.go:545 msg="Stackdriver client started"
level=info ts=2019-08-04T04:52:36.290058215Z caller=manager.go:153 component="Prometheus reader" msg="Starting Prometheus reader..."
level=info ts=2019-08-04T04:52:36.319815836Z caller=manager.go:215 component="Prometheus reader" msg="reached first record after start offset" start_offset=0 skipped_records=0
level=warn ts=2019-08-04T04:52:37.962598185Z caller=queue_manager.go:546 component=queue_manager msg="Unrecoverable error sending samples to remote storage" err="rpc error: code = Unauthenticated desc = Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project."
@philips thanks for the report and for the extra information about Workload Identity. In both cases I see credential errors in the logs you posted.
The first error indicates that the service account doesn't have the right permissions. See the instructions here on how to set it up correctly: https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster#use_least_privilege_sa
The second error indicates that the Stackdriver Prometheus integration cannot find credentials using Application Default Credentials. If the link above doesn't help you solve this issue, please see https://cloud.google.com/docs/authentication/production
I also see that Stackdriver may use the node's service account while GKE Workload Identity is in beta. I'm not sure whether that applies to the Prometheus integration, but it is something to keep in mind: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#limitations
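One way to see which identity Application Default Credentials resolve to is to ask the metadata server from inside the pod. This is a minimal sketch, assuming `curl` is available in the container (or in a throwaway pod that uses the same Kubernetes service account). The function names `active_gsa` and `check_gsa` are made up for illustration.

```shell
# Sketch: print the Google service account that the pod's credentials map to.
# Assumes it runs inside a GKE pod and that curl is installed there.
active_gsa() {
  curl -sf -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"
}

# check_gsa EXPECTED_EMAIL
# Compares the active identity with the one you expect the sidecar to use.
check_gsa() {
  expected="$1"
  actual="$(active_gsa)" || { echo "metadata server unreachable"; return 2; }
  if [ "$actual" = "$expected" ]; then
    echo "ok: running as $actual"
  else
    echo "mismatch: running as $actual, expected $expected"
    return 1
  fi
}
```

For example, `check_gsa gke-prometheus@my-project.iam.gserviceaccount.com` reports a mismatch if the pod is still picking up the node's service account.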
@jkohen Thanks for your help.
With fresh eyes this morning I noticed the project and the cluster name were swapped. ::facepalm::
After I fixed that, everything works as expected.
I will close this, but it would be really cool if there were a way for the application to tell the difference between incorrect permissions and incorrect configuration. Failing that, it might be good to have a debug FAQ that addresses the fact that an IAM misconfiguration looks identical to a typo in the flags.
Thanks!
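Until the sidecar can tell the two cases apart itself, a swapped project and cluster name can be caught before deploying with a quick preflight. This is only a sketch, assuming `gcloud` is installed and authenticated as a user who can view the project. `preflight` is a hypothetical helper, not part of the sidecar.

```shell
# preflight PROJECT CLUSTER
# Checks that PROJECT exists and that it contains a cluster named CLUSTER.
# If both checks pass, a remaining "Unauthenticated" error is more likely
# an IAM problem than a typo in the configuration.
preflight() {
  project="$1"
  cluster="$2"
  if ! gcloud projects describe "$project" \
      --format="value(projectId)" >/dev/null 2>&1; then
    echo "config: project '$project' not found or not visible"
    return 1
  fi
  found="$(gcloud container clusters list --project "$project" \
    --filter="name=$cluster" --format="value(name)" 2>/dev/null)" || found=""
  if [ "$found" != "$cluster" ]; then
    echo "config: no cluster '$cluster' in project '$project'"
    return 1
  fi
  echo "config ok: remaining auth errors likely point at IAM"
}
```

Called with the two values in the wrong order, for example `preflight my-cluster my-project`, the first check fails because no project has the cluster's name.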
Reference: GKE Workload Identity
export GCP_PROJECT=my-project
export GCP_SA=gke-prometheus
export K8S_SA=prometheus
export K8S_NS=prometheus
gcloud iam service-accounts create ${GCP_SA} --display-name=${GCP_SA}
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${GCP_PROJECT}.svc.id.goog[${K8S_NS}/${K8S_SA}]" \
  ${GCP_SA}@${GCP_PROJECT}.iam.gserviceaccount.com
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member "serviceAccount:${GCP_SA}@${GCP_PROJECT}.iam.gserviceaccount.com" \
  --role roles/monitoring.metricWriter
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member "serviceAccount:${GCP_SA}@${GCP_PROJECT}.iam.gserviceaccount.com" \
  --role roles/monitoring.viewer
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member "serviceAccount:${GCP_SA}@${GCP_PROJECT}.iam.gserviceaccount.com" \
  --role roles/logging.logWriter
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member "serviceAccount:${GCP_SA}@${GCP_PROJECT}.iam.gserviceaccount.com" \
  --role roles/stackdriver.resourceMetadata.writer
kubectl annotate serviceaccount ${K8S_SA} \
  iam.gke.io/gcp-service-account="${GCP_SA}@${GCP_PROJECT}.iam.gserviceaccount.com" \
  -n ${K8S_NS}
I followed these steps to set up my Prometheus + Stackdriver stack.
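To confirm that the steps above took effect, the annotation and the Workload Identity binding can be read back and compared with the expected values. This is a rough sketch that reuses the variable names from the commands above. The helper functions (`gsa_email`, `wi_member`, `verify_annotation`, `verify_binding`) are hypothetical names, and the defaults are only the placeholder values from this thread.

```shell
# Defaults mirror the placeholder values used above; override as needed.
GCP_PROJECT="${GCP_PROJECT:-my-project}"
GCP_SA="${GCP_SA:-gke-prometheus}"
K8S_SA="${K8S_SA:-prometheus}"
K8S_NS="${K8S_NS:-prometheus}"

# Email of the Google service account created above.
gsa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$GCP_SA" "$GCP_PROJECT"
}

# Workload Identity member string for the Kubernetes service account.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$GCP_PROJECT" "$K8S_NS" "$K8S_SA"
}

# Check that the Kubernetes service account carries the expected annotation.
verify_annotation() {
  actual="$(kubectl get serviceaccount "$K8S_SA" -n "$K8S_NS" \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')" || return 2
  if [ "$actual" = "$(gsa_email)" ]; then
    echo "annotation ok"
  else
    echo "annotation is '$actual', expected '$(gsa_email)'"
    return 1
  fi
}

# Check that the Google service account grants workloadIdentityUser
# to the Kubernetes service account. Exit status 0 means the binding exists.
verify_binding() {
  gcloud iam service-accounts get-iam-policy "$(gsa_email)" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/iam.workloadIdentityUser" \
    --format="value(bindings.members)" | grep -qxF "$(wi_member)"
}
```

Running `verify_annotation && verify_binding && echo "workload identity looks wired up"` after the setup gives a quick signal before looking at the sidecar logs.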