rbarberop opened this issue 1 year ago
Looks like I have the same issue:
E1205 21:46:49.429171 1 timeout.go:135] post-timeout activity - time-elapsed: 3.304368ms, GET "/apis/custom.metrics.k8s.io/v1beta2" result: <nil>
E1205 21:46:49.523578 1 wrap.go:54] timeout or abort while handling: method=GET URI="/apis/custom.metrics.k8s.io/v1beta2" audit-ID="71f72960-a980-4afb-ad1d-03e1b6bec66f"
E1205 21:46:49.523649 1 writers.go:117] apiserver was unable to write a JSON response: http2: stream closed
E1205 21:46:49.523716 1 wrap.go:54] timeout or abort while handling: method=GET URI="/apis/custom.metrics.k8s.io/v1beta2" audit-ID="79f73bf9-dc54-4676-936a-aa819a77194e"
E1205 21:46:49.525281 1 writers.go:111] apiserver was unable to close cleanly the response writer: http: Handler timeout
E1205 21:46:49.525327 1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http2: stream closed"}: http2: stream closed
E1205 21:46:49.526374 1 writers.go:130] apiserver was unable to write a fallback JSON response: http2: stream closed
E1205 21:46:49.528673 1 timeout.go:135] post-timeout activity - time-elapsed: 4.796544ms, GET "/apis/custom.metrics.k8s.io/v1beta2" result: <nil>
I have the same issue
post-timeout activity - time-elapsed: 109.781784ms, GET "/apis/custom.metrics.k8s.io/v1beta1" result: <nil>
I have a high volume of the same / similar errors. That said, the adapter itself works, AFAIK.
Same issue, also I had to add this resource to my cluster in order to get it to startup
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-extension-apiserver-authentication-reader
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
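For anyone applying the RoleBinding above, a quick way to apply and verify it (the filename here is hypothetical; these commands need access to a live cluster):

```shell
# Apply the RoleBinding (save the manifest above as rolebinding.yaml first)
kubectl apply -f rolebinding.yaml

# Verify it exists in kube-system and points at the adapter's ServiceAccount
kubectl -n kube-system get rolebinding \
  custom-metrics-extension-apiserver-authentication-reader -o yaml
```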
I tried this but nothing changed. When you "get it to startup", what exactly does "it" refer to? The stackdriver adapter pod? In my case the pod is running, but it has loads of log entries like:
E0208 17:56:12.201768 1 provider.go:320] Failed request to stackdriver api: googleapi: Error 403: Permission monitoring.metricDescriptors.list denied (or the resource may not exist)., forbidden
Maybe it's a different issue?
@PaulRudin just spitballing here, but that error looks like your cluster's service account doesn't have permission to call the GCP monitoring API.
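If you want to confirm that theory, one way is to list the IAM roles granted to the service account the adapter runs as. A sketch; PROJECT_ID and SA_EMAIL are placeholders you would substitute with your project ID and the service account's email:

```shell
# List IAM roles bound to a given service account in the project.
# The adapter needs at least roles/monitoring.viewer to list metric descriptors.
gcloud projects get-iam-policy PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:SA_EMAIL" \
  --format="table(bindings.role)"
```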
Yeah, but nothing has changed recently as far as I know. But maybe I've inadvertently modified something when changing something unrelated.
Hi!
I have exactly the same issue, and it appeared suddenly around the same time.
I also suspected a permission or scope issue, but the node where the adapter is running has the cloud-platform scope and the service account has the Monitoring Viewer role.
I also think it is not a permission or scope issue, since we are seeing timeout errors.
I am still investigating. Maybe something has changed on the GCP side.
OK - so in my case I had inadvertently changed the service account, so the permission denied problem has been fixed. But I do still see messages similar to those reported by others:
E0209 10:59:56.474287 1 timeout.go:135] post-timeout activity - time-elapsed: 15.70529ms, GET "/apis/custom.metrics.k8s.io/v1beta1" result: <nil>
E0209 10:59:56.554654 1 writers.go:111] apiserver was unable to close cleanly the response writer: http: Handler timeout
E0209 10:59:56.555141 1 writers.go:130] apiserver was unable to write a fallback JSON response: http2: stream closed
E0209 10:59:56.556250 1 writers.go:130] apiserver was unable to write a fallback JSON response: http: Handler timeout
E0209 10:59:56.558605 1 timeout.go:135] post-timeout activity - time-elapsed: 93.570353ms, GET "/apis/custom.metrics.k8s.io/v1beta1" result: <nil>
E0209 10:59:56.559753 1 timeout.go:135] post-timeout activity - time-elapsed: 102.936145ms, GET "/apis/custom.metrics.k8s.io/v1beta1" result: <nil>
E0209 10:59:56.560900 1 timeout.go:135] post-timeout activity - time-elapsed: 102.266556ms, GET "/apis/custom.metrics.k8s.io/v1beta2" result: <nil>
Sorry to slightly hijack, but I'm also curious what folks using Workload Identity do: do you create a GCP service account for the custom metrics adapter and bind it to the Kubernetes service account? I assume the cluster role bindings don't provide Google API-level access, and with Workload Identity I don't think the pod will implicitly have the credentials of the node pool's service account either.
https://github.com/GoogleCloudPlatform/k8s-stackdriver/issues/315 I guess kind of covers this
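For what it's worth, the usual Workload Identity setup does involve a dedicated GCP service account bound to the adapter's Kubernetes service account. A sketch, with PROJECT_ID and the GSA name (custom-metrics-adapter) as placeholders you would substitute:

```shell
# Allow the adapter's Kubernetes SA to impersonate the GCP SA
gcloud iam service-accounts add-iam-policy-binding \
  custom-metrics-adapter@PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[custom-metrics/custom-metrics-stackdriver-adapter]"

# Annotate the Kubernetes SA so GKE injects the GCP SA's credentials
kubectl annotate serviceaccount custom-metrics-stackdriver-adapter \
  --namespace custom-metrics \
  iam.gke.io/gcp-service-account=custom-metrics-adapter@PROJECT_ID.iam.gserviceaccount.com
```

The GCP service account then needs the Monitoring Viewer role for the adapter to read metrics.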
We're seeing the same error messages but the adapter appears to be functional. It would be nice to understand what the errors mean and what changes we need to make to reduce the noise.
@sosimon how do you test that it is functional? Are you reading a metric with kubectl?
@matiasah I think our HPA is working. Some of the logs shown here are the same as the ones in https://github.com/GoogleCloudPlatform/k8s-stackdriver/issues/510. Not the auth errors, though; those need to be resolved by granting the service account the right permissions.
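One direct way to check that the adapter is serving metrics, independent of the HPA (METRIC_NAME and the default namespace are placeholders; this needs access to a live cluster):

```shell
# Confirm the adapter answers on the custom metrics API at all
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"

# Fetch a specific metric for pods in a namespace
# (substitute METRIC_NAME with a metric the adapter actually exposes)
kubectl get --raw \
  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/METRIC_NAME"
```

If the first call returns a resource list rather than an error, the APIService registration and the adapter itself are at least reachable.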
Hi, I have created a new GCP project and a GKE cluster inside it.
I've followed the instructions in the README...
However, Logs Explorer is complaining about it... it looks like an authentication problem?