@olivierlemasle @dgrisonnet FYI
Agreed. We run Keda 2.7.1 and prometheus-adapter on our clusters and this is what a Helm upgrade command looks like while it's running:
Developers freak out because they think something is hosed... not a very good user experience.
> Is this approach correct? I mean, should we return the metrics instead of the metric type (like metrics.k8s.io does)? Exposing all the available metrics as part of listing requests can produce timeouts if the number of available metrics is huge, and can produce this annoying (and unrelated) warning when there are no metrics available for scaling.
This is the approach we are taking everywhere else for custom and external metrics so KEDA is definitely doing things properly.
That said, I am unsure this is the correct approach from a Kubernetes standpoint. AFAIK we are the only ones doing something like that in the Kubernetes ecosystem. Aggregated APIs usually have a known list of Resources they want to expose that rarely changes, and it never changes dynamically (e.g. PodMetrics/NodeMetrics). But in our case we are exposing metrics as if they were Kubernetes types, and we are doing that dynamically, meaning that at runtime our API may grow without having to restart the aggregated apiserver. We are essentially doing CRDs the wrong way.
In the past I questioned that approach when investigating https://github.com/kubernetes-sigs/prometheus-adapter/issues/292, which is a bug about the infamous spammy log line:
apiserver was unable to write a JSON response: http2: stream closed
My findings were that once the API exposed around 34 Resources, this error started to appear. I never found the time to investigate further what the implications of that were, but it definitely didn't sound right.
--
To answer the question of whether we could just expose the ExternalMetricValue type instead of the list of all external metrics, the answer is no, not in the current state of things. We currently rely on the fact that the metric name is passed in the URL to determine which metric the HPA is trying to get.
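To illustrate, this is roughly the shape of the provider side (an illustrative sketch based on the ExternalMetricsProvider interface from sigs.k8s.io/custom-metrics-apiserver; the exact signatures may differ slightly between releases, and myProvider is a made-up example, not the real KEDA provider):

```go
package sketch

import (
	"context"
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/metrics/pkg/apis/external_metrics"
	"sigs.k8s.io/custom-metrics-apiserver/pkg/provider"
)

// myProvider is a made-up ExternalMetricsProvider implementation, keyed by
// metric name only to keep the sketch short.
type myProvider struct {
	values map[string]external_metrics.ExternalMetricValue
}

// GetExternalMetric handles requests to
// /apis/external.metrics.k8s.io/v1beta1/namespaces/<ns>/<metric-name>.
// The adapter library extracts <metric-name> from the URL and hands it over in
// info.Metric; that name is the only thing telling the provider which metric
// the HPA is asking for.
func (p *myProvider) GetExternalMetric(ctx context.Context, namespace string, metricSelector labels.Selector, info provider.ExternalMetricInfo) (*external_metrics.ExternalMetricValueList, error) {
	value, ok := p.values[info.Metric]
	if !ok {
		return nil, fmt.Errorf("external metric %q not found", info.Metric)
	}
	return &external_metrics.ExternalMetricValueList{
		Items: []external_metrics.ExternalMetricValue{value},
	}, nil
}

// ListAllExternalMetrics backs the listing/discovery path discussed in this
// issue: today it returns one entry per currently registered metric.
func (p *myProvider) ListAllExternalMetrics() []provider.ExternalMetricInfo {
	infos := make([]provider.ExternalMetricInfo, 0, len(p.values))
	for name := range p.values {
		infos = append(infos, provider.ExternalMetricInfo{Metric: name})
	}
	return infos
}
```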
I can hardly see a way around that with the way aggregated APIs work today, but there is definitely something wrong with the current approach.
I will investigate what can be done to improve the situation, but maybe we could get a first quick win by getting the logs silenced for our APIs or moved to a higher verbosity level.
Yes, but I'm talking only about using a fixed value for listing operations. I have done this draft based on another contributor's PR. If you look at the changes, basically we always return a fixed slice with the same content. This solves the timeout issue and also the zero-length issue, and it works, at least in KEDA (I have tested the change with e2e tests):
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[{"name":"scaledobjects","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]}]}
The routing based on the metric name hasn't changed, just the response of ListAllExternalMetrics
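Conceptually the draft boils down to something like this (a simplified sketch reusing the made-up provider type from the snippet above, not the exact patch):

```go
// ListAllExternalMetrics now always returns the same fixed, single entry instead
// of one entry per registered metric, so the discovery document stays small and
// is never empty. GetExternalMetric keeps routing by the metric name taken from
// the request URL, so serving individual metrics is unaffected.
func (p *myProvider) ListAllExternalMetrics() []provider.ExternalMetricInfo {
	return []provider.ExternalMetricInfo{
		{Metric: "scaledobjects"},
	}
}
```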
I guess it makes sense that it works, because the HPA controller most likely doesn't care at all about discovery. I would expect it to just send a request to the endpoint of the metric the HPA targets.
I wonder if that change would be possible to make here. I have no clue if some users actually rely on the discovery endpoint.
I revived a related Kubernetes issue and pinged some people; let's hear what they say, maybe we can find a solution that would not break any behavior.
> I revived a related Kubernetes issue and pinged some people; let's hear what they say, maybe we can find a solution that would not break any behavior.
Sure, I think this topic is too complex to solve in only a few minutes, as it is part of discovery. Let's hear other opinions to solve this properly.
> I have no clue if some users actually rely on the discovery endpoint.
I don't think so, as the endpoint itself doesn't support listing; I mean, the only allowed verb is get. AFAIK, there isn't any way to introspect the resources just using the namespaces, as the list verb isn't allowed, just get.
BTW, thanks for your help 🙇
There is a specific /apis endpoint for discovery that is available to any clients with the right certificates.
Do you mean executing kubectl get --raw "/apis/external.metrics.k8s.io"?
I can't see any /apis inside external.metrics.k8s.io. I meant that even if you discover external.metrics.k8s.io from /apis, the response of /apis/external.metrics.k8s.io/v1beta1 is a list of available metrics with the supported verbs, and the only supported verb is get, so the discovery ends there: you can't do kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/xxxxxx" to list the metrics available in the namespace, like you can do with pods/nodes. For example, in an arbitrary namespace:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/keda/random-metric-name"
Error from server (NotFound): the server could not find the requested resource
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/keda/pods"
{"kind":"PodMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{},"items":[{"metadata":{"name":"keda-admission-5ff995d9b5-t7ss8","namespace":"keda","creationTimestamp":"2023-01-25T13:20:10Z","labels":{"app":"keda-admission-webhooks","name":"keda-admission-webhooks","pod-template-hash":"5ff995d9b5"}},"timestamp":"2023-01-25T13:19:28Z","window":"1m0.394s","containers":[{"name":"keda-admission-webhooks","usage":{"cpu":"173025n","memory":"16964Ki"}}]},{"metadata":{"name":"keda-metrics-apiserver-8c6db6798-fp7kc","namespace":"keda","creationTimestamp":"2023-01-25T13:20:10Z","labels":{"app":"keda-metrics-apiserver","pod-template-hash":"8c6db6798"}},"timestamp":"2023-01-25T13:19:29Z","window":"1m3.599s","containers":[{"name":"keda-metrics-apiserver","usage":{"cpu":"3448706n","memory":"43872Ki"}}]},{"metadata":{"name":"keda-operator-545fbd6565-kwxfp","namespace":"keda","creationTimestamp":"2023-01-25T13:20:10Z","labels":{"app":"keda-operator","name":"keda-operator","pod-template-hash":"545fbd6565"}},"timestamp":"2023-01-25T13:19:34Z","window":"1m12.775s","containers":[{"name":"keda-operator","usage":{"cpu":"2060106n","memory":"54264Ki"}}]}]}
This is what I meant by:
> I don't think so, as the endpoint itself doesn't support listing; I mean, the only allowed verb is get. AFAIK, there isn't any way to introspect the resources just using the namespaces, as the list verb isn't allowed, just get.
The metrics we are registering aren't available in all namespaces.
/triage accepted
/assign @dgrisonnet
The same thing happens to me, just running a simple command...
I think we need to find a solution soon, because the amount of noise will increase with time; helm is already affected in the latest version because they have bumped their deps (https://github.com/helm/helm/issues/11772).
I'm seeing this with plain kubectl too.
We're also seeing it with any version of kubectl greater than v1.24 in clusters where keda is installed.
Independently of the action taken upstream, what do you think about exposing a fixed metric based on metric types instead of dynamically calculating the available metrics, @dgrisonnet @logicalhan?
This could also solve the timeout problem when there are multiple metrics.
Sorry for being a pain, but we will release KEDA v2.10 and we would like to patch this somehow. Currently we have 2 options:
Could you suggest something to us, @dgrisonnet @logicalhan?
Can we keep this ticket open until the fix is deployed and in working condition?
Recent changes in https://github.com/kubernetes/client-go have changed the behavior of the discovery client. These changes were released as part of v0.26 and produced a scenario where tooling like kubectl or helm shows a warning during resource discovery:
As a first action, I reviewed whether we are doing something wrong in KEDA, but after debugging helm and kubectl I found that the cause isn't related to KEDA itself (prometheus-adapter produces the same error, just about custom.metrics). After the changes, the discovery client lists all the API groups and also all the resources inside each group, so a fresh installation of KEDA/prometheus-adapter without any metric exposed automatically shows this warning on every run of the tooling, which is annoying (and looks hideous).
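To make the mechanism concrete, this is roughly how tooling built on client-go walks discovery (an illustrative sketch only, not helm or kubectl source; it assumes a kubeconfig in the default location):

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the kubeconfig the way CLI tooling typically does.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}
	// ServerGroupsAndResources fetches every API group (including aggregated
	// ones such as external.metrics.k8s.io) and the resource list of each
	// group/version. If an aggregated API answers with an empty or failing
	// resource list, that surfaces as an aggregated error here, which is the
	// point where kubectl/helm print their warning.
	groups, resources, err := dc.ServerGroupsAndResources()
	if err != nil {
		fmt.Println("partial discovery failure:", err)
	}
	fmt.Printf("discovered %d groups and %d group/version resource lists\n", len(groups), len(resources))
}
```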
In KEDA, we return the currently available metrics in response to listing requests, and I have seen that other tools like prometheus-adapter do the same for external and custom metrics.
Is this approach correct? I mean, should we return the metrics instead of the metric type (like metrics.k8s.io does)? Exposing all the available metrics as part of listing requests can produce timeouts if the number of available metrics is huge, and can produce this annoying (and unrelated) warning when there are no metrics available for scaling. I can see that this approach is also used in the example code, but if this is how we should expose metrics, the discovery client should take that into account and ignore those errors.
Could you provide any help/guidance on this topic? We have received some notifications about this warning in the KEDA project and users think it's caused by a KEDA problem, but we followed the available example to develop our metrics server.
Is it correct, from a custom-metrics-apiserver usage point of view, if we return a fixed collection of generic types (in our case, scaledobjects) like the metrics.k8s.io API does (with 'pods' and 'nodes')? This change would solve the timeouts in huge clusters and also the issue when no metric is exposed at all. Considering that these metrics aren't browsable the way metrics.k8s.io ones are, maybe this could be acceptable from the custom-metrics-apiserver implementation point of view.