kubernetes-sigs / custom-metrics-apiserver

Framework for implementing custom metrics support for Kubernetes
Apache License 2.0

Should we rethink the response of listing requests? #150

Closed JorTurFer closed 1 year ago

JorTurFer commented 1 year ago

Hi to all,

As you know, we recently faced an issue produced by the changes in kube-client. Basically, the issue was related to returning an empty list, because an empty list was treated as an error.

This has pushed me to think about the current situation and how we expose the custom-metrics information. I'd like to reopen the discussion about what we return during listing operations and whether we really need to return the current metrics. Currently, we expose custom/external metrics by listing all of them or by name. I know that this is the general approach in Kubernetes for aggregated APIs, but I'm not sure it makes sense here (most probably due to my knowledge gap). Other aggregated APIs like metrics.k8s.io support the same actions as us, but in that case it makes sense because the metrics are available in every namespace/node. For example, if I query

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/*/pods" | jq

I will get a list of all the available values in the namespace, and that makes sense because eventually there are pod metrics in every namespace; a pod metric is generic and is an abstraction that we can query in every namespace. My doubts appear with custom/external metrics. For example, in KEDA we list all the metrics and expose them during listing requests. The problem is that each query is totally tied to a specific metric/namespace, so I can't query

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/*" | jq

This is why I'm not sure about the current approach. We are returning a list of metrics that can't be used for discovering anything, because we return only the metric name generated from the specific info, not the namespace. Basically, we are returning a useless list: nothing can be done with it apart from knowing the whole list of metrics. This listing action can also produce a timeout if the metric list is too long, printing an error without any useful information for end users. So we have an endpoint which doesn't offer much information, which is queried by the cluster/users really often, and which can easily produce timeouts in some situations. Another important point is that the HPA controller doesn't use this listing endpoint at all; it uses the get endpoint directly, providing the namespace and metric name from the HPA.
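
For reference, these are the two kinds of requests involved; the namespace and metric name in the second command are hypothetical placeholders:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq   # discovery/listing request
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/my-namespace/my-metric" | jq   # direct get, as the HPA controller does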

Based on the reasons above, I think that following the same approach as metrics.k8s.io could be worthwhile. My proposal is:

Following this approach, we would be safe from changes in discovery processes and would also avoid timeouts when listing all the available metrics. I'm not totally sure what changes would be required, nor whether they have to be done in this repo or directly in the implementations, but I wanted to share my thoughts about this.

BTW, I'm willing to contribute the changes if we agree to change something.

JorTurFer commented 1 year ago

@olivierlemasle @dgrisonnet FYI

dgrisonnet commented 1 year ago

/assign
/triage accepted

JorTurFer commented 1 year ago

Hey @dgrisonnet, could you share your thoughts about this?

dgrisonnet commented 1 year ago

Hey @JorTurFer, sorry for the late reply. I need to spend some time researching this particular topic as I am not familiar with all the details. I'll try to look into it in the upcoming weeks.

JorTurFer commented 1 year ago

No worries, I just asked to keep the topic alive 😄 No rush at all

rodrigc commented 1 year ago

@JorTurFer did you conclude what the best workaround or fix for this problem is?

With kubectl 1.26, I started getting these warnings for every kubectl command:

E0418 08:02:11.707022    2989 memcache.go:287] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 08:02:11.786098    2989 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 08:02:11.871669    2989 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 08:02:11.965455    2989 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

Downgrading to kubectl 1.25.2 eliminated these warnings for me.

I am running k8s.gcr.io/metrics-server/metrics-server:v0.6.2 in an EKS cluster v1.24.10-eks-48e63af

dgrisonnet commented 1 year ago

Metrics-server doesn't expose external metrics so that is a pretty odd error to see. What request did you make with kubectl?

rodrigc commented 1 year ago

@dgrisonnet I did not do anything fancy, I just did:

kubectl get pods --all-namespaces

Pretty much any k8s command gives the same warning to stdout:

E0418 08:02:11.707022    2989 memcache.go:287] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 08:02:11.786098    2989 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 08:02:11.871669    2989 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 08:02:11.965455    2989 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

I'm encountering the same issue reported here: https://github.com/kubernetes-sigs/metrics-server/issues/157

I'm not sure why that issue was closed, or what the resolution to that issue should be.

Downgrading kubectl to 1.25.2 eliminated the warnings for me. I don't understand all the pieces involved well enough to identify the root cause of this problem.

sansmoraxz commented 1 year ago

It's not metrics-server that's complaining; KEDA does this when there's no ScaledObject or ScaledJob defined.

rodrigc commented 1 year ago

This comment mentions a band-aid solution when running on AWS EKS, which is enabling hostNetwork: true in the Helm chart for metrics-server: https://github.com/kubernetes-sigs/metrics-server/issues/157#issuecomment-544875105

I didn't try that because I didn't know if that would break anything else.

It would be nice if I could better understand the root of the problem to know in which direction to go to fix this.
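
For reference, a rough sketch of what that band-aid would look like if applied directly to the Deployment, assuming metrics-server runs as a Deployment named metrics-server in kube-system (adjust names/namespace, and prefer the chart's own values if it exposes this setting):

kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/hostNetwork", "value": true}]'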

sansmoraxz commented 1 year ago

What I did was create a dummy Deployment with a cron-trigger ScaledObject and set desiredReplicas to 0. It's not a fix, but at least kubectl stops logging those warnings.
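
A minimal sketch of that workaround, with all names and the cron window as hypothetical placeholders; it assumes an existing Deployment called dummy-deployment, and since some KEDA versions expect the cron scaler's desiredReplicas to be above zero, it uses 1 together with minReplicaCount: 0 instead of the literal 0 mentioned above:

kubectl apply -f - <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: dummy-cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: dummy-deployment      # pre-existing dummy Deployment
  minReplicaCount: 0            # stay at zero replicas outside the cron window
  triggers:
    - type: cron
      metadata:
        timezone: Etc/UTC
        start: 0 0 1 1 *        # one-minute window once a year
        end: 1 0 1 1 *
        desiredReplicas: "1"
EOF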

sansmoraxz commented 1 year ago

Also, probably check your keda-operator logs. Maybe something is broken in it.

rodrigc commented 1 year ago

@sansmoraxz Wow, that's a weird workaround. Do you have a pointer on GitHub to an example of what you did? It would be nice to have a proper fix so that weird workarounds like this are not required...

dgrisonnet commented 1 year ago

I'm encountering the same issue reported here: https://github.com/kubernetes-sigs/metrics-server/issues/157

That issue doesn't report a problem with the same API. You might still have an APIService resource around trying to serve external.metrics.k8s.io/v1beta1, but that is not needed by metrics-server, so you should be able to remove it.
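
A sketch of how to inspect and, if nothing is actually meant to serve it, remove such a leftover APIService; the resource name below assumes the v1beta1 external metrics API:

kubectl get apiservice v1beta1.external.metrics.k8s.io -o yaml
kubectl delete apiservice v1beta1.external.metrics.k8s.io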

Either way, since this seems to be an error coming from KEDA, I'd recommend asking for help in one of their channels, because this is unrelated to the improvement discussed in this issue.

rodrigc commented 1 year ago

@sansmoraxz as you suggested:

 kubectl get apiservice
E0418 13:47:31.183042   28041 memcache.go:287] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 13:47:31.261110   28041 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 13:47:31.344404   28041 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0418 13:47:31.425712   28041 memcache.go:121] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME                                        SERVICE                                AVAILABLE                      AGE
v1.                                         Local                                  True                           102d
v1.acme.cert-manager.io                     Local                                  True                           23d
v1.admissionregistration.k8s.io             Local                                  True                           102d
v1.apiextensions.k8s.io                     Local                                  True                           102d
v1.apps                                     Local                                  True                           102d
v1.authentication.k8s.io                    Local                                  True                           102d
v1.authorization.k8s.io                     Local                                  True                           102d
v1.autoscaling                              Local                                  True                           102d
v1.batch                                    Local                                  True                           102d
v1.cert-manager.io                          Local                                  True                           23d
v1.certificates.k8s.io                      Local                                  True                           102d
v1.coordination.k8s.io                      Local                                  True                           102d
v1.discovery.k8s.io                         Local                                  True                           102d
v1.events.k8s.io                            Local                                  True                           102d
v1.networking.k8s.io                        Local                                  True                           102d
v1.node.k8s.io                              Local                                  True                           102d
v1.pg.percona.com                           Local                                  True                           23d
v1.policy                                   Local                                  True                           102d
v1.rbac.authorization.k8s.io                Local                                  True                           102d
v1.scheduling.k8s.io                        Local                                  True                           102d
v1.storage.k8s.io                           Local                                  True                           102d
v1alpha1.argoproj.io                        Local                                  True                           23d
v1alpha1.crd.k8s.amazonaws.com              Local                                  True                           23d
v1alpha1.elbv2.k8s.aws                      Local                                  True                           23d
v1alpha1.external-secrets.io                Local                                  True                           23d
v1alpha1.generators.external-secrets.io     Local                                  True                           23d
v1alpha1.keda.sh                            Local                                  True                           4d1h
v1alpha1.opentelemetry.io                   Local                                  True                           23d
v1beta1.batch                               Local                                  True                           102d
v1beta1.discovery.k8s.io                    Local                                  True                           102d
v1beta1.elbv2.k8s.aws                       Local                                  True                           23d
v1beta1.events.k8s.io                       Local                                  True                           102d
v1beta1.external-secrets.io                 Local                                  True                           23d
v1beta1.external.metrics.k8s.io             keda/keda-operator-metrics-apiserver   False (FailedDiscoveryCheck)   4d1h
v1beta1.flowcontrol.apiserver.k8s.io        Local                                  True                           102d
v1beta1.metrics.k8s.io                      argocd/metrics-server                  True                           18h
v1beta1.node.k8s.io                         Local                                  True                           102d
v1beta1.policy                              Local                                  True                           102d
v1beta1.postgres-operator.crunchydata.com   Local                                  True                           23d
v1beta1.storage.k8s.io                      Local                                  True                           102d
v1beta1.vpcresources.k8s.aws                Local                                  True                           33d
v1beta2.flowcontrol.apiserver.k8s.io        Local                                  True                           102d
v2.autoscaling                              Local                                  True                           102d
v2beta1.autoscaling                         Local                                  True                           102d
v2beta1.pg.percona.com                      Local                                  True                           23d
v2beta2.autoscaling                         Local                                  True                           102d

So, as @dgrisonnet mentioned, this looks like a problem with KEDA. I will need to investigate further, since I'm not sure how this was installed; I am also installing metrics-server via the Helm chart in the argocd namespace. Sorry for the noise in this thread.

JorTurFer commented 1 year ago

@dgrisonnet, this is related to not having any ScaledObject (so no metric is exposed) and to the problem we faced with the changes in kube-client related to the discovery client. The problem was already fixed upstream and it'll be propagated in the next versions, but it isn't something that we can fix in KEDA. In fact, this issue refers to that other issue and aims to solve it by always exposing a fixed set of metrics, regardless of the number of metrics.

JorTurFer commented 1 year ago

Btw, did you have time to think about this? Are you at KubeCon? Maybe we could talk about this face to face :)

dgrisonnet commented 1 year ago

@dgrisonnet, this is related to not having any ScaledObject (so no metric is exposed) and to the problem we faced with the changes in kube-client related to the discovery client. The problem was already fixed upstream and it'll be propagated in the next versions, but it isn't something that we can fix in KEDA.

Yeah, that was also my guess, but since this issue in custom-metrics-apiserver didn't seem to be the right place to handle this support case, I was trying to reroute them to the correct channels.

Btw, did you have time to think about this?

Not yet, sadly :/ I don't know what potential impact that change could have on the apiserver/autoscaler. My main concern right now is that I was thinking about leveraging this exposition behavior and discovery in order to implement https://github.com/kubernetes-sigs/custom-metrics-apiserver/issues/70 without having to introduce a complex configuration. My idea was that we could just have users create a priority list for the metrics servers and, by discovering the metrics ahead of time, route the API calls properly without needing any complex configuration.

Are you in KubeCon? Maybe we could talk about this face to face :)

I am; let's try to schedule something offline. I would be happy to talk about that in person :)

JorTurFer commented 1 year ago

We have been talking about this offline, and we agree on the next steps: we will explore a fixed list for custom/external metrics directly here (not requiring it in the interface) and check how it works with external metrics in KEDA (I'll do this); then @dgrisonnet will do the same with custom metrics in prometheus-adapter. If everything works as expected, custom-metrics-apiserver will no longer require the list of custom/external metrics and will expose a single resource for each API all the time.

simingweng commented 1 year ago

It seems to have been fixed in client-go

JorTurFer commented 1 year ago

It seems to have been fixed in client-go

Yeah, the problem was solved in the Kubernetes repo, but we have discussed not treating metrics as resources. The ongoing PoC isn't only for fixing that problem; it's also meant to fix other problems related to huge metric lists.