kedacore / keda

KEDA is a Kubernetes-based Event-Driven Autoscaling component. It provides event-driven scaling for any container running in Kubernetes.
https://keda.sh
Apache License 2.0

KEDA might break existing deployment on cluster which already has another External Metrics Adapter installed #470

Open zroubalik opened 4 years ago

zroubalik commented 4 years ago

KEDA is using a metrics adapter based on the custom-metrics-apiserver library. As part of the deployment, users need to specify a cluster-wide APIService object named v1beta1.external.metrics.k8s.io; see the library example and the KEDA deployment.

I wonder what would happen if a user has already deployed another Metrics Adapter (one using the same APIService-based approach) and we then try to install KEDA. It will probably replace the original APIService definition with KEDA's, so KEDA will work, but the other adapter already installed on the cluster probably won't. We should not break things, or at least we should make it clear that this could happen.

We should investigate what the possibilities are and whether there is a better solution for dealing with the metrics. Or my assumptions are wrong, in which case please correct me.
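
For context, the cluster-wide object in question looks roughly like the sketch below (the Service name, namespace and port assume a default KEDA install; exact values and TLS settings depend on the deployment manifests):

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  version: v1beta1
  service:
    name: keda-operator-metrics-apiserver   # the KEDA metrics adapter Service (assumed default name)
    namespace: keda
    port: 443
  groupPriorityMinimum: 100                  # priorities are illustrative
  versionPriority: 100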

zroubalik commented 4 years ago

This should probably have the 'needs-discussion' label; I am not able to assign it.

jeffhollan commented 4 years ago

@zroubalik I'll make sure you get label permissions :). Adding needs-discussion and help-wanted in case someone gets a chance to validate whether it is a bug.

Aarthisk commented 4 years ago

@jeffhollan I am pretty sure this is by design right now. I will take a look at what options are available to chain metric servers.

zroubalik commented 4 years ago

@Aarthisk I am planning to look at other options as well.

markusthoemmes commented 4 years ago

This is a known limitation of the custom/external metrics API. A possible solution is to come up with an aggregation API as per https://github.com/kubernetes-incubator/custom-metrics-apiserver/issues/3. Knative's HPA support suffers from the same limitation.

zroubalik commented 4 years ago

@markusthoemmes thanks for the info

tomkerkhove commented 4 years ago

What's the status here? I'm not sure whether we can do anything about this.

zroubalik commented 4 years ago

We should keep it open until it gets resolved in https://github.com/kubernetes-sigs/custom-metrics-apiserver/

v-yarotsky commented 3 years ago

Ran into this because the Datadog Helm chart also creates an APIService object with the same name, v1beta1.external.metrics.k8s.io.

hinling-sonder commented 3 years ago

We ran into the same situation as @v-yarotsky mentioned. We have datadog installed with datadog-cluster-agent-metrics-api:

➜  ~ kubectl get apiservice | grep external.metrics                                                                                                              
v1beta1.external.metrics.k8s.io             default/datadog-cluster-agent-metrics-api

It does not overwrite or break the existing v1beta1.external.metrics.k8s.io APIService. Helm just won't install KEDA and complains:

Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. Unable to continue with update: APIService "v1beta1.external.metrics.k8s.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "keda": current value is "datadog"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "keda": current value is "default"
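
For anyone hitting the same error, a quick read-only check (plain kubectl, nothing KEDA- or Datadog-specific) shows which Helm release currently owns the APIService and which backend Service it routes to:

kubectl get apiservice v1beta1.external.metrics.k8s.io -o yaml
# metadata.annotations carries the Helm ownership (meta.helm.sh/release-name, meta.helm.sh/release-namespace)
# spec.service points at the single Service that answers all external metrics queries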

tomkerkhove commented 3 years ago

That issue might be related to https://github.com/kedacore/charts/issues/88

zroubalik commented 3 years ago

@tomkerkhove unfortunately it is not. Looking at that issue and the last error message from the screenshot, it is missing some helm labels, so the validation is failing.

hinling-sonder commented 3 years ago

I did naively try to remove the 24-metrics-apiservice.yaml chart template to work around the problem, but of course then you run into another problem: v1beta1.external.metrics.k8s.io routes traffic to default/datadog-cluster-agent-metrics-api instead of keda-operator-metrics-apiserver, and the HPA fails to retrieve the Redis metrics...

ScalingActive  False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric hinling/redis-mailers/&LabelSelector{MatchLabels:map[string]string{scaledObjectName: redis-scaledobject,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API

I did see that you, @zroubalik, proposed this issue: https://github.com/kubernetes-sigs/custom-metrics-apiserver/issues/70. If you have a branch/fork with this fix, we are more than happy to try it out. We are also happy to help with the implementation.

zroubalik commented 3 years ago

@hinling-sonder it is still just a proposal; I should start working on this in the very near future. But to get it working, a change on the Datadog side will be needed as well.

hbouissoumer commented 3 years ago

Hello all,

We are also experiencing the same issue here. We are using kube-state-metrics as a metrics provider for our scaling strategy, and the two keep overriding the v1beta1.external.metrics.k8s.io APIService; it has to be either one or the other. I would be glad to help if I can!

tomkerkhove commented 3 years ago

We are definitely aware of this and that it's a pain point, sorry! We have a very smart guy on our team who will look into a PoC for contributing this upstream.

NasAmin commented 3 years ago

@tomkerkhove Is there any update on this please? We have the same issue and would really appreciate a solution. Thanks!

pesarkhobeee commented 2 years ago

We are using k8s-cloudwatch-adapter, which is already deprecated, and we are planning a migration to KEDA. As expected, both of them use the same adapter approach; therefore, we would appreciate any solution for a smooth migration.

roni-frantchi commented 2 years ago

I'm wondering - what would happen if this resource were "shared" between KEDA and, say, Datadog?
Is it just a matter of providing a way for KEDA not to install this resource if it already exists and to use the existing one? Or would there be other side effects? @zroubalik?

zroubalik commented 2 years ago

@roni-frantchi the resource needs to implement a specific interface to provide metrics to the k8s API server. And that implementation is specific to KEDA or Datadog or any other tool that would like to provide metrics in this way. Therefore it can't be shared.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

GrigorievNick commented 2 years ago

Maybe I misunderstand something, but why not make the APIService name configurable?

cforce commented 2 years ago

Indeed, that sounds like a solution that could allow running the metrics independently for each of them.

zroubalik commented 2 years ago

The name v1beta1.external.metrics.k8s.io is reserved by Kubernetes, and it is expected that this endpoint provides external metrics. And this is a cluster-wide object, so there can be only one per cluster.

GrigorievNick commented 2 years ago

Hi @zroubalik, thanks for your answer. But can you help me find the source of this information, please?

I checked the https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/ docs about how to add a new k8s extension API, and I didn't find any rules from k8s about the name of the API that provides external metrics. The only thing I see is that there is an already existing HPA controller in k8s, which by default has integration with:

The common use for HorizontalPodAutoscaler is to configure it to fetch metrics from aggregated APIs (metrics.k8s.io, custom.metrics.k8s.io, or external.metrics.k8s.io). The metrics.k8s.io API is usually provided by an add-on named Metrics Server, which needs to be launched separately. For more information about resource metrics, see Metrics Server.

Because it works with the first (default/common) Metrics Server.

Metrics Server collects resource metrics from Kubelets and exposes them in Kubernetes apiserver through Metrics API for use by Horizontal Pod Autoscaler and Vertical Pod Autoscaler. Metrics API can also be accessed by kubectl top, making it easier to debug autoscaling pipelines.

The k8s API itself does not have any restriction, or even a community decision, that every External Metrics Server must be registered under this APIService name.

Summary: So from my understanding, this is reasonable only when you want to create an External Metrics Server that will be compatible with the HPA and VPA, not if you built a completely separate solution with your own controller. And as I understand KEDA is an analog of HPA, you can use your own API name.

P.S. So the only case where KEDA might want to register its own metrics collection under this APIService name is if KEDA wants to make the collected events/metrics usable by other products that are compatible with the default Metrics Server. That would make sense if KEDA were a metrics system like Prometheus, Ganglia, Graphite, Datadog, etc. But KEDA is one more controller. So I think it's more than reasonable for it to have its own APIService endpoint for its custom events, and even the ability to support more than one (external to KEDA) external metrics service endpoint, so it can work with other metrics/event sources.

zroubalik commented 2 years ago

And as I understand KEDA is an analog of HPA, you can use your own API name.

KEDA is internally using the HPA and providing metrics to the HPA through the v1beta1.external.metrics.k8s.io endpoint.
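
A quick way to see this in practice: whichever adapter is registered behind that APIService answers every external-metrics query in the cluster, e.g. (a plain kubectl call against the aggregated API):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
# the response comes from whichever single adapter (KEDA, Datadog, ...) currently
# backs v1beta1.external.metrics.k8s.io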

cforce commented 2 years ago

Is sharing the resource (by not requiring KEDA to deploy it, or skipping it if it is managed by another deployment) an option? The namespace for the metrics should be a parameter as well. If not, then support a custom, dedicated metrics service besides v1beta1.external.metrics.k8s.io.

zroubalik commented 2 years ago

@cforce I am not sure I fully understand your point. Behind the APIService there has to be an actual Deployment with a metrics server that serves the metrics, and the implementation of such a metrics server varies. The KEDA Metrics Server has a different implementation than a Foo Metrics Server. You cannot have multiple different metrics servers behind one APIService.

cforce commented 2 years ago

1.) Datadog could provide the metrics for KEDA if DD had a KEDA integration. KEDA needs to allow not force-deploying its metrics server as a mandatory dependency.
2.) The cluster metrics server concept should allow registering independent providers that implement the common metrics server provider interface. DD and KEDA could each register and provide their own metrics.

tomkerkhove commented 2 years ago

1.) Datadog could provide the metrics for KEDA if DD had a KEDA integration. KEDA needs to allow not force-deploying its metrics server as a mandatory dependency.

We have a scaler for Datadog, but that's maybe not the point here :)

benjaminwood commented 2 years ago

I believe this issue over at the datadog-agent repo is related: https://github.com/DataDog/datadog-agent/issues/10764

We're running a cluster that is using both the Datadog agent and KEDA with some Datadog triggers, and we have run into the problems described in the datadog-agent issue linked above. It looks like:

2022-02-02 22:02:10 UTC | CLUSTER | ERROR | (app/app.go:292 in start) | Could not start admission controller: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

tomkerkhove commented 2 years ago

Unfortunately that's not possible, and the last metrics server will overwrite the already existing one - you will have to choose between Datadog and KEDA.

JorTurFer commented 2 years ago

The option that you could choose is using KEDA with the Datadog scaler.

alxgruU commented 2 years ago

Trying to deploy KEDA on a k8s cluster with an existing Datadog deployment:

 helm install keda kedacore/keda --namespace keda

Error: rendered manifests contain a resource that already exists. Unable to continue with install: APIService "v1beta1.external.metrics.k8s.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "keda": current value is "datadog-agent"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "keda": current value is "default"

and the same error when trying to deploy Datadog on a cluster with an existing KEDA deployment:

rendered manifests contain a resource that already exists. Unable to continue with install: APIService "v1beta1.external.metrics.k8s.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "datadog-agent": current value is "keda"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "default": current value is "keda"

Is there a workaround for this?

zroubalik commented 2 years ago

@alxgruU as mentioned above, you cannot have both KEDA and another service that is using APIService "v1beta1.external.metrics.k8s.io", in your case Datadog.

slv-306 commented 2 years ago

Team, I'm facing the below error. What is the workaround?

Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: APIService "v1beta1.external.metrics.k8s.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "keda"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "default"

JorTurFer commented 2 years ago

Hi @slv-306, when does that happen? Are you installing KEDA for the first time while you already have another metrics server or something similar?

slv-306 commented 2 years ago

We have another metrics server in place. Is there any workaround? By default we have custom-metrics/custom-metrics-stackdriver-adapter in place.

JorTurFer commented 2 years ago

If the metrics server that you already have uses v1beta1.external.metrics.k8s.io (and the error suggests that), there isn't any workaround, sorry. The limitation is at the k8s level; we can't provide any solution from the KEDA side.

slv-306 commented 2 years ago

Is there any way to push metrics to Stackdriver with KEDA?

tomkerkhove commented 2 years ago

No, unfortunately not.

slv-306 commented 2 years ago

Can we use two triggers for autoscaling a single deployment? Like one trigger for CPU while the other is for RPS?

JorTurFer commented 2 years ago

Do you mean using KEDA? Yes, with KEDA you can use all the triggers that you want under the same Scaled{Job|Object} for the same workload. If you meant using your current system and also KEDA, no, that's not possible.
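
As a rough sketch (the names, Prometheus address and query are placeholders, and trigger fields may vary slightly between KEDA versions), a single ScaledObject combining a CPU trigger with an RPS-style Prometheus trigger could look like:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaledobject            # hypothetical name
spec:
  scaleTargetRef:
    name: my-app                       # the Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: cpu                        # trigger 1: CPU utilization
      metricType: Utilization
      metadata:
        value: "60"
    - type: prometheus                 # trigger 2: request rate (RPS)
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: 'sum(rate(http_requests_total{app="my-app"}[2m]))'
        threshold: "100"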

tomkerkhove commented 2 years ago

@slv-306 As @JorTurFer mentioned this is supported and documented in our FAQ: https://keda.sh/docs/latest/faq/

May I ask you to create a GitHub Discussion for asking questions that are not related to this issue please? This helps us keep the conversation more focused - Thank you!

tomkerkhove commented 1 year ago

FAQ update: https://github.com/kedacore/keda-docs/pull/950

devmanuelgonzalez commented 11 months ago

Hello guys, I managed to work around this by disabling the metricsProvider from Datadog. In this way you can deploy both charts in your cluster. Keep in mind that the Datadog metricsProvider is for Datadog to autoscale using custom metrics.

https://docs.datadoghq.com/containers/guide/cluster_agent_autoscaling_metrics/

JorTurFer commented 11 months ago

We are working on a proposal to fix this limitation directly in k8s, but we are still drafting the KEP. I hope that during the next months it can be fixed; in the meantime, thanks for your workaround!

yesjinu commented 7 months ago

Hi guys! Just for those who are still struggling with this issue (integrating Datadog with KEDA):

I finally managed to install Datadog in my GKE cluster with clusterAgent.metricsProvider.enabled=false.

You may not need metricsProvider to be enabled, as it's only used for Datadog's own autoscaling.
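
In other words, an install sequence along these lines should let both charts coexist (release names, namespaces and flags below are only illustrative; check the chart values for your versions):

helm upgrade --install datadog-agent datadog/datadog \
  --namespace default \
  --set clusterAgent.metricsProvider.enabled=false   # skip Datadog's external metrics provider (per the workaround above)

helm install keda kedacore/keda --namespace keda --create-namespace   # KEDA now owns v1beta1.external.metrics.k8s.io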

JorTurFer commented 7 months ago

About this issue, we are working on a KEP to support multiple metrics servers natively in Kubernetes 😋 https://github.com/kubernetes/enhancements/pull/4262

zroubalik commented 7 months ago

Hi guys! Just for those who are still struggling with this issue (integrating Datadog with KEDA):

I finally managed to install Datadog in my GKE cluster with clusterAgent.metricsProvider.enabled=false.

You may not need metricsProvider to be enabled, as it's only used for Datadog's own autoscaling.

This might be a good candidate for documentation, Troubleshooting/FAQ section. @JorTurFer WDYT?