rahanar opened this issue 4 years ago
Thanks for the detailed bug report. I would suggest we add the changes you propose first and then think about a more robust solution.
I'm surprised the HPA controller is asking for metrics of old pods; even if the metrics server is storing them, it seems like the wrong behavior to request them. I would have to check how the official "metrics-server" behaves for CPU and memory metrics, because I believe it would have similar TTLs.
Thanks @mikkeloscar for your reply! I will create a PR for this.
I looked into why HPA was getting old metrics. In my case, at least, it was querying using labels. So, kube-metrics-adapter was correctly returning objects that matched the labels. Whether they should be there or not is the issue here.
I have not looked into metrics-server, but will check it out after I prepare a PR.
Hello,
Just providing an update on the issue and the fix.
I am currently working on getting approval from my employer to contribute to this project. As soon as that's done, I will open a PR with the fix. If anyone else wants to do it, feel free to create a PR.
Thanks, Anar
Thanks for the update, did you have a chance to look at how the metrics-server handles this issue?
Hello!! Any idea when this problem will be solved? I am experiencing the same issue using the SQS queue message count metric.
Thanks, Lucas
@lsievert can you provide more data, please?
Following up on the original issue. The metrics-server keeps a list of active pods and only serves metrics for those pods: https://github.com/kubernetes-sigs/metrics-server/blob/2a1d1385123b9b7eed36f8b3efe8b78175db5b28/pkg/api/pod.go#L77-L124
We could do something similar to avoid this issue.
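For reference, here is a minimal sketch of what that could look like, assuming an informer-backed pod lister; the `PodMetric` type and `filterActivePods` function are hypothetical names, not the actual kube-metrics-adapter API:

```go
package provider

import (
	corelisters "k8s.io/client-go/listers/core/v1"
)

// PodMetric is a hypothetical cached metric value keyed by pod.
type PodMetric struct {
	Namespace string
	PodName   string
	Value     int64
}

// filterActivePods keeps only the metrics whose pod still exists (and is not
// terminating), mirroring the metrics-server behaviour linked above.
func filterActivePods(lister corelisters.PodLister, cached []PodMetric) []PodMetric {
	active := make([]PodMetric, 0, len(cached))
	for _, m := range cached {
		pod, err := lister.Pods(m.Namespace).Get(m.PodName)
		if err != nil || pod.DeletionTimestamp != nil {
			// Pod is gone or terminating: drop its stale metric.
			continue
		}
		active = append(active, m)
	}
	return active
}
```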
Hello, when do you plan to release these changes?
@ermakov-oleg Hi, we're just testing another feature for ScheduledScaling and will release it all together. We aim to make the release next week.
@ermakov-oleg as Mikkel said, we plan to release it soon. However, if you want, you can test the pre-release version with the same image version we are running in our clusters:
registry.opensource.zalan.do/teapot/kube-metrics-adapter:v0.1.10-56-g6f9aba8
Thank you!
I already looked at the diff with the latest version and used v0.1.10-68-g85f6dda; everything works fine on this one.
I am experiencing the same issue with the newest kube-metrics-adapter:v0.1.17-6-g9d8359b version. @jonathanbeber @mikkeloscar can you point me to the merged PRs with possible fixes? It seems like the only way to fix it is to not expose old metrics to the HPA controller.
@wildermesser I think the only thing we have here is this: https://github.com/zalando-incubator/kube-metrics-adapter/pull/309
@mikkeloscar thank you! I saw these arguments before, but I could not figure out their purpose. After reading this issue I was able to set proper values for them, and that resolves the issue for me. I think a short description of all arguments in the README would be useful for people like me :)
Some general guidelines for configuring these options:

- The default `--horizontal-pod-autoscaler-sync-period` is 15s, so without tweaking that value to improve decision time there isn't much point in collecting metrics more often than every 15s.
- The `--metrics-ttl` option MUST be greater than the `--garbage-collector-interval`, unless you rely on other means of keeping metrics alive, such as a more frequent collection interval.
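As a concrete illustration of that rule, here is a hedged sketch of how such duration flags could be wired and validated in Go; the flag names match the ones discussed above, but the defaults are example values, not the project's actual defaults:

```go
package main

import (
	"flag"
	"log"
	"time"
)

func main() {
	// Example defaults only; pick values that fit your collection interval.
	metricsTTL := flag.Duration("metrics-ttl", time.Minute, "how long collected metrics are kept")
	gcInterval := flag.Duration("garbage-collector-interval", 30*time.Second, "how often expired metrics are purged")
	flag.Parse()

	// A metric collected once must survive at least one GC pass, otherwise
	// the HPA may see gaps between collections.
	if *metricsTTL <= *gcInterval {
		log.Fatalf("--metrics-ttl (%s) must be greater than --garbage-collector-interval (%s)", *metricsTTL, *gcInterval)
	}
}
```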
Hello,
I have been testing kube-metrics-adapter (with the pod collector enabled) and noticed that it returns stale data under specific conditions, which results in the HPA misbehaving. The conditions occur when the HPA scales up a deployment due to high load, then the load subsides and it scales the deployment back down to the minimum replica count. However, if I then increase the load again in a way that should result in a scale-up, the HPA does not scale up because the data it is acting upon is stale. When it queries kube-metrics-adapter for custom metrics, the adapter returns old data that contains non-existent pods and old values. This results in incorrect averages for the HPA.
I tracked it down to the in-memory cache that kube-metrics-adapter uses to track pods and metrics. The metrics have a TTL of 15 minutes and the GC runs every 10 minutes. The issue described above disappears after the old metrics are cleaned up by the GC. To fix the issue permanently, I changed the TTL to 30s and the GC to run every minute. The HPA is then able to act on new conditions much faster. This should also cover other edge cases: for example, if the HPA is in the middle of a scale-down and there is new load, it should be able to react to it sooner.
I think this is a simple solution that addresses the issue for now, but there is a better way of dealing with it: the cache could be updated on each run, so the HPA has an up-to-date view of the cluster without a delay of up to 1 minute.
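To illustrate the mechanism described above, here is a minimal sketch of a TTL-plus-GC store; it is not the adapter's actual implementation, and the type and field names are made up. With a 15-minute TTL and a 10-minute GC interval, a value for a deleted pod can linger for many minutes and skew the HPA's averages:

```go
package provider

import (
	"sync"
	"time"
)

type entry struct {
	value     int64
	expiresAt time.Time
}

// MetricStore is a hypothetical in-memory store where each metric carries a
// TTL and a background loop purges expired entries.
type MetricStore struct {
	mu      sync.Mutex
	ttl     time.Duration
	metrics map[string]entry // keyed by "namespace/pod"
}

func NewMetricStore(ttl, gcInterval time.Duration) *MetricStore {
	s := &MetricStore{ttl: ttl, metrics: map[string]entry{}}
	go func() {
		for range time.Tick(gcInterval) {
			s.gc()
		}
	}()
	return s
}

// Insert records a value and stamps it with the configured TTL.
func (s *MetricStore) Insert(key string, value int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.metrics[key] = entry{value: value, expiresAt: time.Now().Add(s.ttl)}
}

// gc removes entries whose TTL has expired; anything younger than the TTL
// is still served, even if the pod it belongs to no longer exists.
func (s *MetricStore) gc() {
	s.mu.Lock()
	defer s.mu.Unlock()
	now := time.Now()
	for k, e := range s.metrics {
		if now.After(e.expiresAt) {
			delete(s.metrics, k)
		}
	}
}
```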
Expected Behavior
kube-metrics-adapter doesn't return stale data to HPA that contains information about non-existent pods and their values.
Actual Behavior
kube-metrics-adapter returns stale data to the HPA due to the 15-minute TTL and 10-minute garbage collection interval. This causes the HPA to calculate wrong averages based on custom metrics.
Steps to Reproduce the Problem
Specifications
I have a fix for this, but would like to discuss whether it is appropriate. I propose to make the TTL and GC interval configurable and set the default values to 30s and 1m respectively. A better long-term solution could be to keep the cache up-to-date on each run.