kubernetes-sigs / metrics-server

Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
Apache License 2.0

Metrics Server Long term plan #627

Open serathius opened 3 years ago

serathius commented 3 years ago

In this issue I would like to propose a long-term strategy for improving Metrics Server adoption and collect ideas that would help us define a roadmap for the next releases.

I hope this will help us tackle larger problems and allow other contributors to take ownership of larger areas.

Opinions written here are my own.

Background

The main purpose of Metrics Server is resource-utilization-based autoscaling. It's the simplest autoscaling option in Kubernetes and usually the first that K8s users learn. Most k8s distributions install Metrics Server out of the box or provide it as an option. With its popularity came a major downside: the default configuration that was popularized allowed only very basic autoscaling, which let solutions alternative to Metrics Server gain popularity.

In this document I would like to compare Metrics Server to two alternatives:

Metrics Server vs k8s-prometheus-adapter

Prometheus is a CNCF project that has become very popular for monitoring containers. By deploying k8s-prometheus-adapter, metrics collected by Prometheus Server can be integrated into K8s autoscaling pipelines, allowing for both resource and custom metric autoscaling. For cluster administrators, maintaining both solutions brings additional overhead, as each one needs to be upgraded, monitored for failures, tuned for performance, and scaled with the cluster. As Prometheus users already utilize it for monitoring their clusters, they have the necessary expertise, which means that for them using Metrics Server is redundant and costly. Even though a targeted solution like Metrics Server has clear advantages over a generic monitoring solution, until we catch up on our weak points Prometheus users will keep dropping Metrics Server.
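For context, both options ultimately serve the same Resource Metrics API (`metrics.k8s.io`), so a cluster runs only one of them as the backend. A quick way to see which component is serving it (a sketch; output varies per cluster):

```sh
# Check which component backs the Resource Metrics API, then query it directly.
kubectl get apiservice v1beta1.metrics.k8s.io -o wide
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl top nodes
```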

Metrics Server weak points:

Metrics Server advantages:

Opportunity: By fixing Metrics Server weak points we can propose a zero-maintenance solution that is more scalable than Prometheus + k8s-prometheus-adapter. As a result, Metrics Server would still make sense for heavy autoscaling users.

Resource vs custom metric autoscaling

Due to the popularity of untuned Metrics Server configurations there is a presumption that custom metrics autoscaling is always the better choice. It is true that some workloads can benefit from it (e.g. autoscaling workers based on queue size), but resource-based autoscaling should provide equally good results in the large majority of cases. There is also a misconception about how much work is needed to set up each solution. Ensuring that an application can reliably autoscale based on custom metrics requires not only building a reliable monitoring solution but also tuning the application. For example, for a web application, reliable autoscaling based on queries per second requires a good understanding of how many concurrent requests the application can handle and how many resources it will need for each type of request. As applications evolve over time and regressions can easily be introduced, this can lead to large inefficiencies in autoscaling.
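To illustrate the setup cost of the resource-based path, a minimal sketch of an HPA driven purely by Metrics Server data (the `web` Deployment name and 70% target are placeholders; the API group is `autoscaling/v2`, `v2beta2` at the time this issue was opened). The custom-metrics equivalent would additionally require a metrics pipeline plus an adapter exposing the chosen metric:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # placeholder target
```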

Metrics Server weak points:

Metrics Server advantages:

Opportunity: By improving the quality of autoscaling, Metrics Server can provide a much better user experience, thus allowing users to keep using it for the majority of their workloads.

Strategy

Improve the quality of the out-of-the-box configuration of Metrics Server so that it matches the quality of autoscaling of the alternatives while remaining a zero-maintenance and easier-to-use solution for autoscaling.

Easy to use

When compared to other Kubernetes applications, Metrics Server has an unreasonable number of dependencies that need to be configured for it to work. The requirements for Metrics Server are low-level cluster configuration that a developer wanting to try autoscaling is unable to change, resulting in frustration and unactionable support tickets on GitHub.
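As a reference point for the current experience: installation itself is a single command, but it commonly fails on clusters whose kubelet serving certificates are not signed by the cluster CA, forcing users into low-level flags. A sketch (the patch below is only a common workaround, not a recommended production setting):

```sh
# Install the latest released manifests.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Common workaround on clusters with self-signed kubelet certs (not recommended for production):
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
```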

Ideas:

Quality of Autoscaling

Resource-based autoscaling should be a good default option when compared to custom metrics autoscaling. This should be achieved by improving the freshness and accuracy of metrics.
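Freshness is largely governed by the scrape interval, controlled by the `--metric-resolution` flag on the Metrics Server container. A sketch of setting it explicitly (the value is illustrative; the resource-usage trade-off is discussed further down this thread):

```yaml
# Snippet of the metrics-server Deployment spec (illustrative values).
containers:
- name: metrics-server
  args:
  - --kubelet-use-node-status-port
  - --metric-resolution=15s   # shorter interval = fresher metrics, higher scrape cost
```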

Ideas:

Scalable

Default Metrics Server resources should be adjusted to work in the majority of cluster configurations. We should also reduce the friction for less popular configurations by providing better documentation or separate configurations.
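A sketch of what adjusting default resources means concretely; the numbers below are purely illustrative and would have to come from scalability testing, not from this comment:

```yaml
# Illustrative only: resource requests for the metrics-server container.
resources:
  requests:
    cpu: 100m      # roughly sized for a small-to-medium cluster
    memory: 200Mi  # grows with the number of nodes and pods
```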

Ideas:

Observable

Ensure that signals from Metrics Server (logs, metrics) can be easily understood by users. Improve out of the box experience of monitoring Metrics Server by making monitoring integration easier and providing good default dashboards and alerts.
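For the monitoring-integration part, one possible shape, assuming the cluster runs the Prometheus Operator; the label selector and port name are assumptions based on the upstream manifests:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server   # label assumed from the upstream manifests
  endpoints:
  - port: https                 # port name assumed from the Service definition
    scheme: https
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    tlsConfig:
      insecureSkipVerify: true  # illustrative; a proper CA config is preferable
```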

Ideas:

Reliable

Metrics Server is a critical component in the autoscaling pipeline; its unavailability can lead to delayed autoscaling decisions. We need to make sure that users can reliably depend on autoscaling for their applications.
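A sketch of what a reliable setup could look like: run more than one replica and protect it during voluntary disruptions (the resource below is illustrative; true HA also requires the APIService to tolerate a single backend being unavailable):

```yaml
# Illustrative: keep at least one metrics-server pod available during voluntary disruptions.
# (policy/v1beta1 on clusters older than 1.21)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: metrics-server
```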

Ideas:

Other ideas

/cc @s-urbaniak @dgrisonnet

s-urbaniak commented 3 years ago

Agreed on the ideas expressed here :+1:

serathius commented 3 years ago

Good to hear that :P /cc @brancz for more feedback

dgrisonnet commented 3 years ago

I also very much agree with these ideas :+1:

serathius commented 3 years ago

Adding other SIG instrumentation leads for visibility and feedback /cc @logicalhan @dashpole @ehashman

ehashman commented 3 years ago

I think this is worthwhile. How do we plan to move forward on this? Do we have a process for collecting and acting on end user feedback?

serathius commented 3 years ago

I was planning to collect ideas and feedback from other SIG members and propose a roadmap to give the project more direction. I was also hoping to find owners for specific areas.

The idea of collecting feedback sounds really interesting. It would be an interesting discussion for a SIG meeting: how we can organize it for Metrics Server and other Instrumentation projects.

ehashman commented 3 years ago

I added this to the agenda for our next SIG meeting on the 12th.

brancz commented 3 years ago

I think the meeting on the 12th was cancelled as it's Thanksgiving in the US? Either way, I agree a SIG meeting would be good to discuss this.

ehashman commented 3 years ago

@brancz no, this week is on. The meeting after that is cancelled for US Thanksgiving which falls on Thu. Nov. 26.

brancz commented 3 years ago

Whoops, my bad. Today it is then :)

ehashman commented 3 years ago

/assign

We want to reach out to contribex to try to run a user survey.

logicalhan commented 3 years ago

Isn't the scraping resolution somewhat responsible for the ability of metrics-server to scale to 5k nodes? How do we expect increasing the scraping resolution to 15s to affect our supported cluster size of 5k nodes?

serathius commented 3 years ago

Metrics Server's ability to scale to 5k nodes is based on its linear resource usage, which was verified by scale tests run by SIG Scalability. Increasing the scraping resolution should increase the required resources, but should not break the linearity. The increase in resources should be balanced by improvements in performance via the planned switch to the Prometheus endpoint.

Still, there is a risk of lock contention becoming a problem, so as a first step we will go to 30s resolution, which is already used and tested in https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/metrics-server

Before making the jump to 15s we need to define scalability boundaries correctly, so that we have a good definition of what it means for Metrics Server to scale to a given size. With this definition we will be able to decide whether improvements in Metrics Server concurrency are needed.

ehashman commented 3 years ago

@serathius to run a user survey, ContribEx suggests that we put together a list of questions and then make a form either through SurveyMonkey or Google Forms. They can then assist us in getting the word out for completing the survey.

Slack thread: https://kubernetes.slack.com/archives/C1TU9EB9S/p1605218342051300

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

serathius commented 3 years ago

/remove-lifecycle stale /lifecycle frozen

stevehipwell commented 3 years ago

@serathius PR https://github.com/kubernetes-sigs/metrics-server/pull/670 would provide an easier way to install, as well as make monitoring and HA available as optional components (Helm can be configurable where the simple yaml can't be).

serathius commented 3 years ago

Hey @stevehipwell. Although Helm is a pretty popular solution, it's separate from the Kubernetes tooling (`kubectl apply -f`) and we cannot push it on everyone that deploys Metrics Server. There are a lot of application delivery methods other than Helm, and we should not leave their users behind. Yaml manifests, as rough as they are, are the common language that can be used by everyone and easily adapted to their tooling.

As for an easier way to install and configure MS, this point is more about making it easier to adjust the MS configuration to a specific cluster. I was thinking there more about adding a tool that analyses the cluster config and generates a suggested MS configuration. In that area using Helm doesn't provide much more benefit than quality documentation.

stevehipwell commented 3 years ago

@serathius I'm not saying that a Helm chart would replace the yaml (although it technically could via `helm template metrics-server/metrics-server | kubectl -n kube-system apply -f -`), I'm saying that a Helm chart would help with both discovery and customisation without shutting the door on any other deployment tool (Helm charts can always be templated out to plain yaml, and you could even add a variable to remove the Helm-specific annotations, e.g. `helm template --set cleanTemplate=true`).

Regarding discoverability, Helm charts are registered at https://artifacthub.io/ where they can be searched for from any web browser (there is a Kubernetes org plan to standardise this). This also allows charts to be discovered directly from Helm via the `helm search hub metrics-server` command.

Regarding customisation, nothing is a replacement for good docs, so I think that falls outside this discussion. What a Helm chart brings is twofold: an idiomatic way to configure common components (e.g. it would be idiomatic to enable a Prometheus Operator service monitor via `serviceMonitor.enabled: true`) and the ability to template directly or at a higher level (e.g. `hostNetwork: true` could be automatically set if `cloud: aws` and `aws.secondaryNetwork: true`). The idiomatic argument could be made even stronger if the whole kubernetes-sigs group defined a shared baseline.
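For completeness, the kind of workflow the chart from #670 would enable; the repository URL and value keys below are assumptions based on the discussion above, not the published chart:

```sh
# Assumed chart location and value names; adjust to whatever #670 actually publishes.
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server \
  --namespace kube-system \
  --set serviceMonitor.enabled=true

# Or render to plain yaml for non-Helm delivery pipelines:
helm template metrics-server metrics-server/metrics-server --namespace kube-system | kubectl apply -f -
```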

serathius commented 3 years ago

/cc @yangjunmyfm192085

dgrisonnet commented 1 year ago

/cc @olivierlemasle

k8s-triage-robot commented 7 months ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

- Confirm that this issue is still relevant with `/triage accepted` (org members only)
- Close this issue with `/close`

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted