kubernetes-sigs / metrics-server

Scalable and efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
Apache License 2.0

[feat]: Use port `10250` on the Metrics Server pod to use existing firewall rules #1026

Closed: stevehipwell closed this issue 2 years ago

stevehipwell commented 2 years ago

What would you like to be added: If we change the default port for the Metrics Server pod to `10250`, it would be covered by default firewall rules and work on most clusters out of the box. This is exactly what projects such as Cert Manager and Prometheus Operator do with their webhooks to make them just work.

The general pattern for cluster-to-pod communication seems to be to use port `443` where possible (i.e. on the Service) and port `10250` where that isn't possible. The exception to this rule is where the host network needs to be used, in which case the port needs changing and the cluster firewall rules need modifying to support the new port.

For the Helm chart this would mean changing the default `containerPort` value to `10250`.
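For reference, a minimal sketch of what the override might look like, assuming the chart exposes a top-level `containerPort` value (as mentioned above) and a `service.port` value that stays on `443`:

```yaml
# Hypothetical values.yaml override for the metrics-server Helm chart;
# key names are assumed from the discussion in this issue.
containerPort: 10250   # pod listens on the kubelet-style port already allowed by most firewall rules
service:
  port: 443            # the Service keeps exposing 443, mapped to the container port
```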

I have tested this on an EKS cluster and can confirm that it functions correctly.

Why is this needed: This would make Metrics Server just work on most Kubernetes clusters without any additional configuration.

I also suspect that the removal of support for running the pod with port 443 is why the Metrics Server version in use is often far older than expected.

/kind feature

stevehipwell commented 2 years ago

CC @serathius

serathius commented 2 years ago

This is what we did in https://github.com/kubernetes/kubernetes/pull/105957 so +1 from me.

Making Metrics Server configuration simpler by reducing the number of ports that need to be opened is definitely a good goal.

stevehipwell commented 2 years ago

@serathius are you happy if I open a PR to do this?

serathius commented 2 years ago

Sure, please be sure to:

stevehipwell commented 2 years ago

So just for clarity, I'm expecting to change the default value of `--secure-port` to `10250` but not make any other changes to the Go code. In the manifests and Helm chart I'll be changing Metrics Server to use `10250` for `--secure-port` and the deployment port, but I'll be leaving the Service port as `443`, which is in line with the other users of this pattern.
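In manifest terms the intent is roughly the following (an abridged sketch, not the actual PR; only the fields relevant to the port change are shown):

```yaml
# Abridged Deployment sketch: the pod listens on 10250.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: metrics-server
          args:
            - --secure-port=10250   # new default proposed in this issue
          ports:
            - name: https
              containerPort: 10250
              protocol: TCP
```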

serathius commented 2 years ago

Not changing the Go code SGTM, however I was thinking about changing the Service port to `10250` as well. The reason is that, depending on configuration, the apiserver might want to communicate using the Service port instead of the container port and depend on kube-proxy to do the load balancing.
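For context, the aggregation layer reaches Metrics Server through its Service: the upstream manifests register an APIService that points at the `metrics-server` Service on port `443`, roughly like this (abridged sketch):

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  version: v1beta1
  service:
    name: metrics-server
    namespace: kube-system
    port: 443   # the apiserver dials the Service port, not the container port directly
  groupPriorityMinimum: 100
  versionPriority: 100
```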

stevehipwell commented 2 years ago

@serathius I'm pretty sure that `443` will be open by default in most if not all clusters; for example, GKE has it open for Metrics Server. This is how the other projects operate, and they are actively using the Service to load balance the requests from the control plane.

serathius commented 2 years ago

I'm not sure we can assume anything about what crazy things people do, so I definitely wouldn't assume that people have port `443` opened. I would also be worried about assuming that people follow GKE as an example; I got burned on this too many times (FYI I work on GKE :P). There is a surprising number of people that set up Kubernetes the hard way and expect everything to work out of the box (just read https://github.com/kubernetes-sigs/metrics-server/blob/master/KNOWN_ISSUES.md :P)

My hope with that change is that we can not only change, but also reduce, the number of ports that Metrics Server requires to run.

stevehipwell commented 2 years ago

> I'm not sure we can assume anything about what crazy things people do, so I definitely wouldn't assume that people have port `443` opened. I would also be worried about assuming that people follow GKE as an example; I got burned on this too many times (FYI I work on GKE :P).

@serathius I agree with the sentiment, but does port `443` need to be open anywhere to use it on a Service? My mental model has kube-proxy translating a call to the Service `kube-system/metrics-server:443` into an EndpointSlice lookup which uses the container port (which is why Services can always use low ports)?

serathius commented 2 years ago

My understanding (I'm definitely not a k8s network expert) is that the `kube-system/metrics-server` Service is translated into the domain `metrics-server.kube-system` (or something like that). The domain is then resolved by kube-dns to the Service IP, which comes from a separate IP pool than the container one. So the request is sent by the application to the Service IP, and only when the request reaches the host VM do the IP forwarding rules controlled by kube-proxy translate the Service IP to a container IP, allowing load balancing between multiple containers.

stevehipwell commented 2 years ago

I'm pretty sure that a request to the Metrics Server Service would be resolved to `metrics-server.kube-system.svc.cluster.local:443`, and when the request is made it would be intercepted by iptables, IPVS or eBPF rules which redirect the traffic to a backend on the container port (iptables reference). Assuming that this is correct, the Service port doesn't matter as it's only used for discovery from the source node.
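As a concrete illustration of that point, a Service sketch along these lines (port numbers taken from this thread, labels assumed) shows why only the container port ever needs to be reachable on the node:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    k8s-app: metrics-server    # label assumed; selects the Metrics Server pods
  ports:
    - name: https
      port: 443                # virtual Service port, only ever seen on the ClusterIP
      targetPort: https        # rewritten by kube-proxy to the pod's containerPort (10250)
      protocol: TCP
```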

jtgorny commented 1 year ago

This port is in use by the kubelet? That's why it's already open. Now with `containerPort` set to `10250`, Metrics Server startup fails with `panic: failed to create listener: failed to listen on 0.0.0.0:10250: listen tcp 0.0.0.0:10250: bind: address already in use`.


Solution:

stevehipwell commented 1 year ago

@jtgorny port 10250 is only in use if you're running Metrics Server on the host network?

artem-zinnatullin commented 1 year ago

`10250` will be in use for small k8s installs (k0s, k3s) where control plane nodes also run as workers. I'm not sure it's such a great idea to have clashing ports between the kubelet and Metrics Server in case Metrics Server needs to run with `hostNetwork: true`.

stevehipwell commented 1 year ago

@artem-zinnatullin this is a common pattern across the Kubernetes ecosystem. Using the host network is non-standard in this context, so I don't think it's much extra complexity to set a second value to change the port (see the sketch below). This could be automated when the host network is set, but that might make documentation harder.
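A hedged sketch of the kind of override being described for the host-network case; the value names here are hypothetical and depend on how the chart exposes them:

```yaml
# Hypothetical override when running on the host network: move off 10250 so the
# pod does not clash with the kubelet, and allow the chosen port in the firewall.
hostNetwork:
  enabled: true
containerPort: 10251   # any free host port; must be permitted by the cluster firewall rules
```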