Closed littlechicks closed 3 years ago
I've had quite a similar problem - from what I can tell, it has to do with the Kubernetes version, as the insecure (HTTP) ports are deprecated/removed.
After some fiddling, I got to the point where kube-proxy is empty (replaced by Cilium, so that's fine) and etcd doesn't work (Prometheus tries to reach it on the internal IP, but it is configured with the public IP).
For kube-prometheus-operator, I used the following:

```yaml
kubeControllerManager:
  service:
    port: 10257              # https port
    targetPort: 10257
  serviceMonitor:
    https: true              # use https
    insecureSkipVerify: true # accept self-signed certificate
    serverName: 127.0.0.1    # expect 127.0.0.1 as CN in the certificate, not the called IP

kubeScheduler:
  service:
    port: 10259
    targetPort: 10259
  serviceMonitor:
    https: true
    insecureSkipVerify: true
    serverName: 127.0.0.1

prometheusOperator:
  hostNetwork: true
```
For Cilium:

```yaml
global:
  devices:
    - eth1                   # internal interface
  kubeProxyReplacement: strict
  k8sServiceHost: k8s.domain.tld
  k8sServicePort: 6443
  ipMasqAgent:
    enabled: true
  etcd:
    enabled: false
    managed: false
  ipam:
    operator:
      clusterPoolIPv4PodCIDR: "{{ pod_cidr }}"
```
Additionally, I changed the bind address for kube-controller-manager and kube-scheduler to 0.0.0.0 (works for me because of a firewall) via `kubeadm init --config`:
```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
controllerManager:
  extraArgs:
    bind-address: 0.0.0.0
scheduler:
  extraArgs:
    address: 0.0.0.0
```
Even when disabling control plane metrics the prometheus pod is still not being created.
I'm facing similar issues when deployed in Amazon EKS and managed to resolve the kube-proxy issue by editing the configmap as described above. Not ideal but I think for now I'm going to disable these monitors.
A bit of context on the secure and insecure ports:
- Kubeadm: enable the usage of the secure kube-scheduler and kube-controller-manager ports for health checks. For kube-scheduler was 10251, becomes 10259. For kube-controller-manager was 10252, becomes 10257. ([#85043](https://github.com/kubernetes/kubernetes/pull/85043), [@neolit123](https://github.com/neolit123))
Also, here is the config again, a bit better formatted:
```yaml
kubeControllerManager:
  service:
    port: 10257
    targetPort: 10257
  serviceMonitor:
    https: true
    insecureSkipVerify: true

kubeScheduler:
  service:
    port: 10259
    targetPort: 10259
  serviceMonitor:
    https: true
    insecureSkipVerify: true
```
Thanks for the advice.
And what about my problem with `kubectl top pods`, where only node metrics show up? And what about the kubelet problem, with 0/0 targets up?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Exactly the same problem here, on a vanilla cluster launched with kubeadm (1.20.1).
From what I can see, my static pods (etcd, kube-scheduler, kube-controller-manager) hold the public (node) IP, so that's the endpoint the corresponding services try to scrape. However, the component itself is bound to 127.0.0.1 and thus not accessible from outside; one would have to change the bind-address, but that looks insecure to me.
For etcd (and possibly others too), the regular client listen address, which is bound to the node IP, does expose the `/metrics` endpoint but requires authentication. I was able to successfully scrape metrics from etcd by creating a secret with the certs:
```shell
kubectl -n monitoring create secret generic etcd-client-cert \
  --from-file=/etc/kubernetes/pki/etcd/ca.crt \
  --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key
```
And configuring the helm chart to mount and use them:
```yaml
prometheus:
  prometheusSpec:
    secrets: ['etcd-client-cert']
kubeEtcd:
  serviceMonitor:
    scheme: https
    insecureSkipVerify: false
    serverName: localhost
    caFile: /etc/prometheus/secrets/etcd-client-cert/ca.crt
    certFile: /etc/prometheus/secrets/etcd-client-cert/healthcheck-client.crt
    keyFile: /etc/prometheus/secrets/etcd-client-cert/healthcheck-client.key
```
Thanks a lot, scraping etcd metrics works fine now; however, I'm still chasing the issue with the other metrics.
Have you been able to do it with the scheduler, controller, and proxy?
For kube-controller-manager and kube-scheduler, follow https://stackoverflow.com/questions/65901186/kube-prometheus-stack-issue-scraping-metrics/66276144#66276144
For kube-proxy https://stackoverflow.com/questions/60734799/all-kubernetes-proxy-targets-down-prometheus-operator
For etcd, see this comment by Foltik: https://github.com/prometheus-community/helm-charts/issues/204#issuecomment-765155883
For the kubelet, it depends on whether it is running in a container or as a process. In my case it was a process (k8s on-prem using kubeadm), hence it was picked up automatically.
I also used the workarounds, and proxy / etcd / controller / scheduler are working as expected; on the other hand, it still seems less secure in some cases. ClusterIPs would seem a good solution, unless I am missing something. +1 for this to be investigated further.
This issue is being automatically closed due to inactivity.
> For etcd (and possibly others too), the regular client listen address which is bound to the node IP does expose the `/metrics` endpoint, but requires authentication. I was able to successfully scrape metrics from etcd by creating a secret with the certs and configuring the helm chart to mount and use them. [...]
Hello, my cluster was deployed with kubeadm. I followed this configuration, but it has no effect.
> For etcd (and possibly others too), the regular client listen address which is bound to the node IP does expose the `/metrics` endpoint, but requires authentication. I was able to successfully scrape metrics from etcd by creating a secret with the certs and configuring the helm chart to mount and use them. [...]
If you're using kubeadm, it has already configured etcd with `--listen-metrics-urls`, which does not require certificates and is just plain HTTP. ... Unfortunately, by default it's probably listening on `127.0.0.1:2381`. To remedy that, you need to ensure your `ClusterConfiguration` includes something like this:
```yaml
kind: ClusterConfiguration
etcd:
  local:
    extraArgs:
      listen-metrics-urls: http://0.0.0.0:2381
```
If you've already provisioned your cluster, you'll need to monkey-patch that in with `kubectl edit -n kube-system cm/kubeadm-config`, and then run `kubeadm upgrade node` on each control-plane node that is hosting etcd.
Then you need to update your kube-prometheus-stack Helm values to include this:
```yaml
kubeEtcd:
  service:
    port: 2381
    targetPort: 2381
```
Thank you, @samcday
I am curious to know whether it is safe in production to have my metrics endpoint listen on 0.0.0.0. Or is there a production-grade approach?
@MohammedNoureldin that's a tricky question to answer without more understanding of your production environment.
If you open up the metrics endpoint on machines that are directly connected to the public internet, then this is certainly a potential security risk. Metrics shouldn't be exposing anything particularly sensitive, nor should it be vulnerable to any exploits, but of course every additional system/service/code-path you expose to the cold, brutal wasteland that we call "the Internet" is inherently dangerous.
If your machines are connected together on a private network, you can firewall access to the metrics endpoint such that ingress is only permitted from `192.168.0.0/24` (for example). In this case you could also configure each node with a listen address of that node's private IP, but then you'd also need to configure kubeadm uniquely on each machine.
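The subnet restriction described above boils down to a membership check on the source address of each incoming connection; a minimal sketch of that logic (the addresses are illustrative, not from this thread):

```python
from ipaddress import ip_address, ip_network

# The decision a firewall rule like "allow ingress from 192.168.0.0/24"
# makes for each connection attempt to the metrics port.
allowed = ip_network("192.168.0.0/24")

for src in ("192.168.0.42", "10.1.2.3"):
    verdict = "allowed" if ip_address(src) in allowed else "denied"
    print(src, verdict)
```

Running it prints `192.168.0.42 allowed` and `10.1.2.3 denied` - the same verdicts an iptables/nftables source-address match would produce.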
@samcday the approach you described is what I have been using.
Another approach I tried:
I didn't succeed in configuring the metrics endpoints to bind to the interfaces of a specific network. Is that possible at all? I mean binding to something like `10.0.0.0/16`, which would bind to all interfaces whose IP address falls within that CIDR. Should such syntax work for binding, as far as you know?
Mmmh, no, I don't think you can bind to a subnet like that. Looking through the etcd code, the metrics listen config option ends up as a regular `net.Listen` call: https://github.com/etcd-io/etcd/blob/2e7ed80be246380fc50ca46aa193348a28935380/client/pkg/transport/listener.go#L109C17-L109C23
You can specify an IP address, or a hostname (which will be resolved to an IP, and is not recommended).
I'm not 100% sure about this but ultimately, at least on Linux, a socket binding can only be performed on a specific address, and that address must be known to the kernel ahead of time.
Think about it this way: if you have two machines available on `10.0.0.1` and `10.0.0.2`, the kernel needs to know where to send packets. If you were somehow able to bind to `10.0.0.0/16`, then how would the kernel know that packets destined for `10.0.0.2` from `10.0.0.1` should leave the machine on an interface, rather than be delivered to the local process? :)
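The point that the kernel only accepts a concrete, known local address can be seen with a plain socket; a small illustration, not specific to etcd:

```python
import socket

# Binding to a specific address the kernel knows about works fine.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))      # port 0 = let the kernel pick a free port
print(s.getsockname()[0])     # 127.0.0.1
s.close()

# A CIDR is not a bindable address; the resolver rejects it outright.
t = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    t.bind(("10.0.0.0/16", 0))
except OSError as e:          # socket.gaierror on Linux
    print("cannot bind to a subnet:", e.__class__.__name__)
finally:
    t.close()
```

This is why etcd (and every other daemon) asks for a single listen address such as `0.0.0.0` or one node-specific IP, never a range.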
> the approach you described is what I have been using.
If you're saying that you have a firewall in place that does not permit ingress to the metrics port except from internal networks, then that should already be sufficient and you can continue binding to `0.0.0.0` reasonably safely. If you want to be maximally paranoid, you can consider signing up for a service like Shodan and providing it with your public IP addresses. It will continually port-scan those addresses and send you security notifications if new ports become publicly reachable.
@samcday thanks a lot.
> If you're saying that you have a firewall in place that does not permit ingress to the metrics port except from internal networks, then that should already be sufficient and you can continue binding to 0.0.0.0 reasonably safely
Yes, I meant binding to 0.0.0.0 but preventing everything except a specific internal network using a firewall.
This issue is closed, but the problem remains: kube-prometheus-stack is not able to monitor the main components of the cluster, and the only workaround is exposing your main cluster components on 0.0.0.0 like a savage. Say what you want: if you don't want problems and/or accidents, you don't expose services on 0.0.0.0, even less so when we're talking about the core components of a Kubernetes cluster.
So my question is: is this being fixed? Can it be fixed? Because I see there were commits blaming this on the users, as in "Certain cluster configurations can cause...", while this is the symptom you get if you install a vanilla kubeadm cluster and a vanilla kube-prometheus-stack (absolutely default behavior).
> So my question is, is this being fixed? Is it possible to be fixed?
How would you propose something like this be fixed? As you've pointed out, the vanilla/default kubeadm behaviour is to configure control-plane components as hostNetwork services that opt for a secure default of binding to `127.0.0.1` only. This is very reasonable/desirable, because kubeadm cannot (or well, not without a lot of effort for little gain) know the network conditions the control-plane is to be installed into.
The issue doesn't really lie with Prometheus/k-p-s, and not even really with kubeadm or the k8s control-plane components. Rather, this is just one of those unfortunate collisions of different competing concerns. There isn't really a "fix" beyond improving documentation.
Well, as you already said, the issue was caused by kubeadm implementing a security feature. Now, I know this puts Prometheus in a difficult position, but there are elegant fixes/workarounds. For example, this one. Notice his last remark:
> It may be possible to deploy this proxy server as an option for kube-prometheus-stack.
This is not an issue to ignore. Right now we have people globally setting the bind addresses of their main Kubernetes components to 0.0.0.0, so for many, kubeadm's security feature has in practice become a decrease in security because of the workaround they are implementing. This means it's already too late; the wrong fix is being propagated, as we speak, to thousands of clusters or more. I would go so far as to say this is a security vulnerability of the kube-prometheus-stack, because keep in mind most of them are combining insecureSkipVerify TLS with the bind address change, which makes everything so much worse.
Whatever the workaround/fix, it should be implemented fast, or the affected features should be removed, before even more clusters get "infected" with the workaround.
> For example, https://github.com/prometheus-community/helm-charts/issues/1704#issuecomment-1100607982. Notice his last remark:
Yep, so this would still fall into my "improving documentation" bucket.
(Oh and I should take this moment to note that I'm not a contributor to any of the projects we've discussed thus far)
To expand and put it a bit differently: the defaults being the way they are, and the state of things being what it is, means that in some cases, if you want to scrape metrics for your cluster's control-plane from inside said cluster, you must deploy additional systems and services (which require additional decision-making around trade-offs in maintainability, complexity, regulatory compliance concerns, etc.).
Since this is a very common occurrence, the pathways forward ought to be documented, but only as a "helpful local": you're standing in a particular place and want to know where to go from here. k-p-s/Prometheus can't travel the journey with you, but k-p-s could/should at least point the way.
And again, to re-state: in many cases you don't need any extra complexity like a proxy. Example: your cluster nodes are running in an AWS VPC without a public NAT egress, and you know that the ingress VPCs/security-groups and the workloads in your cluster are "trusted". Your kube-control-plane components can listen on `0.0.0.0`.
Hello,
I have deployed kube-prometheus stack over Kubernetes with helm (https://github.com/helm/charts/blob/master/stable/prometheus-operator/values.yaml)
Almost everything is working, except:

- kubelet is not configured: it shows 0/0 up
- kube-prometheus-stack-kube-controller-manager is down
- kube-prometheus-stack-kube-etcd is down
- kube-prometheus-stack-kube-proxy is down
- kube-prometheus-stack-kube-scheduler is down
For the last four errors, it seems like Prometheus is using the node IP instead of the ClusterIP. The services for these four entries are correctly created, but there is no ClusterIP assigned (see below).
This is although the values.yml is correctly set up for using Kubernetes Services, as below. The pod labels are matching correctly:
I have no idea why Prometheus wants to scrape the node IP instead of the ClusterIP services... And for the kubelet targets, it looks like there is no service for the kubelet? Am I wrong?
Thanks...
Workarounds
1/ For the proxy down status: I've solved it by updating the kube-proxy ConfigMap and restarting the pods:
```shell
kubectl edit cm/kube-proxy -n kube-system
kubectl delete pod -l k8s-app=kube-proxy -n kube-system
```
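For reference, the edit in the kube-proxy ConfigMap typically concerns the metrics bind address; a sketch of the relevant fragment (field names from the upstream `KubeProxyConfiguration` type, kubeadm defaults assumed):

```yaml
# Inside the kube-proxy ConfigMap (config.conf), change the metrics bind address
# so the /metrics endpoint is reachable from the Prometheus pod:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
metricsBindAddress: 0.0.0.0:10249   # default 127.0.0.1:10249 rejects external scrapes
```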
2/ For the scheduler status
Moreover, to make the metrics available to Prometheus, I had to edit the files under /etc/kubernetes/manifests/, changing the bind address to 0.0.0.0 and commenting out --port=0. But that's not a good thing, because the scheduler is now exposed outside the cluster on a non-secure port.
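For illustration, the kind of change described above in /etc/kubernetes/manifests/kube-scheduler.yaml might look like this (a sketch assuming a kubeadm-generated static-pod manifest; only the relevant flags are shown):

```yaml
spec:
  containers:
  - command:
    - kube-scheduler
    - --bind-address=0.0.0.0   # was 127.0.0.1
    # - --port=0               # commented out, so the metrics port is not disabled
```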
So again, how can I make the ServiceMonitor work on the ClusterIP, like the other working targets?
NB: kubectl top pods is not working; only kubectl top nodes is working...
Configuration: