Closed: dcs3spp closed this issue 2 years ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
I am having the same issue as well
Me too, but on Ubuntu 20.04: the following targets are unavailable with the default settings. I haven't looked at any of them other than the controller manager yet. I'm trying to find out why it's not working, and I'd rather not resort to something like changing the bind address on the pod (a quick check of what the components are actually listening on is sketched after the errors below).
Get "http://10.0.0.18:10252/metrics": dial tcp 10.0.0.18:10252: connect: connection refused
Get "http://10.0.0.18:2379/metrics": read tcp 10.244.11.223:38678->10.0.0.18:2379: read: connection reset by peer
Get "http://10.0.0.14:10249/metrics": dial tcp 10.0.0.14:10249: connect: connection refused
Get "http://10.0.0.18:10251/metrics": dial tcp 10.0.0.18:10251: connect: connection refused
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Still experiencing this issue....
Also experiencing this
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Still experiencing this issue...
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Still experiencing
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Still experiencing
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Still experiencing
I had the same error. If your endpoint is not accessible via http but is accessible via https, try this: https://github.com/prometheus-community/helm-charts/issues/204#issuecomment-765155883. That fixed it for me.
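For context, if I remember the linked workaround correctly, it boils down to telling the chart to scrape those control-plane components over https, roughly like this in the chart values (treat it as a sketch; exact keys can differ between chart versions):
kubeControllerManager:
  serviceMonitor:
    https: true
    insecureSkipVerify: true
kubeScheduler:
  serviceMonitor:
    https: true
    insecureSkipVerify: true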
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
This issue is being automatically closed due to inactivity.
I resolved this issue for myself today, and coincidentally saw that it was closed by the bot yesterday, so I decided to post. My cluster is a standard kubeadm deployment on two Docker machines (Linux hosts of varying distributions).
The trick is to make sure that the four services showing up in TargetDown are actually bound and listening on addresses where Prometheus can collect their metrics. As these components have evolved across Kubernetes versions, the defaults have changed and they are no longer configured for metrics collection out of the box, if they ever were intended to be. This is certainly not the default with kubeadm, and from the research I did, etcd for example requires changes to the manifests that run on the "managed" side of the cluster, changes that may not be obvious or straightforward to apply on managed Kubernetes offerings that don't expose how etcd or the controller manager runs.
That said, mine was a kubeadm cluster with default configuration, and I got these four TargetDown alerts with the default kube-prometheus-stack chart configuration too.
I went into the cluster masters (OK, only one master; it's a home lab cluster) and edited /etc/kubernetes/manifests/etcd.yaml, which is where kubeadm deploys its static etcd pod from, and found this block:
- --listen-client-urls=https://127.0.0.1:2379,https://10.17.12.146:2379
- --listen-metrics-urls=http://127.0.0.1:2381
Looks like I should change it to this:
- --listen-client-urls=https://127.0.0.1:2379,https://10.17.12.146:2379
- --listen-metrics-urls=http://127.0.0.1:2381,http://10.17.12.146:2381
(I'm revealing internal details of my network; the Kubernetes master node is 10.17.12.146.)
After making this change, I checked the service that was installed by kube-prometheus-stack operator:
kubectl -n kube-system edit svc kube-prometheus-stack-kube-etcd
spec:
  ports:
  - name: http-metrics
    port: 2379
    protocol: TCP
    targetPort: 2379
Knowing full well that metrics don't run on 2379, since we just set the metrics listener to port 2381, I thought about changing this to match. But I realized this service was created by the kube-prometheus-stack operator, so I probably needed to reconfigure something in my Helm chart instead. I added this:
commit 32e56b0d8a455db587f3f7b3e867f2bee4b7198c (HEAD -> monitoring, origin/monitoring)
Author: Kingdon Barrett <kingdon@weave.works>
Date: Mon Jan 17 13:50:52 2022 -0500
monitor kubeEtcd port where metrics are listening
diff --git a/manifests/monitoring/kube-prometheus-stack/release.yaml b/manifests/monitoring/kube-prometheus-stack/release.yaml
index b17652a..62b98a5 100644
--- a/manifests/monitoring/kube-prometheus-stack/release.yaml
+++ b/manifests/monitoring/kube-prometheus-stack/release.yaml
@@ -115,6 +115,10 @@ spec:
podMonitorSelector:
matchLabels:
app.kubernetes.io/part-of: flux
+ kubeEtcd:
+ service:
+ port: 2381
+ targetPort: 2381
postRenderers:
- kustomize:
Now TargetDown for my etcd service stops alerting. I know I'm on the right track.
Now I realize my memory is bad and I'm telling the story out of order... Editing the kube-proxy configmap in the kube-system namespace, I set metricsBindAddress from '' to 0.0.0.0, and another TargetDown alert is put to bed (roughly as sketched below).
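For reference, that kube-proxy edit looks roughly like this (a sketch: the configmap carries a KubeProxyConfiguration under its config.conf key, and the kube-proxy pods need a restart to pick the change up):
kubectl -n kube-system edit configmap kube-proxy
# inside data."config.conf" (KubeProxyConfiguration):
#   metricsBindAddress: ""         <- before
#   metricsBindAddress: 0.0.0.0    <- after (metrics served on 0.0.0.0:10249)
# then restart the kube-proxy daemonset so it reloads the config
kubectl -n kube-system rollout restart daemonset kube-proxy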
The remaining services are all deployed from static pods in /etc/kubernetes/manifests/. I edit kube-controller-manager.yaml and kube-scheduler.yaml to reset both of their bind addresses from 127.0.0.1 to 0.0.0.0:
- --bind-address=0.0.0.0
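For context, that flag lives in the container command list of each static pod manifest; after the edit, the relevant part of kube-controller-manager.yaml looks roughly like this (excerpt only, surrounding flags trimmed; the kubelet normally picks the change up on its own, though as described below I restarted it to be sure):
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --bind-address=0.0.0.0
    # ...other flags unchanged...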
I read some issue thread where the kubeadm maintainers stated these were not reasonable defaults, since kubeadm users who care about these metrics are in the minority, and you should not bind ports that outsiders could potentially reach unless you really need them.
(so_anyway_i_started_blasting.jpg)
So I set the bind address to 0.0.0.0, since setting my internal address from earlier didn't seem to have any effect, and restarted the kubelet with sudo systemctl restart kubelet. Finally, all four of my TargetDown alerts are winding down, and the default kube-prometheus-stack can function normally without any silences, other than the default Watchdog silence.
Hope this story helps someone else. I'm afraid I'm not sure of a good way to set these configurations at kubeadm init time, and I'll have to update my knowledge of kubeadm next time I tear down and rebuild my cluster (these configurations should be possible without making adjustments at runtime, after the cluster has already been provisioned by kubeadm); a sketch of how that might look follows.
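If I'm reading the kubeadm docs right, something along these lines passed to kubeadm init --config should set the same things up front; I haven't tested it on a fresh cluster yet, so treat it as a sketch and check the API versions against your kubeadm release:
# cluster-config.yaml, used as: kubeadm init --config cluster-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
etcd:
  local:
    extraArgs:
      listen-metrics-urls: "http://0.0.0.0:2381"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
metricsBindAddress: "0.0.0.0"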
Could anyone solve this issue on Docker Desktop for Mac?
I know the issue is closed & stale, but the answers contained in here are a bit fractured and not easy to piece together, so for those finding this later... working solution as of Feb 2024...
https://gist.github.com/SpoddyCoder/ff0ea39260b0d4acdb8b482532d4c1af
Based on my limited knowledge, the walkthrough above is the correct way to solve this problem. Thanks.
Also asked as a question on Stack Overflow.
What happened? The myrelease-name-prometheus-node-exporter service is failing with errors from the daemonset after the kube-prometheus-stack helm chart is installed on a Docker Desktop for Mac Kubernetes cluster.
The scrape targets for kube-scheduler (http://192.168.65.4:10251/metrics), kube-proxy (http://192.168.65.4:10249/metrics), kube-etcd (http://192.168.65.4:2379/metrics), kube-controller-manager (http://192.168.65.4:10252/metrics) and node-exporter (http://192.168.65.4:9100/metrics) are marked as unhealthy. All show connection refused, except for kube-etcd, which displays connection reset by peer.
I have installed kube-prometheus-stack as a dependency in my helm chart on a local Docker for Mac Kubernetes cluster v1.19.7. I have also tried this on a minikube cluster using the hyperkit vm-driver, with the same result.
Chart.yaml
Values.yaml
Did you expect to see something different? All Kubernetes services start successfully and all scrape targets are marked as healthy.
How to reproduce it (as minimally and precisely as possible): On a Docker Desktop for Mac environment, install the kube-prometheus-stack helm chart v14.4.0, inspect the status of the aforementioned failed scrape targets, and view the logs for the myrelease-name-prometheus-node-exporter service pod(s).
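(For anyone reproducing from scratch, the standalone install amounts to roughly the following; the release name is illustrative, and in my case the chart is pulled in as a dependency of my own chart instead.)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install myrelease-name prometheus-community/kube-prometheus-stack --version 14.4.0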
Environment:
Mac OS Catalina 10.15.7
Docker Desktop for Mac 3.2.2 (61853) with docker engine v20.10.5
Local Kubernetes 1.19.7 cluster provided by Docker Desktop for Mac
Prometheus Operator version: kube-prometheus-stack 14.4.0 (https://hub.kubeapps.com/charts/prometheus-community/kube-prometheus-stack#!)
Kubernetes version information (kubectl version): Kubernetes cluster provided with Docker Desktop for Mac
release-name-prometheus-node-exporter error log
Anything else we need to know?:
kubectl get all
After updating values.yaml to:
The prometheus-node-exporter daemonset now starts, based on the earlier issue fix. However, the scrape targets mentioned above still remain unhealthy, with a Get "http://192.168.65.4:<port_num>/metrics": dial tcp 192.168.65.4:<port_num>: connect: connection refused error.
Tried further investigation of kube-scheduler by setting up a port-forward and visiting http://localhost:10251/metrics. Log output from the pod is shown below:
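(The port-forward itself was along these lines; the scheduler pod name is whatever kubectl reports for your cluster, and the one shown here is only illustrative for Docker Desktop.)
# find the scheduler pod
kubectl -n kube-system get pods -l component=kube-scheduler
# forward its metrics port locally and fetch the metrics (pod name illustrative)
kubectl -n kube-system port-forward kube-scheduler-docker-desktop 10251:10251
curl http://localhost:10251/metrics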
If I run in minikube on macOS with vm-driver hyperkit, then the node-exporter daemonset is successful and with:
Also, the kube-proxy scrape target appears to be available in minikube, which I have verified using a port forward (kubectl -n kube-system port-forward kube-proxy-sxw8k 10249) and visiting http://localhost:10249/metrics. This also appears to work with a port forward in the docker-desktop cluster, but it appears as failing in the Prometheus targets...
So in minikube on macOS with the hyperkit vm-driver and default helm chart values, the following scrape targets are unavailable:
etcd-minikube logs
kube-controller-manager-minikube logs:
kube-scheduler-minikube logs
Questions:
Are these scrape targets dependent upon a successful prometheus-node-exporter.hostRootFsMount?
How do I enable the etcd, scheduler, controller-manager and kube-proxy (docker-desktop) scrape targets with a helm chart installation of kube-prometheus-stack on a macOS Kubernetes cluster running on Docker Desktop and minikube?
Can such a fix be made to work out of the box when installing via the helm chart?