Hey @rltvty, thanks for opening this issue.
The first thing that comes to mind is that the kube-dns deployment exposes several ports and /metrics is not available on 10055.
Now, in our discovery we pick the highest port value: https://github.com/DataDog/datadog-agent/blob/master/pkg/collector/autodiscovery/configresolver.go#L320-L341. This can be configured in the template (with %%port_0%% or whichever port you want the agent to listen to), as documented here: https://docs.datadoghq.com/agent/autodiscovery/#template-variable-indexes. The template we use for autodiscovery ships in the auto_conf folder, but you can supersede it with annotations on the kube-dns deployment so the agent knows which port to listen to.
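To illustrate, here is a minimal sketch of such an override via kubectl patch. It assumes the kube-dns container in the pod template is named kubedns and that the metrics port is the first one declared (hence %%port_0%%); adjust both to match your deployment.

# Add Autodiscovery annotations to the kube-dns pod template so they take
# precedence over the auto_conf template (container name "kubedns" and the
# port index are assumptions).
kubectl -n kube-system patch deployment kube-dns -p '{
  "spec": {"template": {"metadata": {"annotations": {
    "ad.datadoghq.com/kubedns.check_names": "[\"kube_dns\"]",
    "ad.datadoghq.com/kubedns.init_configs": "[{}]",
    "ad.datadoghq.com/kubedns.instances": "[{\"prometheus_endpoint\": \"http://%%host%%:%%port_0%%/metrics\"}]"
  }}}}
}'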
Looking at the available manifests, it seems that 10055 is the highest port and the one exposing the metrics, so the lead above might not be correct.
If you confirm that there is only one port and that you can curl 10.101.56.17:10055/metrics or 10.101.56.13:10055/metrics from inside the agent's pod, then please send over a flare to our support team so we can better investigate.
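For example, a rough sketch of that check (the default namespace and the app=datadog-agent label selector are assumptions, and it relies on curl being present in the agent image):

# Pick one agent pod and curl the kube-dns pod IP from inside it.
AGENT_POD=$(kubectl -n default get pods -l app=datadog-agent \
  -o jsonpath='{.items[0].metadata.name}')
kubectl -n default exec "$AGENT_POD" -- curl -sS http://10.101.56.17:10055/metrics | head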
Secondly, I can see that you are running Kubernetes 1.6.8. Unfortunately, Agent 6 only works with Kubernetes 1.7.6+, as we rely on the kubelet's /metrics endpoint to collect the Kubernetes metrics, and that endpoint was first introduced in 1.7.6.
As a result, the kubelet integration will not work, even if the agent is provided with the kubelet IP via the downward API as an env var.
On a side note, thank you for testing the RC versions of our agent! If you can share more feedback on the issues you had with 6.0, or send us the logs, that would be fantastic too.
Adding to this, as I am running into the same issue with AKS. After some digging and work with Datadog support: AKS has kube-dns running under port 10053, and annotations don't seem to be working to get the agent's kube_dns config updated. Currently there is no fix.
I believe that I have run into the same problem as @rltvty, also on AKS, but with Kubernetes v1.10.6.
$ kubectl get pod -l 'k8s-app=kube-dns' -n kube-system -o custom-columns=port:.spec.containers[*].ports
port
[map[protocol:UDP containerPort:10053 name:dns-local] map[containerPort:10053 name:dns-tcp-local protocol:TCP]],[map[containerPort:53 name:dns protocol:UDP] map[name:dns-tcp protocol:TCP containerPort:53]],[map[containerPort:8080 protocol:TCP]]
[map[containerPort:10053 name:dns-local protocol:UDP] map[containerPort:10053 name:dns-tcp-local protocol:TCP]],[map[containerPort:53 name:dns protocol:UDP] map[name:dns-tcp protocol:TCP containerPort:53]],[map[containerPort:8080 protocol:TCP]]
Azure AKS' stock kube-dns doesn't resemble https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/deployments/kube-dns.yaml. Container port 10053 is the highest numbered listening port, and it's handling DNS requests. Container port 8080 is the HTTP /healthz endpoint. I believe there is no /metrics endpoint on this kube-dns.
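If anyone wants to double-check that last point, a rough sketch (run from any pod in the cluster that has curl; <kube-dns-pod-ip> is a placeholder for one of the kube-dns pod IPs):

# Print only the HTTP status code for each path on the 8080 container port.
curl -sS -o /dev/null -w '%{http_code}\n' http://<kube-dns-pod-ip>:8080/healthz
curl -sS -o /dev/null -w '%{http_code}\n' http://<kube-dns-pod-ip>:8080/metrics
# A 404 on /metrics would back up the "no /metrics endpoint" observation.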
describe-deployment-kube-dns-v20-aks.txt
describe-pod-kube-dns-v20-aks.txt
I wonder whether the Kubernetes server version matters here, compared to the revision of deployment.apps/kube-dns-v20, or whether Azure has tweaked something different from what datadog-agent expects.
Output of the info page (if this is a bug)
Describe what happened:
An error is getting logged when running the kube_dns check. We have about 80 nodes in our cluster, and each pod in the daemonset is logging it about 10 times a minute. (The error seems to pop up twice on each attempt.)
Describe what you expected:
Check to work without issue.
Steps to reproduce the issue:
Additional environment details (Operating System, Cloud provider, etc): Running Kubernetes 1.6.8 on AWS EC2 CoreOS 1576.5.0 instances, Docker version 17.09.0-ce, build afdb6d4, using Agent v6.1.0-rc.2
The v6.1.0 release candidates are the first Agent 6 builds that run without constantly crashing on our cluster. Though I did need to disable log collection, and I never tried that on any of the 6.0 or 6.0 beta releases.