Have you checked the Prometheus and prometheus-operator logs?
It's possible that the DNS servers are not serving metrics, either because the `prometheus` plugin is not enabled in the Corefile or because the pods don't have the port open. Check with `kubectl port-forward` against the DNS service and curl the metrics endpoint (not tested).
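Concretely, something like this (again untested; 9153 is the `prometheus` plugin's default port, and the Deployment name can differ between distributions):

```
# Forward straight to a CoreDNS pod so the check does not depend on
# the Service exposing the metrics port.
kubectl -n kube-system port-forward deploy/coredns 9153:9153

# In a second shell:
curl localhost:9153/metrics
```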
I can see the plugin is enabled, and the pods are serving metrics. However, the default Kubernetes `kube-dns` Service is not exposing the metrics port.
```
.:53 {
    errors
    ready
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
    import custom/*.override
}
```
```
$ kdesc svc -n kube-system kube-dns   # kdesc = kubectl describe
Name:               kube-dns
Namespace:          kube-system
Labels:             addonmanager.kubernetes.io/mode=Reconcile
                    k8s-app=kube-dns
                    kubernetes.io/cluster-service=true
                    kubernetes.io/name=CoreDNS
Annotations:        <none>
Selector:           k8s-app=kube-dns
Type:               ClusterIP
IP Family Policy:   SingleStack
IP Families:        IPv4
IP:                 10.0.0.10
IPs:                10.0.0.10
Port:               dns  53/UDP
TargetPort:         53/UDP
Endpoints:          10.244.6.5:53,10.244.7.37:53
Port:               dns-tcp  53/TCP
TargetPort:         53/TCP
Endpoints:          10.244.6.5:53,10.244.7.37:53
Session Affinity:   None
Events:             <none>
```
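Since only the two DNS ports are exposed, one option is to add the metrics port to the Service. A sketch, untested; also note the `addonmanager.kubernetes.io/mode=Reconcile` label, which means an addon manager may revert manual edits, so the change really belongs in whatever manifests manage this Service:

```
kubectl -n kube-system patch svc kube-dns --type=json -p='[
  {"op": "add", "path": "/spec/ports/-",
   "value": {"name": "metrics", "port": 9153, "protocol": "TCP", "targetPort": 9153}}
]'
```

The port name (`metrics` here) must match whatever the ServiceMonitor's `endpoints[].port` refers to.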
curl against a CoreDNS pod works:
```
/ # curl 10.244.6.5:9153/metrics
# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.17",revision="a9adfd56",version="1.8.7"} 1
# HELP coredns_cache_entries The number of elements in the cache.
# TYPE coredns_cache_entries gauge
coredns_cache_entries{server="dns://:53",type="denial"} 339
coredns_cache_entries{server="dns://:53",type="success"} 63
...
```
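This fits the symptom: the pods are fine, but a prometheus-operator ServiceMonitor discovers targets through a named port on the Service, so with no 9153 port on `kube-dns` there is nothing to scrape. The monitor looks roughly like this (all names here are assumptions; compare against the manifest your chart actually renders):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: coredns          # hypothetical name
  namespace: monitoring  # hypothetical namespace
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns
  namespaceSelector:
    matchNames:
      - kube-system
  endpoints:
    - port: metrics      # must match a named port on the Service
      interval: 30s
```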
This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.
This issue was closed because it has not had any activity in the last 120 days. Please reopen if you feel this is still valid.
Hi @danielSundsvallSCIT,
sorry to revive this after so long, but did you ever figure out the issue? I'm seeing the same `context deadline exceeded`,
yet ping/curl/port-forwarding all work.
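For anyone comparing notes, this is how I'm pulling the exact scrape error out of Prometheus itself (the `prometheus-k8s` Service name and the job label pattern are from kube-prometheus and may differ in your setup):

```
kubectl -n monitoring port-forward svc/prometheus-k8s 9090:9090 &
curl -s localhost:9090/api/v1/targets |
  jq '.data.activeTargets[]
      | select(.labels.job | test("coredns|kube-dns"))
      | {scrapeUrl, health, lastError}'
```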
I have been struggling with this issue for a while now, and I need some guidance or tips on how to proceed. I've managed to scrape everything except the CoreDNS pods.
All the pods are running.
My CoreDNS ServiceMonitor and kube-dns svc:
I don't see anything in the pod logs referring to this issue, and I even tried setting up a new ServiceMonitor, with the same result.
I think it may be a network issue inside the cluster, but I want to double-check in case anyone can see something I don't, and maybe get pointed in the right direction.
We are using Cisco ACI as the CNI plugin.
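One thing worth knowing when reading the checks above: `kubectl port-forward` tunnels through the API server and kubelet, so it bypasses the CNI data path, and a working port-forward doesn't rule the CNI out. A sketch of testing the real Prometheus-to-CoreDNS path (pod name and namespace are assumptions for a kube-prometheus install; the pod IP is the one from the describe output above, so substitute your own):

```
# The official Prometheus image is busybox-based, so use its wget.
kubectl -n monitoring exec prometheus-k8s-0 -c prometheus -- \
  wget -qO- -T 5 http://10.244.6.5:9153/metrics | head
```

If that times out while the same request works from other pods, traffic between the monitoring namespace and kube-system is being dropped in the fabric, e.g. by an ACI contract or a NetworkPolicy.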