Open jwping opened 1 year ago
I have the same issue and can see the following in the logs:
[INFO] 10.0.0.72:34141 - 11273 "A IN kube-dns.kube-system.svc.cluster.local.monitoring.svc.cluster.local. udp 85 false 512" NXDOMAIN qr,aa,rd 178 0.000441375s
[INFO] 10.0.0.72:34141 - 11776 "AAAA IN kube-dns.kube-system.svc.cluster.local.monitoring.svc.cluster.local. udp 85 false 512" NXDOMAIN qr,aa,rd 178 0.000412507s
[INFO] 10.0.0.72:51078 - 27467 "AAAA IN kube-dns.kube-system.svc.cluster.local.cluster.local. udp 70 false 512" NXDOMAIN qr,aa,rd 163 0.000208542s
[INFO] 10.0.0.72:51078 - 26859 "A IN kube-dns.kube-system.svc.cluster.local.cluster.local. udp 70 false 512" NXDOMAIN qr,aa,rd 163 0.000176965s
[INFO] 10.0.1.120:60184 - 63770 "A IN loki.monitoring.svc.cluster.local.monitoring.svc.cluster.local. udp 80 false 512" NXDOMAIN qr,aa,rd 173 0.000274916s
[INFO] 10.0.1.120:50301 - 19685 "AAAA IN loki.monitoring.svc.cluster.local.svc.cluster.local. udp 69 false 512" NXDOMAIN qr,aa,rd 162 0.000092976s
[INFO] 10.0.1.120:59617 - 17088 "A IN loki.monitoring.svc.cluster.local.cluster.local. udp 65 false 512" NXDOMAIN qr,aa,rd 158 0.000166104s
[INFO] 10.0.1.120:60339 - 24553 "AAAA IN loki.monitoring.svc.cluster.local.damn.li. udp 59 false 512" NOERROR qr,rd,ra 143 0.000685193s
[INFO] 10.0.1.120:56636 - 10429 "AAAA IN loki.monitoring.svc.cluster.local. udp 51 false 512" NXDOMAIN qr,aa,rd 144 0.000104261s
For some reason the gateway is requesting a domain name that is far too long.
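The long names in these logs match how a stub resolver walks the search list in /etc/resolv.conf before trying the name as written. Here is a minimal sketch of that expansion, assuming glibc-style behaviour, ndots:5 (the usual value in Kubernetes pods), and a search list inferred from the log lines above. `candidate_queries` is a hypothetical helper for illustration, not part of Loki or nginx:

```python
# Simplified model of resolv.conf search-list expansion (an assumption,
# real resolvers have more edge cases).
SEARCH = [  # inferred from the CoreDNS log lines, not read from a real pod
    "monitoring.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "damn.li",
]


def candidate_queries(name, search, ndots):
    """Return the fully qualified names a resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list is applied.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as written first, then the search list.
        return [absolute] + expanded
    # Too few dots: every search domain is tried before the name itself.
    return expanded + [absolute]


for query in candidate_queries("loki.monitoring.svc.cluster.local", SEARCH, 5):
    print(query)
```

With four dots in the name and ndots:5, the search domains are tried first, which reproduces the order of the loki.monitoring.* queries in the log. Passing the name with a trailing dot returns a single query.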
@darox @jwping You have to check that Loki is configured with the right DNS setting.
Query the name of your kube-dns service:
kubectl get svc --namespace=kube-system -l k8s-app=kube-dns -o jsonpath='{.items..metadata.name}'
Then adjust your Helm values with the result. In my case the DNS service is not kube-dns but "rke2-coredns-rke2-coredns", so I use
global:
  dnsService: "rke2-coredns-rke2-coredns"
and it works fine: the pod starts and no longer complains.
Could you try this again? I normally develop against a k3d cluster, but in testing against a kind cluster to debug some CI failures (since that's what our CI uses), I noticed some differences in the ndots value present in the /etc/resolv.conf in the containers on the kind cluster. As a result I needed to add an extra dot to the end of the resolver DNS record. That change should be in 3.3.0. Can you please try that version and let me know if this is still an issue?
In my case it's: kube-dns
same error
root@node52:~# kubectl -n loki logs -f loki-gateway-774ff559b9-2w4dq
/docker-entrypoint.sh: No files found in /docker-entrypoint.d/, skipping configuration
2023/01/05 08:41:13 [emerg] 1#1: host not found in resolver "kube-dns.kube-system.svc.cluster.local." in /etc/nginx/nginx.conf:27
nginx: [emerg] host not found in resolver "kube-dns.kube-system.svc.cluster.local." in /etc/nginx/nginx.conf:27
and the DNS service:
root@node52:~# kubectl get svc --namespace=kube-system -l k8s-app=kube-dns -o jsonpath='{.items..metadata.name}'
coredns
and resolved by
global:
  dnsService: "coredns"
I suspect this is related to the ndots configuration in the /etc/resolv.conf. May we see the resolver configuration please?
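For anyone collecting that information, here is a rough sketch of pulling the relevant fields out of a resolv.conf file. The `SAMPLE` content is made up for illustration (it is not taken from any cluster in this thread) and `parse_resolv_conf` is a hypothetical helper. In a pod you would feed it the output of `cat /etc/resolv.conf`:

```python
# Illustrative resolv.conf content; values are assumptions, not real output.
SAMPLE = """\
search monitoring.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
"""


def parse_resolv_conf(text):
    """Extract nameservers, search domains and ndots from resolv.conf text."""
    # ndots defaults to 1 when no "options ndots:N" entry is present.
    config = {"nameservers": [], "search": [], "ndots": 1}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword == "search":
            config["search"] = values  # a later search line replaces earlier ones
        elif keyword == "options":
            for option in values:
                if option.startswith("ndots:"):
                    config["ndots"] = int(option.split(":", 1)[1])
    return config


print(parse_resolv_conf(SAMPLE))
```

The `search` list and the `ndots` value are the two fields that decide whether a name without a trailing dot gets search domains appended.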
The solution from seb-835 https://github.com/grafana/loki/issues/7287#issuecomment-1282339134 works for me.
In my case the cluster DNS was not resolving the cluster.local domain at all; the solution was to also set clusterDomain. The installation was a k3s cluster provisioned via Rancher 2.7.6 with the Cluster Domain explicitly set.
Kubernetes Version: v1.25.13+k3s1
Helm Chart:
global:
  dnsService: "kube-dns"
  dnsNamespace: "kube-system"
  clusterDomain: "mysubdomain.mydomain.it"
Would it be possible to add an option in the Helm chart to set the IP of the DNS service instead of the FQDN?
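Judging from the nginx error messages in this thread, the three values appear to be joined into the resolver hostname as service, namespace, "svc", cluster domain, plus a trailing dot. A small sketch of that composition follows. `resolver_fqdn` and its default arguments are assumptions drawn from the error text, not code from the chart:

```python
def resolver_fqdn(dns_service="kube-dns",
                  dns_namespace="kube-system",
                  cluster_domain="cluster.local"):
    """Build the absolute DNS name that the gateway's resolver line would use.

    Defaults mirror the name seen in the error messages (an assumption).
    """
    # The trailing dot makes the name absolute, so no search domain is appended.
    return f"{dns_service}.{dns_namespace}.svc.{cluster_domain}."


# Name seen in the error messages when nothing is overridden.
print(resolver_fqdn())
# Name that results from the values in the comment above.
print(resolver_fqdn(cluster_domain="mysubdomain.mydomain.it"))
```

If the printed name does not match a Service that actually exists in the cluster, nginx has nothing to resolve, which is consistent with the fixes reported here: changing dnsService or clusterDomain until the name is correct.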
kubectl get svc --namespace=kube-system -l k8s-app=kube-dns -o jsonpath='{.items..metadata.name}'
kube-dns
/docker-entrypoint.sh: No files found in /docker-entrypoint.d/, skipping configuration
2023/12/21 22:00:31 [emerg] 1#1: host not found in resolver "kube-dns.kube-system.svc.cluster.local." in /etc/nginx/nginx.conf:33
nginx: [emerg] host not found in resolver "kube-dns.kube-system.svc.cluster.local." in /etc/nginx/nginx.conf:33
Encountered the same error when switching to Talos
From a random container:
ping kube-dns.kube-system.svc.cluster.local.
PING kube-dns.kube-system.svc.cluster.local. (10.96.0.10): 56 data bytes
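The same check can be scripted instead of using ping. This is a minimal sketch using Python's standard `socket.getaddrinfo`. `can_resolve` is a hypothetical helper, and the `lookup` parameter exists only so the function can be exercised without a cluster:

```python
import socket


def can_resolve(hostname, lookup=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address."""
    try:
        # A port of None asks only for address resolution.
        return len(lookup(hostname, None)) > 0
    except socket.gaierror:
        # Raised when the resolver cannot find the name.
        return False


if __name__ == "__main__":
    name = "kube-dns.kube-system.svc.cluster.local."
    print(name, "resolves" if can_resolve(name) else "does not resolve")
```

Run from a pod, this only shows whether the pod's own resolver can find the name. It does not by itself explain why nginx in the gateway container fails on the same name.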
@batazor I got the same error when running Loki with the gateway on a Talos cluster. Have you found a solution?
IMHO may be related to https://github.com/grafana/loki/issues/11650
Same issue here. We have two GKE clusters: one uses kube-dns (Loki works without any adjustments) and the other uses Cloud DNS (VPC scope) with a specific domain suffix.
As mentioned above, we changed global.clusterDomain to that domain suffix and it works.
Getting the same error
The following config solved my problem:
loki:
  global:
    dnsService: coredns
After installing Loki in simple scalable mode with Helm, the gateway log reports the following error:
But kube-dns does exist in my cluster.