jsemohub opened this issue 1 year ago
More output from the coredns pod
```
[ERROR] plugin/errors: 2 acme-v02.api.letsencrypt.org. AAAA: read udp 10.1.81.154:41110->8.8.4.4:53: read: no route to host
[INFO] 10.1.81.153:39653 - 42585 "A IN acme-v02.api.letsencrypt.org. udp 57 false 1232" - - 0 2.001233069s
[ERROR] plugin/errors: 2 acme-v02.api.letsencrypt.org. A: read udp 10.1.81.154:35430->8.8.4.4:53: i/o timeout
[INFO] 10.1.81.153:46040 - 52173 "AAAA IN acme-v02.api.letsencrypt.org. udp 57 false 1232" - - 0 2.000694382s
[ERROR] plugin/errors: 2 acme-v02.api.letsencrypt.org. AAAA: read udp 10.1.81.154:47142->8.8.4.4:53: i/o timeout
[INFO] 10.1.81.153:35909 - 54411 "A IN acme-v02.api.letsencrypt.org. udp 57 false 1232" - - 0 2.000906637s
[ERROR] plugin/errors: 2 acme-v02.api.letsencrypt.org. A: read udp 10.1.81.154:60004->8.8.4.4:53: i/o timeout
```
Meanwhile, I can easily look up addresses on the server itself using Google DNS:
```
nslookup google.com 8.8.4.4
Server:  8.8.4.4
Address: 8.8.4.4#53

Non-authoritative answer:
Name:    google.com
Address: 142.250.72.110
Name:    google.com
Address: 2607:f8b0:4006:816::200e
```
Opening port 53/udp brings us to the next issue:
```
[INFO] 10.1.81.153:46654 - 19085 "A IN acme-v02.api.letsencrypt.org. udp 57 false 1232" - - 0 0.000382567s
[ERROR] plugin/errors: 2 acme-v02.api.letsencrypt.org. A: read udp 10.1.81.154:58691->8.8.8.8:53: read: no route to host
```
Hi @jsemohub, here is a suggestion: let's try to get CoreDNS to use the host to forward requests. We switched to this approach in v1.26, so one option is to set up your cluster with v1.26 (`snap install microk8s --classic --channel=1.26`). To do this in a pre-1.26 cluster:

1. Edit `/var/snap/microk8s/current/args/kubelet` and add the argument `--resolv-conf=/run/systemd/resolve/resolv.conf` (`/run/systemd/resolve/resolv.conf` should be on the host).
2. Restart MicroK8s (`microk8s stop; microk8s start;` on all nodes).
3. Edit the CoreDNS config map (`microk8s.kubectl edit ConfigMap coredns -n kube-system`) and set the upstream DNS to `forward . /etc/resolv.conf`.
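The steps above can be sketched as shell commands; the `sed` pipeline at the end is my own non-interactive illustration of the config-map change, not part of the official procedure:

```shell
# Sketch of the pre-1.26 procedure (assumes a systemd-resolved host):
#
#   echo '--resolv-conf=/run/systemd/resolve/resolv.conf' | \
#     sudo tee -a /var/snap/microk8s/current/args/kubelet
#   microk8s stop; microk8s start        # repeat on all nodes
#   microk8s.kubectl edit ConfigMap coredns -n kube-system
#
# Inside the config map, the Corefile upstream line changes like this
# (demonstrated with sed purely for illustration):
printf 'forward . 8.8.8.8 8.8.4.4\n' \
  | sed 's#forward \. .*#forward . /etc/resolv.conf#'
```

The pipeline prints `forward . /etc/resolv.conf`, which is the upstream line CoreDNS should end up with.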
This process (with some extra failsafe logic) is in the latest dns enable script found at https://github.com/canonical/microk8s-core-addons/blob/main/addons/dns/enable
I am using 1.26 channel BTW. Updated and restarted.
Still no connection.
```
[root@node1 k8s]# k exec -it pod/nginx-ingress-microk8s-controller-jnwph -n ingress -- bash
bash-5.1$ cat /etc/resolv.conf
search ingress.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.152.183.10
options ndots:5
bash-5.1$ nslookup google.com
nslookup: read: Host is unreachable
nslookup: read: Host is unreachable
nslookup: read: Host is unreachable
^C
bash-5.1$ nslookup google.com 8.8.8.8
nslookup: write to '8.8.8.8': Host is unreachable
;; connection timed out; no servers could be reached
```
I think there is no internet access to outside IPs, since the pod can't even reach Google's public DNS at 8.8.8.8. Is there a quick way to troubleshoot this or add a Calico static route?
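For a first pass at the routing question, a few read-only commands on the affected node usually narrow things down; the interface names below are assumptions about a default Calico install, not taken from this thread:

```shell
# Which route the node itself would use to reach the public resolver:
ip route get 8.8.8.8 || true
# Calico's interfaces (look for vxlan.calico and the cali* veths):
ip -br addr show || true
# Pod-network routes Calico programmed for the 10.1.0.0/16 range:
ip route | grep -E '^10\.1\.' || true
```

If the last command prints nothing, Calico never programmed the pod-network routes, which matches the symptoms above.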
Two independent problems seem to be contributing to the issue:
Any help with this would be much appreciated.
Routing issues persist on machines with 3 NICs. On these boxes, even after a fresh MicroK8s install, the Calico setup does not seem to be working, and consequently neither does DNS.
The mystery has been solved. For some reason, the 10.1.0.0/16 firewall zone was NOT created by Calico automatically on some servers. After manually adding a microk8s-cluster zone, I am able to proceed with creating the HA cluster. Keeping fingers crossed.
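A manual recreation of the missing zone might look like this; firewalld is an assumption based on the `[root@node1]` prompt earlier in the thread, and the zone name is simply the one mentioned above:

```shell
# Hypothetical firewalld commands to recreate the missing pod-network zone
# (run as root on each affected node):
#
#   firewall-cmd --permanent --new-zone=microk8s-cluster
#   firewall-cmd --permanent --zone=microk8s-cluster --add-source=10.1.0.0/16
#   firewall-cmd --permanent --zone=microk8s-cluster --set-target=ACCEPT
#   firewall-cmd --reload
#
# Sanity check that the pod IPs seen in the logs fall inside that source range:
python3 -c "import ipaddress; print(ipaddress.ip_address('10.1.81.154') in ipaddress.ip_network('10.1.0.0/16'))"
```

The check prints `True`, confirming the zone's source range covers the pod addresses from the CoreDNS logs.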
Got quite a bit further. Got stuck on cert-manager.
```
ping cert-manager-webhook.cert-manager.svc
PING cert-manager-webhook.cert-manager.svc (10.152.183.186): 56 data bytes
```
Will open another ticket...
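One note before that ticket: a ClusterIP is virtual, and kube-proxy only forwards the ports declared on the Service, so an ICMP ping to 10.152.183.186 proves little either way. A TCP probe of the webhook's serving port is more informative; the image and port here are common defaults, not details taken from this thread:

```shell
# Probe the webhook's TLS port from a throwaway pod; even a certificate
# error in the output proves L3/L4 reachability to the service:
kubectl run -it --rm dns-probe --image=curlimages/curl --restart=Never -- \
  curl -vk --max-time 5 https://cert-manager-webhook.cert-manager.svc:443/
```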
This process (with some extra failsafe logic) is in the latest dns enable script found at https://github.com/canonical/microk8s-core-addons/blob/main/addons/dns/enable
The find-resolv-conf.py at the heart of that has one major issue with IPv6: to work correctly with IPv6 addresses that carry scope IDs it needs Python > 3.8, which, sadly, microk8s 1.27/stable and 1.28/stable (afaik) do not have. This means that on a system here where /run/systemd/resolve/resolv.conf contains the following
```
nameserver 192.168.121.1
nameserver fe80::5054:ff:fe00:b61d%2
```
find-resolv-conf.py judges these as unsafe, not because there is a loopback address, but because ipaddress.IPv6Address('fe80::5054:ff:fe00:b61d%2') explodes hideously on Python 3.8 with an AddressValueError.
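The failure is easy to reproduce in one line, and stripping the zone ID before parsing is a version-independent workaround (my own sketch, not what the addon script does):

```shell
addr='fe80::5054:ff:fe00:b61d%2'
# The ipaddress module only learned scoped ('%2') addresses in Python 3.9;
# splitting the zone id off first parses on any version and still answers
# the question the script cares about, i.e. whether the nameserver is a
# loopback address:
python3 -c "import ipaddress, sys; print(ipaddress.ip_address(sys.argv[1].split('%')[0]).is_loopback)" "$addr"
```

This prints `False` for the link-local nameserver above, so the resolv.conf would be judged safe instead of being rejected.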
Summary
After enabling the dns addon I log into one of the pods and run
nslookup google
The command fails.
What Should Happen Instead?
It should return the set of Google IPs.
Reproduction Steps
nslookup google.com
Introspection Report
Here are session outputs
DNS fails to look up the IP:
```
k exec -it nginx-ingress-microk8s-controller-m8ddh -n ingress -- bash
bash-5.1$ nslookup google.com
Server:  10.152.183.10
Address: 10.152.183.10:53
```
** server can't find google.com: SERVFAIL
It doesn't work. Here is the resolver config inside the pod:
```
cat /etc/resolv.conf
search ingress.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.152.183.10
options ndots:5
```
```
k get ConfigMap coredns -n kube-system -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        log . {
            class error
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . 8.8.8.8 8.8.4.4
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n errors\n health {\n lameduck 5s\n }\n ready\n log . {\n class error\n }\n kubernetes cluster.local in-addr.arpa ip6.arpa {\n pods insecure\n fallthrough in-addr.arpa ip6.arpa\n }\n prometheus :9153\n forward . 8.8.8.8 8.8.4.4\n cache 30\n loop\n reload\n loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"EnsureExists","k8s-app":"kube-dns"},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2023-01-05T00:41:09Z"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
  resourceVersion: "259864"
  uid: 1b6fc72b-eabd-4139-b2de-9764d08b6553
```
Can you suggest a fix?
Are you interested in contributing with a fix?