Closed: macklenc closed this issue 2 years ago.
Just for some context, I'm still pretty new to K3s/Kubernetes, so feel free to let me know if I left out any important config or details.
Could you check which node your coredns pod is running on? Then verify whether nslookup works only in pods that are running on that same node.
Sure thing. I had to get a little fancy since coredns was running on a control plane node. I created the following manifest:
---
apiVersion: v1
kind: Pod
metadata:
  name: netshoot
spec:
  restartPolicy: OnFailure
  nodeSelector:
    kubernetes.io/hostname: queen0
  tolerations:
    - key: CriticalAddonsOnly
      operator: Equal
      value: "true"
      effect: NoExecute
  containers:
    - name: netshoot
      image: nicolaka/netshoot
      imagePullPolicy: IfNotPresent
      command: ["nslookup"]
      args: ["google.com"]
Which resulted in the same .home.net getting appended:
~
❯ k -n kube-system get pods coredns-5cdc799f68-vfkd8 -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5cdc799f68-vfkd8 1/1 Running 0 21h 10.42.0.12 queen0 <none> <none>
~
❯ k get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
netshoot 0/1 Completed 2 (18s ago) 34s 10.42.0.20 queen0 <none> <none>
~
❯ k logs netshoot
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: google.com.home.net
Address: 163.237.192.146
I don't know if it's relevant, but this showed up in the events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 1s (x3 over 3s) kubelet MountVolume.SetUp failed for volume "kube-api-access-wvxn9" : object "default"/"kube-root-ca.crt" not registered
Also, it may be worth noting that the IP resolved is for some website out on the net called home.net, instead of resolving to my router's external IP. I tried adding a firewall rule to intercept external DNS queries and re-route them to my network's DNS server, but it still resolves to the internet IP instead of my router's IP.
Could you run cat /etc/resolv.conf in both the "working" pod and the non-working pod?
Here's the output from a few of the containers:
~
❯ kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup google.com
Server: 10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local
Name: google.com
Address 1: 2607:f8b0:400f:807::200e den16s09-in-x0e.1e100.net
Address 2: 142.250.72.78 den16s09-in-f14.1e100.net
pod "busybox" deleted
~
❯ kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local home.net
nameserver 10.43.0.10
options ndots:5
pod "busybox" deleted
~
❯ kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- nslookup google.com
Server: 10.43.0.10
Address: 10.43.0.10#53
Non-authoritative answer:
Name: google.com.home.net
Address: 163.237.192.146
pod "netshoot" deleted
~
❯ kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local home.net
nameserver 10.43.0.10
options ndots:5
pod "netshoot" deleted
The ubuntu image doesn't seem to have any networking tools, but apt update shows the same incorrect IP being resolved:
~ took 2s
❯ kubectl run -it --rm --restart=Never ubuntu --image=ubuntu -- apt update
Ign:1 http://security.ubuntu.com/ubuntu focal-security InRelease
Ign:2 http://archive.ubuntu.com/ubuntu focal InRelease
Err:3 http://security.ubuntu.com/ubuntu focal-security Release
404 Not Found [IP: 163.237.192.146 80]
Ign:4 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Ign:5 http://archive.ubuntu.com/ubuntu focal-backports InRelease
Err:6 http://archive.ubuntu.com/ubuntu focal Release
404 Not Found [IP: 163.237.192.146 80]
Err:7 http://archive.ubuntu.com/ubuntu focal-updates Release
404 Not Found [IP: 163.237.192.146 80]
Err:8 http://archive.ubuntu.com/ubuntu focal-backports Release
404 Not Found [IP: 163.237.192.146 80]
Reading package lists... Done
E: The repository 'http://security.ubuntu.com/ubuntu focal-security Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://archive.ubuntu.com/ubuntu focal Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://archive.ubuntu.com/ubuntu focal-updates Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: The repository 'http://archive.ubuntu.com/ubuntu focal-backports Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
pod "ubuntu" deleted
pod default/ubuntu terminated (Error)
~ took 3s
❯ kubectl run -it --rm --restart=Never ubuntu --image=ubuntu -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local home.net
nameserver 10.43.0.10
options ndots:5
pod "ubuntu" deleted
EDIT: Looks like curlimages/curl also works and has the same resolv.conf. It is an Alpine-based image, if that helps.
Your search domains in /etc/resolv.conf are default.svc.cluster.local svc.cluster.local cluster.local home.net. Probably home.net is your hostname.
When you look up google.com, it will try google.com.default.svc.cluster.local, google.com.svc.cluster.local, google.com.cluster.local, google.com.home.net and google.com. The first one that comes back with an answer is the chosen one. For strange reasons google.com.home.net exists, so you are not seeing wrong behaviour.
I recommend using the tool dig instead of nslookup. When using dig, if you don't set the flag +search, it will just try to resolve the provided name without trying to build FQDNs based on your resolv.conf.
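For example (mirroring the kubectl run one-liners used earlier in this thread; +short is just to trim the output):
# Without +search, dig resolves the literal name only; no search-domain expansion:
kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- dig +short google.com
# With +search, dig honours the resolv.conf search list and ndots, which should
# reproduce the google.com.home.net answer that nslookup gives:
kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- dig +search +short google.com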
I'm not sure what you mean by home.net is my hostname. I double-checked that the VMs running the cluster have different names, and the containers created have the same hostname as the name of the container when I created them, e.g. netshoot.
Out of curiosity, if that is normal behavior, why do busybox and alpine images seem to work while the distroless coredns and debian-based images do not? You're correct that dig works as expected, bypassing the resolv.conf, but that unfortunately doesn't solve my problem (since e.g. apt still uses resolv.conf). I tried adding a domain override to my router to point home.net to itself, as well as choosing a network name that I own, and both of them still tried to append the new network names even though those lookups don't actually return anything.
E.g. overriding home.net in my DNS:
❯ kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- nslookup google.com
Server: 10.43.0.10
Address: 10.43.0.10#53
Non-authoritative answer:
*** Can't find google.com.home.net: No answer
pod "netshoot" deleted
And using a domain I own, but has no DNS records (A, CNAME, or otherwise):
❯ kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- nslookup google.com
Server: 10.43.0.10
Address: 10.43.0.10#53
Non-authoritative answer:
*** Can't find google.com.continuumtek.com: No answer
pod "netshoot" deleted
This problem doesn't seem to exist on any other host on my network, bare metal, VM, docker, etc. Just when running on my new k3s cluster.
Sorry, I should not have used the word hostname; ignore that. Let me go through the basics (sorry if you already knew this).
The resolv.conf of your pods is injected by kubelet. You can get more information about it here: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/. By default, kubelet keeps the search domains that it finds in the node's /etc/resolv.conf and adds some extra ones related to the deployment (e.g. svc.cluster.local). Note that this is configurable, as you can read in the previous link.
In your case, you must have search home.net in the resolv.conf file of your node. Looking at the resolv.conf files of your pods, it seems kubelet is working correctly and injecting the correct resolv.conf file. Therefore, in my opinion, k3s is working correctly.
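In other words, the node file presumably looks something like this (a reconstruction from the pod resolv.conf shown above; the nameserver address is illustrative):
search home.net
nameserver 192.168.1.1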
Now, regarding google.com.home.net: I am not sure how each OS implements the read of resolv.conf. According to its manual, https://www.man7.org/linux/man-pages/man5/resolv.conf.5.html, it will always try the different search domains before testing google.com alone (i.e. an initial absolute query), because it thinks google.com is not a complete FQDN (unless it had 5 dots). This is because of kubelet's default option ndots:5, which you can see in the resolv.conf of the pod. This link has very good information about it and a couple of potential solutions you can apply: https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html
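One of those solutions can be applied per pod through dnsConfig. A minimal sketch, assuming the same netshoot image as above (the pod name is illustrative): with ndots:1, google.com already contains a dot, so it is resolved as an absolute query before any search domain is appended.
apiVersion: v1
kind: Pod
metadata:
  name: dns-example        # illustrative name
spec:
  restartPolicy: OnFailure
  containers:
    - name: netshoot
      image: nicolaka/netshoot
      command: ["nslookup"]
      args: ["google.com"]
  dnsConfig:
    options:
      - name: ndots
        value: "1"         # overrides the kubelet default of 5 in this pod's resolv.conf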
One thing that surprises me is that you get problems when running apt. I think it is just bad luck that google.com.home.net exists, but I'd be surprised if you also got a "collision" with other URLs. Could you show me the output of what you get when using apt?
In any case, due to what I explained, I don't think this is a problem of K3s or Kubernetes; in my opinion, what you are seeing is the expected behaviour.
It is really weird that the netshoot image never tries to look up the "initial absolute query", i.e. google.com. If I understand correctly, according to https://www.man7.org/linux/man-pages/man5/resolv.conf.5.html, at some point it must try google.com without adding any extra domain (which one it tries first depends on the ndots config).
I always appreciate setting a good foundation, so thanks for that! I had actually found that ndots article earlier, which was super helpful. I was able to "recreate" what I'm seeing by setting ndots:5 on a bare-metal host. Do you know why 5 is the default with kubelet, and whether that can be changed? No matter how hard I try to block DNS traffic for the internal network name I choose, it's still able to get through and find e.g. that home.net IP address.
I have experimented with setting ndots to 1 and removing the home.net search path, and both fixed the problem, so from my point of view ndots:5 isn't the best default. Unfortunately, removing the home.net search path from the host nodes does mess with the nodes' ability to pull down images to run. Not sure if I'm messing up a k3s config or what, but it's a pretty vanilla install. Maybe I'll give k8s a try with kubespray to see if I get the same behavior. I have also tried resetting my pfSense router back to factory defaults, which doesn't seem to fix the issue either.
I appreciate the help. I think I see how this isn't necessarily a k3s problem now, but I'm at a total loss on how to fix it or where to seek help at this point. (A sketch of the kubelet-side override I've been eyeing is below.)
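A sketch of that kubelet-side idea, assuming a stock k3s install; the trimmed file path is illustrative. kubelet's --resolv-conf flag controls which file it copies search domains from, so pointing it at a copy that omits home.net would leave the node's real /etc/resolv.conf intact for image pulls:
# /etc/rancher/k3s/config.yaml
kubelet-arg:
  - "resolv-conf=/etc/rancher/k3s/resolv.conf"   # a copy of the node resolv.conf without the home.net search line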
Here is the reasoning for that ==> https://github.com/kubernetes/kubernetes/issues/33554#issuecomment-266251056
But apart from the google.com problem, do you get problems with other URLs?
Thanks for the link, I'll take a look.
Yeah, that issue comes up with every URL I've tried. It gets really fun when accessing internal resources, e.g. pve0.home.net turns into pve0.home.net.home.net.
Oh... interesting update: ubuntu's apt update is working now when using my domain name instead of home.net. And I was mistaken; it looks like netshoot is Alpine-based, not Debian-based (and it still isn't resolving as expected). Interestingly, installing nslookup in the ubuntu container shows the same behavior.
I really appreciate your assistance. I wasn't able to get home.net working as my home network name, but after I switched my domain from Cloudflare back to Google Domains, that domain started to work. Super weird that the upstream DNS is able to bypass my local DNS overrides, but at least it works now.
Once again, thanks for your help. Unless you have further input, I'll go ahead and close the issue. If anyone else comes along in the future: the problem seems to be related to how the unbound service in pfSense handles transparent traffic. If you want the external domain to be blocked, you can set the system domain's local zone type to static to block the outbound request, though I'm not sure what the side effects are. Here's the relevant doc:
static
    If there is a match from local data, the query is answered. Otherwise, the query is answered with nodata or nxdomain. For a negative answer a SOA is included in the answer if present as local-data for the zone apex domain.
It seems that the general recommendation is to choose an internal TLD that doesn't exist on the internet, e.g. home.lan (or home.arpa, which RFC 8375 reserves for home networks) instead.
Figured I'd follow up with a better solution than what I implemented if anyone else has this issue: https://forums.lawrencesystems.com/t/google-domains-vs-cloudflare-dns-with-ndots-5-in-resolv-conf/12887/3?u=macklenc
@macklenc Thank you so much. Your research fixed a bunch of my issues. Adding this to the pfSense DNS config made it easy:
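# pfSense: Services > DNS Resolver > Custom options (assuming the stock unbound resolver)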
server:
  local-zone: "example.com" static
Environmental Info:
K3s Version: v1.22.5+k3s2 (for compatibility with Rancher)
Node(s) CPU architecture, OS, and Version:
All nodes:
Cluster Configuration:
3 control plane nodes with integrated etcd, 2 worker nodes. Feel free to look at the deployment scripts: https://gitlab.com/macklenc/ha-k3s-ansible-deployment (yes, the secrets are placeholders).
Describe the bug:
In some containers, e.g. cert-manager and ubuntu, the DNS seems to append my home network's domain name to every request (my network name is home.net). E.g. running an nslookup command against google will show that the address that was actually looked up was google.com.home.net, which works fine for HTTP but messes with SSL when trying to use e.g. Let's Encrypt. I realize from the linked issues below that there are some workarounds: removing (in my case) home.net from /etc/resolv.conf inside the running container, changing ndots:5 back to the default of 1 in the same file, or even appending a . to the domain name to force resolution as a FQDN. But all of these solutions are a bit hacky. It seems to me that deploying a fresh K3s cluster and deploying a container to e.g. curl an https domain should work out of the box.
Now strangely, using a busybox container doesn't seem to reproduce the issue, even though the resolv.conf file in the container is identical to the problematic containers'. See the reproduction section for examples.
Seemingly related issues:
Steps To Reproduce:
As an FYI, I'm running pfSense for my router. I did try resetting it to factory defaults which didn't seem to help the issue.
Launching a network debugging test image based on Debian:
kubectl run -it --rm --restart=Never netshoot --image=nicolaka/netshoot -- nslookup google.com
will return:
Server: 10.43.0.10
Address: 10.43.0.10#53
Non-authoritative answer:
Name: google.com.home.net
Address: 163.237.192.146
however, using the busybox image as recommended from the guide, the DNS resolves just fine:
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup google.com
will return:
Server: 10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local
Name: google.com
Address 1: 2607:f8b0:400f:807::200e den16s09-in-x0e.1e100.net
Address 2: 142.250.72.78 den16s09-in-f14.1e100.net
Expected behavior:
pings/curls/nslookups for a domain e.g. google.com, should return results for google.com.
Actual behavior:
pings/curls/nslookups for a domain e.g. google.com is returning google.com.home.net
Additional context / logs:
Host OS resolv.conf (maintained by systemd-resolved):
Control plane systemd service:
Agent systemd service: