Closed: flixman closed this issue 3 weeks ago
What does "cannot reach the internet" mean? Your error shows a problem resolving a DNS name; do you actually have no network connectivity, or is just DNS failing?
Does DNS/networking work inside podman unshare --rootless-netns?
@Luap99 That is interesting! Let's see:
Running in my container with a custom network created through podman network create --subnet 10.1.0.0/24 --gateway 10.1.0.1 testnet:
telnet 10.1.0.1 53, works
telnet 8.8.8.8 53, works
dig www.google.com @8.8.8.8, works
dig www.google.com: error ";; communications error to 10.1.0.1#53: timed out"
Running inside podman unshare --rootless-netns:
dig www.google.com, works
How is it possible that I can telnet, from inside my container, to port 53... but then dig returns an error??
Ok thanks for checking, this means that aardvark-dns is not responding on UDP, I would guess. telnet uses TCP, not UDP. You could try dig +tcp ... to see if DNS works over TCP.
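As background for why a successful TCP connect says nothing about the UDP service: DNS uses the same message format on both transports, but over TCP each message is prefixed with a 2-byte length (RFC 1035 §4.2.2). A minimal sketch in Python (the build_query helper is illustrative, not from any tool in this thread):

```python
import struct

def build_query(name, qid=0x1234):
    """Build a minimal DNS A query for `name` (no EDNS, unlike dig's real queries)."""
    # 12-byte header: id, flags (RD set), QDCOUNT=1, AN/NS/AR counts = 0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte
    qname = b"".join(bytes([len(l)]) + l.encode() for l in name.split(".")) + b"\x00"
    return header + qname + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN

query = build_query("www.google.com")
udp_payload = query                                   # sent as-is over UDP
tcp_payload = struct.pack(">H", len(query)) + query   # 2-byte length prefix over TCP
```

That length prefix is what shows up in the strace later in the thread: the 2-byte recvfrom returning "\0\x37" (55) before each 55-byte payload is the TCP framing being read.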
Can you check that aardvark-dns is running (while you have the container running), and if so please provide the output of podman unshare --rootless-netns ss -tulpn.
dig +tcp ... returns the timeout as well, and aardvark-dns is running. The output of podman unshare --rootless-netns ss -tulpn is:
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
udp UNCONN 0 0 10.1.0.1:53 0.0.0.0:* users:(("aardvark-dns",pid=38793,fd=12))
tcp LISTEN 0 1024 10.1.0.1:53 0.0.0.0:* users:(("aardvark-dns",pid=38793,fd=13))
Additionally: if I attach strace to the running aardvark-dns and its forks while doing the dig (with either UDP or TCP), I get similar traces:
[pid 38801] accept4(13, {sa_family=AF_INET, sin_port=htons(35163), sin_addr=inet_addr("10.1.0.3")}, [128 => 16], SOCK_CLOEXEC|SOCK_NONBLOCK) = 5
[pid 38801] epoll_ctl(7, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1342190720, u64=140416707998848}}) = 0
[pid 38801] accept4(13, 0x7fb59b5fb9d0, [128], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 38801] write(6, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 38801] epoll_wait(4, [{events=EPOLLIN|EPOLLOUT, data={u32=1342190720, u64=140416707998848}}, {events=EPOLLIN, data={u32=0, u64=0}}], 1024, 2956) = 2
[pid 38801] recvfrom(5, "\0007", 2, 0, NULL, NULL) = 2
[pid 38801] recvfrom(5, "\260\356\1 \0\1\0\0\0\0\0\1\3www\6google\3com\0\0\1\0\1"..., 55, 0, NULL, NULL) = 55
[pid 38801] socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 14
[pid 38801] connect(14, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("169.254.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 38801] epoll_ctl(7, EPOLL_CTL_ADD, 14, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1946168448, u64=140417311976576}}) = 0
[pid 38801] epoll_wait(4, [], 1024, 3212) = 0
[pid 38801] epoll_wait(4, [], 1024, 1726) = 0
[pid 38801] epoll_wait(4, [], 1024, 59) = 0
[pid 38801] epoll_ctl(7, EPOLL_CTL_DEL, 14, NULL) = 0
[pid 38801] close(14) = 0
[pid 38801] socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 14
[pid 38801] connect(14, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.178.4")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 38801] epoll_ctl(7, EPOLL_CTL_ADD, 14, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=2348821376, u64=140417714629504}}) = 0
[pid 38801] write(6, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 38801] epoll_wait(4, [{events=EPOLLIN, data={u32=0, u64=0}}], 1024, 2306) = 1
[pid 38801] epoll_wait(4, [], 1024, 2306) = 0
[pid 38801] epoll_wait(4, [], 1024, 2686) = 0
[pid 38801] epoll_wait(4, [{events=EPOLLIN, data={u32=3002696448, u64=99616179192576}}], 1024, 4) = 1
[pid 38801] accept4(13, {sa_family=AF_INET, sin_port=htons(34497), sin_addr=inet_addr("10.1.0.3")}, [128 => 16], SOCK_CLOEXEC|SOCK_NONBLOCK) = 15
[pid 38801] epoll_ctl(7, EPOLL_CTL_ADD, 15, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1946168448, u64=140417311976576}}) = 0
[pid 38801] accept4(13, 0x7fb59b5fb9d0, [128], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 38801] epoll_wait(4, [{events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=1342190720, u64=140416707998848}}, {events=EPOLLIN|EPOLLOUT, data={u32=1946168448, u64=140417311976576}}], 1024, 3) = 2
[pid 38801] recvfrom(15, "\0007", 2, 0, NULL, NULL) = 2
[pid 38801] recvfrom(15, "%\332\1 \0\1\0\0\0\0\0\1\3www\6google\3com\0\0\1\0\1"..., 55, 0, NULL, NULL) = 55
[pid 38801] socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 17
[pid 38801] connect(17, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("169.254.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 38801] epoll_ctl(7, EPOLL_CTL_ADD, 17, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1946170880, u64=140417311979008}}) = 0
[pid 38801] epoll_wait(4, [], 1024, 2) = 0
[pid 38801] epoll_ctl(7, EPOLL_CTL_DEL, 14, NULL) = 0
[pid 38801] close(14) = 0
[pid 38801] socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 14
[pid 38801] connect(14, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("84.116.46.21")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 38801] epoll_ctl(7, EPOLL_CTL_ADD, 14, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1946162816, u64=140417311970944}}) = 0
[pid 38801] epoll_wait(4, [{events=EPOLLOUT, data={u32=1946162816, u64=140417311970944}}], 1024, 1401) = 1
[pid 38801] getsockopt(14, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
[pid 38801] setsockopt(14, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 38801] sendto(14, "\0007", 2, MSG_NOSIGNAL, NULL, 0) = 2
[pid 38801] sendto(14, "^}\1 \0\1\0\0\0\0\0\1\3www\6google\3com\0\0\1\0\1"..., 55, MSG_NOSIGNAL, NULL, 0) = 55
[pid 38801] futex(0x5a99b2f936f8, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 38801] epoll_wait(4, <unfinished ...>
[pid 38799] <... futex resumed>) = 0
[pid 38799] futex(0x5a99b2f936f8, FUTEX_WAIT_BITSET_PRIVATE, 12, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 38801] <... epoll_wait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=1946162816, u64=140417311970944}}], 1024, 1369) = 1
[pid 38801] recvfrom(14, "\0;", 2, 0, NULL, NULL) = 2
[pid 38801] recvfrom(14, "^}\201\200\0\1\0\1\0\0\0\1\3www\6google\3com\0\0\1\0\1"..., 59, 0, NULL, NULL) = 59
[pid 38801] recvfrom(14, 0x7fb574004cb0, 2, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
meaning: the request reaches aardvark-dns in both cases, but it seems that aardvark-dns itself is not able to query the upstream DNS?
Do you have any aardvark-dns errors logged in journald?
The strace part shows a TCP request, if I read this right. The async epoll API that we are using makes reading the strace a bit harder, but it seems we try to connect to upstream servers and then just remove the fd from the epoll again; I do not see any error logged or any write/read on the socket, which seems very odd. Although in the end it does seem to succeed when connecting to 84.116.46.21, but I guess by that time the original client had timed out (ref https://github.com/containers/aardvark-dns/issues/482#issuecomment-2253110977)
What is the content of /etc/resolv.conf on the host and inside podman unshare --rootless-netns? And when you say dig inside podman unshare --rootless-netns worked, which upstream server did it use?
In /etc/resolv.conf I have a bunch of name servers, and inside podman unshare --rootless-netns I have the same, but a new one gets prepended to the list: nameserver 169.254.0.1. When executing dig www.google.com inside podman unshare --rootless-netns, I get three timeouts for 169.254.0.1, and then it successfully works with another one (using UDP, by the way).
With the container running, dig www.google.com results in aardvark-dns on the host writing a number of "dns request got empty response" messages to the log.
169.254.0.1
This is the special DNS forward address we use for pasta, so this address is expected to work there. If it doesn't, it sounds like a pasta bug. If you look in journald, do you see a warning from pasta that it didn't find nameservers?
You can also just test from the cli with pasta --config-net --dns-forward 169.254.0.1 dig google.com @169.254.0.1. If this fails, it is a pasta bug.
Indeed, it fails:
$ pasta --config-net --dns-forward 169.254.0.1 dig google.com @169.254.0.1
Multiple default IPv4 routes, picked first
Multiple default IPv6 routes, picked first
;; communications error to 169.254.0.1#53: timed out
;; communications error to 169.254.0.1#53: timed out
;; communications error to 169.254.0.1#53: timed out
; <<>> DiG 9.20.1 <<>> google.com @169.254.0.1
;; global options: +cmd
;; no servers could be reached
Multiple default IPv4 routes, picked first Multiple default IPv6 routes, picked first
How do the routes look in the container (pasta --config-net ip route)? If the routes are fine, then you can use the --pcap option to capture a pcap file so we can have a look at the packets being sent, i.e.:
pasta --config-net --pcap /tmp/dns.pcap --dns-forward 169.254.0.1 dig google.com @169.254.0.1
cc @sbrivio-rh @dgibson
The routes seem to be fine:
$ pasta --config-net ip route
Multiple default IPv4 routes, picked first
Multiple default IPv6 routes, picked first
default via 192.168.178.1 dev wlp2s0 proto dhcp metric 600
84.116.46.20 via 192.168.178.1 dev wlp2s0 proto dhcp metric 600
84.116.46.21 via 192.168.178.1 dev wlp2s0 proto dhcp metric 600
192.168.178.0/24 dev wlp2s0 proto kernel scope link metric 600
192.168.178.0/24 dev wlp2s0 proto kernel scope link src 192.168.178.129 metric 600
192.168.178.1 dev wlp2s0 proto dhcp scope link metric 600
Please find attached the trace dns.pcap.txt (remove the .txt suffix; it seems GH does not support .pcap):
4 0.007273 192.168.178.129 → 169.254.0.1 DNS 93 Standard query 0xaab5 A google.com OPT
12 5.012949 192.168.178.129 → 169.254.0.1 DNS 93 Standard query 0xaab5 A google.com OPT
13 10.018316 192.168.178.129 → 169.254.0.1 DNS 93 Standard query 0xaab5 A google.com OPT
The request was sent out but there was never a reply. Can you also do a packet capture on the host, to see if pasta makes an actual request to the upstream server there or if pasta eats it internally and never forwards it? I wonder if pasta somehow failed to parse resolv.conf for the servers, but in that case it should print a warning like the "multiple default routes" one. There is also the --debug pasta option, which also logs the internal packet flow, so maybe there is something interesting in there.
But I guess at this point I have to leave it to @sbrivio-rh and @dgibson (the pasta maintainers) if they have a clue here.
@sbrivio-rh @dgibson: I have run the pasta command again with the --debug option. Can you guys give me a hand?
$ pasta --debug --config-net --dns-forward 169.254.0.1 dig google.com @169.254.0.1
0.0010: Multiple default IPv4 routes, picked first
0.0010: Multiple default IPv6 routes, picked first
0.0118: Template interface: wlp2s0 (IPv4), wlp2s0 (IPv6)
0.0118: Namespace interface: wlp2s0
0.0118: MAC:
0.0118: host: 9a:55:9a:55:9a:55
0.0118: NAT to host 127.0.0.1: 192.168.178.1
0.0118: DHCP:
0.0119: assign: 192.168.178.129
0.0119: mask: 255.255.255.0
0.0119: router: 192.168.178.1
0.0119: DNS:
0.0119: 192.168.178.4
0.0119: 84.116.46.21
0.0119: 84.116.46.20
0.0119: 84.116.46.21
0.0119: 169.254.0.1
0.0119: 192.168.178.1
0.0119: 192.168.178.4
0.0119: DNS search list:
0.0119: .
0.0119: NAT to host ::1: fe80::4ad3:43ff:feda:bb88
0.0119: NDP/DHCPv6:
0.0120: assign: 2001:1c00:1804:b700:f2b3:fadc:4fa3:f578
0.0120: router: fe80::4ad3:43ff:feda:bb88
0.0120: our link-local: fe80::4ad3:43ff:feda:bb88
0.0120: DNS:
0.0120: 2001:b88:1002::10
0.0120: 2001:b88:1202::10
0.0120: 2001:730:3e42:1000::53
0.0120: 2001:b88:1002::10
0.0120: DNS search list:
0.0120: .
0.0186: SO_PEEK_OFF not supported
0.0305: Flow 0 (NEW): FREE -> NEW
0.0305: Flow 0 (INI): NEW -> INI
0.0305: Flow 0 (INI): TAP [192.168.178.129]:39909 -> [169.254.0.1]:53 => ?
0.0306: Flow 0 (TGT): INI -> TGT
0.0306: Flow 0 (TGT): TAP [192.168.178.129]:39909 -> [169.254.0.1]:53 => HOST [0.0.0.0]:39909 -> [192.168.178.4]:53
0.0306: Flow 0 (UDP flow): TGT -> TYPED
0.0306: Flow 0 (UDP flow): TAP [192.168.178.129]:39909 -> [169.254.0.1]:53 => HOST [0.0.0.0]:39909 -> [192.168.178.4]:53
0.0308: Flow 0 (UDP flow): Side 0 hash table insert: bucket: 41306
0.0308: Flow 0 (UDP flow): TYPED -> ACTIVE
0.0308: Flow 0 (UDP flow): TAP [192.168.178.129]:39909 -> [169.254.0.1]:53 => HOST [0.0.0.0]:39909 -> [192.168.178.4]:53
0.0487: ICMP error on UDP socket 179: No route to host
;; communications error to 169.254.0.1#53: timed out
5.0351: Flow 1 (NEW): FREE -> NEW
5.0351: Flow 1 (INI): NEW -> INI
5.0351: Flow 1 (INI): TAP [192.168.178.129]:57747 -> [169.254.0.1]:53 => ?
5.0351: Flow 1 (TGT): INI -> TGT
5.0352: Flow 1 (TGT): TAP [192.168.178.129]:57747 -> [169.254.0.1]:53 => HOST [0.0.0.0]:57747 -> [192.168.178.4]:53
5.0352: Flow 1 (UDP flow): TGT -> TYPED
5.0352: Flow 1 (UDP flow): TAP [192.168.178.129]:57747 -> [169.254.0.1]:53 => HOST [0.0.0.0]:57747 -> [192.168.178.4]:53
5.0353: Flow 1 (UDP flow): Side 0 hash table insert: bucket: 235154
5.0353: Flow 1 (UDP flow): TYPED -> ACTIVE
5.0353: Flow 1 (UDP flow): TAP [192.168.178.129]:57747 -> [169.254.0.1]:53 => HOST [0.0.0.0]:57747 -> [192.168.178.4]:53
5.0498: ICMP error on UDP socket 244: No route to host
;; communications error to 169.254.0.1#53: timed out
10.0406: Flow 2 (NEW): FREE -> NEW
10.0406: Flow 2 (INI): NEW -> INI
10.0407: Flow 2 (INI): TAP [192.168.178.129]:59697 -> [169.254.0.1]:53 => ?
10.0407: Flow 2 (TGT): INI -> TGT
10.0407: Flow 2 (TGT): TAP [192.168.178.129]:59697 -> [169.254.0.1]:53 => HOST [0.0.0.0]:59697 -> [192.168.178.4]:53
10.0407: Flow 2 (UDP flow): TGT -> TYPED
10.0407: Flow 2 (UDP flow): TAP [192.168.178.129]:59697 -> [169.254.0.1]:53 => HOST [0.0.0.0]:59697 -> [192.168.178.4]:53
10.0408: Flow 2 (UDP flow): Side 0 hash table insert: bucket: 10518
10.0408: Flow 2 (UDP flow): TYPED -> ACTIVE
10.0408: Flow 2 (UDP flow): TAP [192.168.178.129]:59697 -> [169.254.0.1]:53 => HOST [0.0.0.0]:59697 -> [192.168.178.4]:53
10.0590: ICMP error on UDP socket 245: No route to host
;; communications error to 169.254.0.1#53: timed out
; <<>> DiG 9.20.1 <<>> google.com @169.254.0.1
;; global options: +cmd
;; no servers could be reached
I was looking into this right now. Quick question: is 2001:b88:1002::10 a valid resolver? What happens if you dig passt.top @2001:b88:1002::10?
Same for 192.168.178.4: does it work?
Thank you for your help! Yes, 192.168.178.4 is valid. And "dig passt.top @2001:b88:1002::10" also seems to work:
$ pasta --config-net --dns-forward 169.254.0.1 dig passt.top @2001:b88:1002::10
Multiple default IPv4 routes, picked first
Multiple default IPv6 routes, picked first
; <<>> DiG 9.20.1 <<>> passt.top @2001:b88:1002::10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40536
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;passt.top. IN A
;; ANSWER SECTION:
passt.top. 300 IN A 88.198.0.164
;; Query time: 60 msec
;; SERVER: 2001:b88:1002::10#53(2001:b88:1002::10) (UDP)
;; WHEN: Wed Sep 18 20:45:38 CEST 2024
;; MSG SIZE rcvd: 54
Weird, because when pasta (and not a process running under pasta) tries to contact 192.168.178.4, it gets an error ("No route to host"). That might be an ICMP error or netfilter (nftables or iptables) blocking it.
How do the routes look on the host (not the ones pasta copies)? Any particular firewalling rule pasta could hit?
With respect to the routes on the host, this is how they look:
$ ip route
default via 192.168.178.1 dev wlp2s0 proto dhcp src 192.168.178.129 metric 600
default via 192.168.178.1 dev eno1 proto dhcp src 192.168.178.213 metric 800
84.116.46.20 via 192.168.178.1 dev wlp2s0 proto dhcp src 192.168.178.129 metric 600
84.116.46.20 via 192.168.178.1 dev eno1 proto dhcp src 192.168.178.213 metric 800
84.116.46.21 via 192.168.178.1 dev wlp2s0 proto dhcp src 192.168.178.129 metric 600
84.116.46.21 via 192.168.178.1 dev eno1 proto dhcp src 192.168.178.213 metric 800
192.168.178.0/24 dev wlp2s0 proto kernel scope link src 192.168.178.129 metric 600
192.168.178.0/24 dev eno1 proto kernel scope link src 192.168.178.213 metric 800
192.168.178.1 dev wlp2s0 proto dhcp scope link src 192.168.178.129 metric 600
192.168.178.1 dev eno1 proto dhcp scope link src 192.168.178.213 metric 800
and about the firewall settings, I do not have any rules in nft that can justify this behavior: nft_rules.txt
Well, I'm deeply baffled. You're able to manually contact the DNS server from the host, but when pasta tries it gets an ICMP error. We could try to get a packet capture on the host - perhaps that would shed some more light on where the error is originating. In fact, even better would be to get two different packet traces on the host: one querying the nameserver directly from the host with dig, the second doing a similar query from the container via pasta. Perhaps we'll see some difference that helps explain things.
...or maybe it has something to do with us bind()ing and connect()ing UDP sockets (dig doesn't do that) when two sets of almost-identical routes (metrics, interface, and source differ) are present?
I can try and see if it can be reproduced with a dummy interface with similar routes.
...or maybe it has something to do with us bind()ing and connect()ing UDP sockets (dig doesn't do that)
Doesn't bind() or doesn't connect()? I'm pretty sure it has to do one of them in order to receive anything at all.
when two sets of almost-identical routes (metrics, interface, and source differ) are present?
I can try and see if it can be reproduced with a dummy interface with similar routes.
...or maybe it has something to do with us bind()ing and connect()ing UDP sockets (dig doesn't do that)
Doesn't bind() or doesn't connect()? I'm pretty sure it has to do one of them in order to receive anything at all.
Whoops, sorry, I just assumed. It actually does both:
$ strace -e connect,bind dig root-servers.net @1.1.1.1 >/dev/null
bind(11, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(11, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("1.1.1.1")}, 16) = 0
+++ exited with 0 +++
but it bind()s to 0.0.0.0, port 0, so that's not quite the bind()ing we do.
I think binding to 0.0.0.0:0 is basically a no-op, which means I think the kernel will implicitly bind the socket at connect() time to an address and port of the kernel's choosing.
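A quick way to see this behaviour (a generic sketch, nothing pasta-specific): for a UDP socket, connect() sends no packets; it only fixes the default destination, and the kernel picks the local address and port if the socket wasn't bound:

```python
import socket

# Unbound UDP socket: no local address or port assigned yet.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
print(s.getsockname())        # ('0.0.0.0', 0)

# connect() on UDP transmits nothing; the kernel implicitly binds here,
# choosing both the source address and an ephemeral port.
s.connect(("127.0.0.1", 53))
addr, port = s.getsockname()
print(addr, port)             # 127.0.0.1 <ephemeral port>
s.close()
```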
@dgibson Sorry for not answering yesterday, had a pretty busy day. About your request "[...] one querying the nameserver directly from the host with dig, the second doing a similar query from the container via pasta. [...]": Do you mean something like the trace I provided on this comment, for the host?
Besides this: I have just updated the system and rebooted it. I have also disabled the DNS server I had in 192.168.178.4 to use the one provided by my router (192.168.178.1), and I have removed one of the interfaces. None of this has helped in solving this issue :-/ (but I have a cleaner system, I guess xD). These are the results:
$ podman unshare --rootless-netns more /etc/resolv.conf
nameserver 169.254.0.1
nameserver 192.168.178.1
nameserver 84.116.46.21
nameserver 84.116.46.20
nameserver 2001:b88:1002::10
nameserver 2001:b88:1202::10
nameserver 2001:730:3e42:1000::53
$ podman unshare --rootless-netns ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host proto kernel_lo
valid_lft forever preferred_lft forever
2: wlp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UNKNOWN group default qlen 1000
link/ether ee:65:d6:01:89:0d brd ff:ff:ff:ff:ff:ff
inet 192.168.178.129/24 metric 600 brd 192.168.178.255 scope global wlp2s0
valid_lft forever preferred_lft forever
inet6 2001:1c00:1804:b700:3e55:76ff:fe0f:c901/64 scope global nodad mngtmpaddr noprefixroute
valid_lft forever preferred_lft forever
inet6 2001:1c00:1804:b700:1034:e18c:123f:2add/64 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::ec65:d6ff:fe01:890d/64 scope link nodad tentative proto kernel_ll
valid_lft forever preferred_lft forever
$ podman unshare --rootless-netns ip route
default via 192.168.178.1 dev wlp2s0 proto dhcp metric 600
84.116.46.20 via 192.168.178.1 dev wlp2s0 proto dhcp metric 600
84.116.46.21 via 192.168.178.1 dev wlp2s0 proto dhcp metric 600
192.168.178.0/24 dev wlp2s0 proto kernel scope link metric 600
192.168.178.0/24 dev wlp2s0 proto kernel scope link src 192.168.178.129 metric 600
192.168.178.1 dev wlp2s0 proto dhcp scope link metric 600
$ podman unshare --rootless-netns dig passt.top
;; communications error to 169.254.0.1#53: timed out
;; communications error to 169.254.0.1#53: timed out
;; communications error to 169.254.0.1#53: timed out
;; communications error to 192.168.178.1#53: timed out
; <<>> DiG 9.20.2 <<>> passt.top
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45940
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;passt.top. IN A
;; ANSWER SECTION:
passt.top. 300 IN A 88.198.0.164
;; Query time: 33 msec
;; SERVER: 84.116.46.21#53(84.116.46.21) (UDP)
;; WHEN: Sat Sep 21 10:44:18 CEST 2024
;; MSG SIZE rcvd: 54
$ podman unshare --rootless-netns dig passt.top @192.168.178.1
;; communications error to 192.168.178.1#53: timed out
;; communications error to 192.168.178.1#53: timed out
;; communications error to 192.168.178.1#53: timed out
; <<>> DiG 9.20.2 <<>> passt.top @192.168.178.1
;; global options: +cmd
;; no servers could be reached
$ podman unshare --rootless-netns dig passt.top @84.116.46.21
; <<>> DiG 9.20.2 <<>> passt.top @84.116.46.21
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9750
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;passt.top. IN A
;; ANSWER SECTION:
passt.top. 300 IN A 88.198.0.164
;; Query time: 40 msec
;; SERVER: 84.116.46.21#53(84.116.46.21) (UDP)
;; WHEN: Sat Sep 21 10:45:11 CEST 2024
;; MSG SIZE rcvd: 54
$ podman unshare --rootless-netns dig passt.top @192.168.178.1 +tcp
;; Connection to 192.168.178.1#53(192.168.178.1) for passt.top failed: timed out.
;; no servers could be reached
So: it seems the problem is not related to TCP vs UDP; it works with a remote DNS server but not with my router's (I have also rebooted the router). If I query 84.116.46.21 directly from the host, this works... but not when setting it as a DNS forward resolver in pasta:
$ dig google.com @84.116.46.21
; <<>> DiG 9.20.2 <<>> google.com @84.116.46.21
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 719
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 63 IN A 142.250.179.174
;; Query time: 33 msec
;; SERVER: 84.116.46.21#53(84.116.46.21) (UDP)
;; WHEN: Sat Sep 21 10:49:29 CEST 2024
;; MSG SIZE rcvd: 55
$ pasta --config-net --dns-forward 84.116.46.21 dig google.com @169.254.0.1
;; communications error to 169.254.0.1#53: timed out
;; communications error to 169.254.0.1#53: timed out
;; communications error to 169.254.0.1#53: timed out
; <<>> DiG 9.20.2 <<>> google.com @169.254.0.1
;; global options: +cmd
;; no servers could be reached
@dgibson Sorry for not answering yesterday, had a pretty busy day. About your request "[...] one querying the nameserver directly from the host with dig, the second doing a similar query from the container via pasta. [...]": Do you mean something like the trace I provided on this comment, for the host?
Roughly, yes. The most important thing is getting the trace from the host, not from the container or pasta as that earlier trace was. But then it would also be useful to see the difference in trace between running dig directly on the host, and running dig within the container.
Besides this: I have just updated the system and rebooted it. I have also disabled the DNS server I had in 192.168.178.4 to use the one provided by my router (192.168.178.1), and I have removed one of the interfaces. None of this has helped in solving this issue :-/ (but I have a cleaner system, I guess xD). These are the results:
Right. I wouldn't particularly expect those changes to make any difference here... but then the symptoms we're seeing are so weird, I don't really know for sure.
So: seems that the problem is not related to tcp/udp connections, it works with a remote DNS server but not with that of my router (I have also rebooted the router).
That's what's so odd. The queries are failing only when they're both to the local nameserver and from pasta. Other combinations appear to be working.
Should I resolve querying 84.116.46.21 from the host, this works... but not when setting it as a dns forward resolver in pasta:
Actually, this one makes sense.
$ pasta --config-net --dns-forward 84.116.46.21 dig google.com @169.254.0.1
--dns-forward sets the address pasta forwards from, not the address it forwards to. So with this setting, pasta is no longer forwarding queries sent to 169.254.0.1, so a timeout is expected.
pasta has an internal concept of the "host DNS", which is where it directs queries once it's forwarded them. But... it looks like the only way to configure that is via the host's resolv.conf, which is a bit of an oversight. @sbrivio-rh, did I miss something?
@dgibson seems I have gotten somewhere now. If I remove the IP of my router from the resolv.conf, it works:
; <<>> DiG 9.20.2 <<>> google.com @169.254.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49114
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 112 IN A 142.250.179.174
;; Query time: 20 msec
;; SERVER: 169.254.0.1#53(169.254.0.1) (UDP)
;; WHEN: Mon Sep 23 18:11:46 CEST 2024
;; MSG SIZE rcvd: 55
Seems that there is something broken in the DNS cache my router maintains. The only explanation I can find is that pasta was using the first IP it found, which was not working, and then not trying any further (there are two other 84.116.* IPs there)?
@dgibson seems I have gotten somewhere now. If I remove the IP of my router from the resolv.conf, it works:
Seems that there is something broken on the DNS cache my router maintains. The only explanation I can find is that pasta was using the first IP that was being found, which was not working, and then it was not trying further (there are two other 84.116.* ips there)?
Yes, pasta forwards all queries to the first host-side resolv.conf entry; falling back to other servers isn't implemented there. It would actually be quite hard to do: we're forwarding at the packet level, and to implement fallback we'd need to actually interpret what the queries mean to some extent, which we don't really want to do.
What I'm still baffled by is that you seemed to be able to query your router as nameserver from the host, but it failed via pasta. I'm not sure what difference could cause that.
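For illustration, the selection behaviour described above boils down to something like this (a hypothetical Python sketch; pasta itself is written in C and this is not its code):

```python
def first_nameserver(resolv_conf_text):
    """Return the first 'nameserver' entry, the only one such a scheme ever uses."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None

conf = "nameserver 192.168.178.1\nnameserver 84.116.46.21\n"
# Every forwarded query goes to the first entry; with no fallback,
# a dead first entry (here, the router) stalls all forwarded queries.
print(first_nameserver(conf))  # 192.168.178.1
```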
I tried to reproduce this (same sets of routes, same addresses, with the help of a dummy device) in a network namespace with pasta --config-net, but everything works. I'll try in a VM next.
My focus is on why you'd get host unreachable with pasta for whatever DNS server while it's reachable by dig itself.
pasta has an internal concept of the "host DNS" which is where it directs queries to once it's forwarded them. But... it looks like the only way to configure that is via the host's resolv.conf, which is a bit of an oversight. @sbrivio-rh , did I miss something?
It's also possible to configure that using --dns / -D. The problem is that it stopped working recently, it seems. If I try:
./pasta -f --config-net -D 185.12.64.1 --dns-forward 5.5.5.5
where 185.12.64.1 is the first resolver address I have in /etc/resolv.conf, a dig google.com @5.5.5.5 becomes:
bind(211, {sa_family=AF_INET, sin_port=htons(59628), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
epoll_ctl(3, EPOLL_CTL_ADD, 211, {events=EPOLLIN, data={u32=54022, u64=4295021318}}) = 0
connect(211, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
recvmmsg(211, 0x7ffe7ce2e9c0, 1024, MSG_DONTWAIT, NULL) = -1 EAGAIN (Resource temporarily unavailable)
sendmmsg(211, [{msg_hdr={msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, msg_namelen=16, msg_iov=[{iov_base="2z\1 \0\1\0\0\0\0\0\1\6google\3com\0\0\1\0\1\0\0)\4"..., iov_len=51}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=51}], 1, MSG_NOSIGNAL) = 1
that is, for some reason, --dns-forward maps things to an unspecified address if that's overridden by -D.
Ah, yes, that stopped working (intentionally) with commit 0b25cac94eca ("conf: Treat --dns addresses as guest visible addresses").
I see your reasons there, but it's fairly problematic that we can't override DNS resolvers for --dns-forward with -D. I think we need to either implement another option (say, --host-dns or --dns-host) or revert that commit.
I see your reasons there, but it's fairly problematic that we can't override DNS resolvers for --dns-forward with -D. I think we need to either implement another option (say, --host-dns or --dns-host) or revert that commit.
This is implemented by https://archives.passt.top/passt-dev/20241003051402.2548424-1-david@gibson.dropbear.id.au/ by the way.
@flixman, as I can't reproduce this in a nested namespace, before trying to build something that looks like your setup and your router with VMs: could you capture DNS queries and responses (say, tcpdump -nvi eth0 port 53 -s0 -w dns.pcap) on the upstream interface of the host, with your router address back in /etc/resolv.conf, while trying one (successful) query using dig and one (failing) from the container with pasta?
I'm trying to find out if for whatever reason dig gets an answer from another server, which is not your router, while pasta doesn't try further resolvers so it won't.
hey @sbrivio-rh, my apologies for these last 10 days without any signs of life; we are going through a reorg here and everything is a bit chaotic. Give me a couple of days and I will try to reproduce this. Thank you!
@sbrivio-rh I have run the tests you requested, and here you have the traces:
working: dig google.com
dns_working.pcap.txt
failing: podman unshare --rootless-netns dig google.com
dns_failing.pcap.txt
@flixman thanks for the traces. Based on these it's actually looking like this might be a lot less mysterious than we thought.
The working trace shows a number of queries going to the home gateway 192.168.178.1, without response, then a query and response to 84.116.46.21, presumably a result of dig falling back to the next nameserver in the list. The failing trace is almost identical.
So my working theory is simply that DNS resolution never worked on the gateway, but while dig on the host was able to fall back to other nameservers, that doesn't happen under pasta. We're only listing the single virtual DNS resolver within the container, so dig itself can't fall back, and pasta can't fall back to secondary servers without a much more detailed understanding of what's going on with the queries than it possesses.
We can test this, by forcing dig on the host to only use the local gateway:
$ dig www.google.com @192.168.178.1
My expectation is that this will fail, much like dig inside pasta was failing. Failing on the host, it may give a more meaningful error message: at present we don't propagate UDP errors seen on the host as ICMP errors that the guest can see. That should be possible for at least some cases, and we'd like to do it, but it's a non-trivial job so it probably won't happen soon.
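The client-side fallback being tested here can be modeled with a toy stub resolver. This is a sketch of the general idea, not dig's actual logic:

```python
# Toy model of stub-resolver fallback (not dig's real implementation):
# try each nameserver in order; a None reply stands in for a timeout.
def resolve(nameservers, query, send):
    for ns in nameservers:
        answer = send(ns, query)  # None means the server timed out
        if answer is not None:
            return ns, answer
    return None, None

# Simulated setup from the traces: the router never answers,
# while the next resolver in the list does.
replies = {"192.168.178.1": None, "84.116.46.21": "142.250.179.174"}
send = lambda ns, q: replies.get(ns)

print(resolve(["192.168.178.1", "84.116.46.21"], "google.com", send))
# → ('84.116.46.21', '142.250.179.174')
```

When the container's resolv.conf lists only a single virtual resolver address (as with custom networks using aardvark-dns), the loop has nothing to fall back to, which matches the failing trace.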
We're only listing the single virtual DNS resolver within the container, so dig itself can't fall back, and pasta can't fall back to secondary servers without a much more detailed understanding of what's going on with the queries than it possesses.
FYI, that is not true. We add the other host resolvers to the container as well, so fallback should be possible in theory for any client:
$ podman run --rm quay.io/libpod/testimage:20240123 cat /etc/resolv.conf
nameserver 169.254.1.1
nameserver 192.168.188.1 <--my host resolver
What is different here, as mentioned in the original report, is if we use custom networks, because they will use aardvark-dns by default; in that case ONLY the aardvark-dns IP will be in resolv.conf, so no client can perform the retry. And while aardvark-dns did retry, what it failed to do in the past was to properly adjust the timeouts: if the client timeout is 5s, we also had a 5s timeout when we forwarded the request. This meant that even though aardvark-dns tries further resolvers, by the time we finally got the answer and tried to respond to the client, the client would long have closed its socket and given up. This was fixed in https://github.com/containers/aardvark-dns/pull/514 (not in any release yet)
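The timeout mismatch described above comes down to simple arithmetic. The numbers below are illustrative, not aardvark-dns's actual values or code:

```python
# Illustration of the timeout problem: if the forwarder waits the full
# client timeout on each upstream, a fallback answer always arrives
# after the client has already given up.
CLIENT_TIMEOUT = 5.0  # seconds the client waits before giving up

def answer_time(per_upstream_timeout, failing_upstreams):
    # Time spent waiting on dead upstreams before the working one is tried.
    return failing_upstreams * per_upstream_timeout

# Old behavior: forwarder timeout == client timeout, so with one dead
# upstream the answer can only arrive at t >= 5.0s — too late.
print(answer_time(5.0, 1) < CLIENT_TIMEOUT)      # → False

# Splitting the budget across two upstreams leaves time to answer.
print(answer_time(5.0 / 2, 1) < CLIENT_TIMEOUT)  # → True
```

This is the kind of budget-splitting the linked fix introduces: the per-upstream wait must be shorter than the client's overall timeout for a retried answer to still be deliverable.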
So if the issue really is that the first nameserver in resolv.conf is not working, then I would say this is expected for now, until you have the new aardvark-dns version
@dgibson Indeed, it fails. Seems everything is clear, then. Thank you very much!
@Luap99 sorry, didn't mean to imply that multiple container nameservers is impossible with podman. I'd been under the impression that only one was listed in this particular setup, although looking back, I'm not sure that's correct either. But in any case, it seems like the issue is explained.
@flixman fwiw, we recently added a --dns-host option to pasta to allow controlling where it forwards DNS queries on the host side, overriding the host's /etc/resolv.conf. Once that reaches a packaged version it might be useful to you. Then again, since resolution on your router seems to simply not work, it's probably best to remove it from the host's resolv.conf anyway (or else reconfigure the router so it does work).
Issue Description
Similarly to this issue, using podman run I can reach the internet:
However, if I create the network separately and then use it, I cannot:
returns
Error: authenticating creds for "<registry>": pinging container registry <registry>: Get "https://registry/v2/": dial tcp: lookup <registry>: Temporary failure in name resolution
Steps to reproduce the issue
Describe the results you received
The container cannot reach the internet
Describe the results you expected
The container works with a customized network the same it works with the default network.
podman info output
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
No
Additional environment details
No response
Additional information
When using the default network, when it works, I get an ip address on the address space of the host.