containers / netavark

Container network stack
Apache License 2.0

DNS resolution in a container behaves differently with and without "--dns" in the "podman run" command, even though /etc/resolv.conf has the same content in both containers #855

Open 839735495 opened 11 months ago

839735495 commented 11 months ago

ENV

- [Rootful]
- aardvark-dns-1.5.0-2.module+el8.8.0+19993+47c8ef84.x86_64
- netavark-1.5.1-2.module+el8.8.0+19993+47c8ef84.x86_64
- networkBackend is netavark
- OS: RHEL 8.8

What is the issue

DNS resolution in a container behaves differently with and without "--dns" in the "podman run" command, even though /etc/resolv.conf has the same content in both containers.

How to reproduce

  1. Create a dual-stack network:

    podman network create --driver=bridge --subnet=192.168.230.0/25 --ipv6 --subnet=fdf8:192:168:230::/121 foo

  2. Run two containers, the first with --dns and the second without:

    podman run -d -it --dns=192.168.230.1 --dns=fdf8:192:168:230::1 --dns=${my_dns} --name foo_1 --network foo ${my_image}
    podman run -d -it --name foo_2 --network foo ${my_image}

  3. My iptables and ip6tables rules:

    
    # iptables -nvL INPUT
    Chain INPUT (policy ACCEPT 923K packets, 113M bytes)
    pkts bytes target     prot opt in     out     source               destination
    46  3438 ACCEPT     udp  --  *      *       0.0.0.0/0            192.168.230.1        udp dpt:53
    34  4436 REJECT     all  --  *      *       0.0.0.0/0            192.168.230.0/25     reject-with icmp-port-unreachable

    # ip6tables -nvL INPUT
    Chain INPUT (policy ACCEPT 533K packets, 64M bytes)
    pkts bytes target     prot opt in     out     source               destination
    18  1728 ACCEPT     udp                       ::/0                 fdf8:192:168:230::1  udp dpt:53
    36  5316 REJECT     all                       ::/0                 fdf8:192:168:230::/121 reject-with icmp6-port-unreachable


  4. /etc/resolv.conf in foo_1 and foo_2 (identical):

    [root@05e297721834 /]# cat /etc/resolv.conf
    search dns.podman [my_search_domain]
    nameserver 192.168.230.1
    nameserver fdf8:192:168:230::1
    nameserver [my_dns]


  5. In foo_1, DNS resolution is very slow (8+ seconds). In foo_2, DNS resolution runs at normal speed (under 1 second).

    # podman exec -it foo_1 bash
    [root@05e297721834 /]# time getent ahosts ${outside_hostname}
    10.188.186.222 STREAM ${outside_hostname}.${my_search_domain}
    10.188.186.222 DGRAM
    10.188.186.222 RAW

    real 0m8.201s
    user 0m0.000s
    sys 0m0.003s
    [root@05e297721834 /]# exit
    exit

    # podman exec -it foo_2 bash
    [root@a1f7f6d70e0f /]# time getent ahosts ${outside_hostname}
    10.188.186.222 STREAM ${outside_hostname}.${my_search_domain}
    10.188.186.222 DGRAM
    10.188.186.222 RAW

    real 0m0.196s
    user 0m0.000s
    sys 0m0.003s


**RCA**
Why is resolution in foo_1 so slow?
Here is part of the tcpdump output captured while running getent in foo_1:

    07:32:00.980500 IP 192.168.230.88.59859 > 192.168.230.1.domain: 21298+ A? net186-host222.${my_search_domain}. (48)
    07:32:00.980539 IP 192.168.230.88.59859 > 192.168.230.1.domain: 15373+ AAAA? net186-host222.${my_search_domain}. (48)
    07:32:05.985236 IP6 fdf8:192:168:230::58.39687 > fdf8:192:168:230::1.domain: 21298+ A? net186-host222.${my_search_domain}. (48)
    07:32:05.985287 IP6 fdf8:192:168:230::58.39687 > fdf8:192:168:230::1.domain: 15373+ AAAA? net186-host222.${my_search_domain}. (48)
    07:32:08.988425 IP 192.168.230.88.42389 > 172.16.8.12.domain: 21298+ A? net186-host222.${my_search_domain}. (48)
    07:32:08.988548 IP 192.168.230.88.42389 > 172.16.8.12.domain: 15373+ AAAA? net186-host222.${my_search_domain}. (48)
    07:32:09.181097 IP 172.16.8.12.domain > 192.168.230.88.42389: 21298 1/0/0 A 10.188.186.222 (64)
    07:32:09.181449 IP 172.16.8.12.domain > 192.168.230.88.42389: 15373 0/1/0 (108)
    07:32:11.176015 IP 192.168.230.1.domain > 192.168.230.88.59859: 21298* 1/0/0 A 10.188.186.222 (64)
    07:32:11.176063 IP 192.168.230.88 > 192.168.230.1: ICMP 192.168.230.88 udp port 59859 unreachable, length 100
    07:32:11.026209 IP6 fe80::4c6e:36ff:fe0b:5880 > fdf8:192:168:230::1: ICMP6, neighbor solicitation, who has fdf8:192:168:230::1, length 32
    07:32:11.026262 IP6 fdf8:192:168:230::1 > fe80::4c6e:36ff:fe0b:5880: ICMP6, destination unreachable, unreachable port [|icmp6]

It looks like the DNS responses were blocked by the second iptables/ip6tables rule in the INPUT chain, and the result finally came from my custom DNS server.
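To see where the 8 seconds go, here is a small sketch that only does arithmetic on timestamps copied from the foo_1 capture. The event labels and helper names are mine, chosen for illustration; nothing here queries DNS.

```python
from datetime import datetime

# First packet of each step, copied from the foo_1 tcpdump output.
events = [
    ("query to 192.168.230.1", "07:32:00.980500"),
    ("retry to fdf8:192:168:230::1", "07:32:05.985236"),
    ("retry to 172.16.8.12", "07:32:08.988425"),
    ("answer from 172.16.8.12", "07:32:09.181097"),
]


def parse(ts):
    """Parse a tcpdump-style HH:MM:SS.ffffff timestamp."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def gaps(evts):
    """Return (label, seconds since the previous event) for every event after the first."""
    out = []
    for (_, prev), (label, cur) in zip(evts, evts[1:]):
        out.append((label, (parse(cur) - parse(prev)).total_seconds()))
    return out


step_gaps = gaps(events)
total = (parse(events[-1][1]) - parse(events[0][1])).total_seconds()

for label, secs in step_gaps:
    print(f"{secs:6.3f}s until {label}")
print(f"{total:6.3f}s total")
```

The waits on the first two nameservers (about 5 s and 3 s) account for almost all of the 8.2 s that `time getent` reported. The third nameserver answered in roughly 0.19 s.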

However, in foo_2 nothing was blocked and the answer came back quickly:

    07:36:41.221171 IP 192.168.230.89.34653 > 192.168.230.1.domain: 35145+ A? net186-host222.${my_search_domain}. (48)
    07:36:41.221223 IP 192.168.230.89.34653 > 192.168.230.1.domain: 48714+ AAAA? net186-host222.${my_search_domain}. (48)
    07:36:41.413830 IP 192.168.230.1.domain > 192.168.230.89.34653: 35145 1/0/0 A 10.188.186.222 (64)
    07:36:41.413909 IP 192.168.230.1.domain > 192.168.230.89.34653: 48714 0/1/0 (108)



**Question**
What is the difference in behavior between running "podman run" with --dns and without it? Why is the DNS response in foo_2 not blocked by the iptables rule?

Luap99 commented 11 months ago

Please try with the latest podman and netavark versions.

xiaoyar commented 11 months ago

This has nothing to do with the firewall rules. The immediate cause is that the --dns option passed to podman run names the same DNS server as the one belonging to the network the container is connected to.

For ease of description, let's use network-nameserver to refer to the dns_server of the network connected to the container.

Below is the root cause analysis.

Moving on, I think the first thing is that we should definitely not specify network-nameserver through --dns when executing podman run or podman create. On the other hand, aardvark-dns could also be improved: when forwarding a DNS request to external name servers, it could check whether any nameserver in the list is the same as one of its local listen IPs, and skip it if so. https://github.com/containers/aardvark-dns/blob/main/src/dns/coredns.rs#L332 What do you think, @Luap99?
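To make the suggestion concrete, here is a minimal sketch of the "skip our own listen IPs" check. It is written in Python for brevity and is not aardvark-dns code; the function name `drop_own_listen_ips` is hypothetical. The addresses are the ones from this issue (172.16.8.12 is the custom DNS server seen in the tcpdump output).

```python
import ipaddress


def drop_own_listen_ips(upstreams, listen_ips):
    """Return upstreams minus any address the resolver itself listens on, order preserved."""
    own = {ipaddress.ip_address(ip) for ip in listen_ips}
    return [s for s in upstreams if ipaddress.ip_address(s) not in own]


listen_ips = ["192.168.230.1", "fdf8:192:168:230::1"]
# The IPv6 entry is deliberately spelled differently from the listen IP.
upstreams = ["192.168.230.1", "fdf8:192:168:230:0::1", "172.16.8.12"]

usable = drop_own_listen_ips(upstreams, listen_ips)
print(usable)
```

Comparing parsed addresses instead of raw strings matters for IPv6, where the same address can be written in more than one way.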

Luap99 commented 11 months ago

> That's because the container sends DNS requests to network-nameserver, which is aardvark-dns. If a name cannot be resolved, aardvark-dns forwards the request to external name servers. Because of the new feature above, the custom DNS server specified by --dns is the first choice; here that is network-nameserver, so the request is forwarded to aardvark-dns itself and gets stuck.

Ah yes, I missed that part. You should not pass the aardvark-dns address with --dns; give only the upstream resolvers via that flag. Ignoring our own listening IPs sounds like a reasonable suggestion. The alternative would be to error out, because we should not allow users to end up in an infinite recursion.
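For comparison, the "error out" alternative could look roughly like the sketch below. It is a Python illustration only, not podman or aardvark-dns code; `check_custom_dns` and `SelfReferencingDnsError` are hypothetical names.

```python
import ipaddress


class SelfReferencingDnsError(ValueError):
    """Raised when a custom DNS server is one of the resolver's own listen IPs."""


def check_custom_dns(custom_dns, listen_ips):
    """Return custom_dns unchanged, or raise if any entry would forward to ourselves."""
    own = {ipaddress.ip_address(ip) for ip in listen_ips}
    clashes = [s for s in custom_dns if ipaddress.ip_address(s) in own]
    if clashes:
        raise SelfReferencingDnsError(
            "custom DNS servers point back at the network resolver: " + ", ".join(clashes)
        )
    return list(custom_dns)


listen_ips = ["192.168.230.1", "fdf8:192:168:230::1"]

try:
    check_custom_dns(["192.168.230.1", "172.16.8.12"], listen_ips)
    rejected = False
except SelfReferencingDnsError as err:
    rejected = True
    print(err)
```

Failing early makes the misconfiguration visible at container creation time, while silently skipping the entry keeps existing setups working. Which trade-off is better is for the maintainers to decide.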

xiaoyar commented 11 months ago

Thank you @Luap99 for your prompt response.

> Ignoring our own listening ips sounds like a reasonable suggestion. Alternative would be to error out because we should not allow user to end up in a infinite recursion.

Shall I open an aardvark-dns ticket to address this?

Luap99 commented 11 months ago

Yes please

xiaoyar commented 11 months ago

Sure, https://github.com/containers/aardvark-dns/issues/415 is filed.

xiaoyar commented 11 months ago

Per the current implementation, the name servers defined in the host's /etc/resolv.conf are copied into the container's /etc/resolv.conf. However, when a container is connected to a network with dns_enabled set to true, aardvark-dns also tries the name servers from the host's /etc/resolv.conf. That means a host name server that fails to resolve a name is tried more than once, which is bad for performance. It seems the name servers in the host's /etc/resolv.conf should not be copied into the container's /etc/resolv.conf when the container is connected to a network with dns_enabled set to true. What do you think, @Luap99? Shall we open a podman issue to address this?
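A toy model of the duplicate attempts, to illustrate the point. It assumes a resolver that simply moves to the next nameserver when one fails, and that aardvark-dns forwards to the host's name servers. `HOST_NS` is a made-up documentation address and `upstream_attempts` is a hypothetical helper, not real podman or aardvark-dns behavior.

```python
def upstream_attempts(container_nameservers, aardvark_ip, aardvark_upstreams, failing):
    """List every upstream server contacted, in order, for a single lookup."""
    attempts = []
    for ns in container_nameservers:
        # aardvark-dns forwards to its upstreams; any other entry is asked directly.
        targets = aardvark_upstreams if ns == aardvark_ip else [ns]
        for server in targets:
            attempts.append(server)
            if server not in failing:
                return attempts  # the first server that answers ends the lookup
    return attempts


AARDVARK = "192.168.230.1"
HOST_NS = "192.0.2.53"  # hypothetical host name server that cannot resolve the name

# Container resolv.conf lists aardvark-dns and the host name server.
with_host_ns = upstream_attempts([AARDVARK, HOST_NS], AARDVARK, [HOST_NS], {HOST_NS})
# Container resolv.conf lists only aardvark-dns.
aardvark_only = upstream_attempts([AARDVARK], AARDVARK, [HOST_NS], {HOST_NS})

print(with_host_ns.count(HOST_NS), aardvark_only.count(HOST_NS))
```

In this model the failing host name server is contacted twice when it also appears in the container's resolv.conf, and once when only aardvark-dns is listed.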

Luap99 commented 11 months ago

> It seems the name servers in host's /etc/resolv.conf should not be populated in container's /etc/resolv.conf when the container is connected to a network with dns_enabled is true.

That is already the case with the latest version.