docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

Routing in container not working properly #288

Open gonvaled opened 6 years ago

gonvaled commented 6 years ago

Expected behavior

Routing from a container to an external host should work exactly the same as when running on the docker host.

Actual behavior

I have a container which:

Steps to reproduce the behavior

Output of docker version:

» docker version
Client:
 Version:      17.05.0-ce
 API version:  1.29
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.05.0-ce
 API version:  1.29 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:10:54 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

» docker info
Containers: 25
 Running: 7
 Paused: 0
 Stopped: 18
Images: 205
Server Version: 17.05.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 325
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-121-generic
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.721GiB
Name: polyphemus.wavilon.net
ID: 7Y4U:DWMH:XMZF:CHZM:WZ7Z:R3XN:WEZO:GIUK:H2GU:UKGK:4WEN:MEMS
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://127.0.0.1:1234/
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 registry.dgvmetro
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.)

» lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial
cpuguy83 commented 6 years ago

Does the subnet in the container conflict with a network on the VPN? Is the VPN filtering the traffic? Have you done a trace?
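
The first question (a subnet conflict) can be checked mechanically. A minimal sketch in plain shell arithmetic, using as example inputs one Docker bridge subnet and one VPN route from the report below (172.21.0.0/16 and 10.8.0.0/14); this is an illustration, not part of the original discussion:

```shell
# Check whether two IPv4 CIDRs overlap, e.g. a Docker bridge subnet
# against a VPN route.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

cidr_overlap() {
    # Two IPv4 networks overlap iff they agree under the shorter prefix.
    local n1 p1 n2 p2 p mask
    n1=$(ip_to_int "${1%/*}"); p1=${1#*/}
    n2=$(ip_to_int "${2%/*}"); p2=${2#*/}
    p=$(( p1 < p2 ? p1 : p2 ))
    mask=$(( (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF ))
    [ $(( n1 & mask )) -eq $(( n2 & mask )) ]
}

if cidr_overlap "172.21.0.0/16" "10.8.0.0/14"; then
    result="overlap"
else
    result="no overlap"
fi
echo "$result"
```

Running this pairwise over the bridge subnets and VPN routes shown below reports no overlaps, which matches the reporter's later finding that conflicting routes are not the cause.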

gonvaled commented 6 years ago

General settings

The VPN routes related to the machine I am pinging:

» route -n | grep '^10\.'
10.0.0.0        0.0.0.0         255.248.0.0     U     0      0        0 tunsnx
10.8.0.0        0.0.0.0         255.252.0.0     U     0      0        0 tunsnx
10.12.0.0       0.0.0.0         255.254.0.0     U     0      0        0 tunsnx
10.14.0.0       0.0.0.0         255.255.0.0     U     0      0        0 tunsnx
10.15.0.0       0.0.0.0         255.255.240.0   U     0      0        0 tunsnx
10.15.20.0      0.0.0.0         255.255.252.0   U     0      0        0 tunsnx
10.15.24.0      0.0.0.0         255.255.248.0   U     0      0        0 tunsnx
10.15.32.0      0.0.0.0         255.255.224.0   U     0      0        0 tunsnx
10.15.64.0      0.0.0.0         255.255.192.0   U     0      0        0 tunsnx
10.15.112.129   0.0.0.0         255.255.255.255 UH    0      0        0 tunsnx
10.15.128.0     0.0.0.0         255.255.128.0   U     0      0        0 tunsnx
10.16.0.0       0.0.0.0         255.240.0.0     U     0      0        0 tunsnx
10.32.0.0       0.0.0.0         255.224.0.0     U     0      0        0 tunsnx
10.64.0.0       0.0.0.0         255.192.0.0     U     0      0        0 tunsnx
10.128.0.0      0.0.0.0         255.128.0.0     U     0      0        0 tunsnx

The docker routes:

» route -n | grep br
172.21.0.0      0.0.0.0         255.255.0.0     U     0      0        0 br-1ad6626c6f69
172.30.0.0      0.0.0.0         255.255.0.0     U     0      0        0 br-ef87e1f9f7f5

Below are the results from two tests (before each test, the counters are reset).

Pinging from the container to a VPN address

The ping command:

root@b534761f685d:/# ping -f -c 1000 10.97.179.246   
PING 10.97.179.246 (10.97.179.246) 56(84) bytes of data.
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
--- 10.97.179.246 ping statistics ---
1000 packets transmitted, 0 received, 100% packet loss, time 12508ms

The counters:

Chain INPUT (policy ACCEPT 1852 packets, 160K bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1     1000 84000 DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
2        0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
3        0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
4     1000 84000 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
5        0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
6        0     0 ACCEPT     all  --  *      br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
7        0     0 DOCKER     all  --  *      br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
8        0     0 ACCEPT     all  --  br-ef87e1f9f7f5 !br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
9        0     0 ACCEPT     all  --  br-ef87e1f9f7f5 br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
10       0     0 ACCEPT     all  --  *      br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
11       0     0 DOCKER     all  --  *      br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
12       0     0 ACCEPT     all  --  br-1ad6626c6f69 !br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
13       0     0 ACCEPT     all  --  br-1ad6626c6f69 br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 1880 packets, 298K bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (3 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:443
2        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:80
3        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:22
4        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.5           tcp dpt:443
5        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.5           tcp dpt:80
6        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.3           tcp dpt:5000
7        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.4           tcp dpt:3142
8        0     0 ACCEPT     tcp  --  !br-1ad6626c6f69 br-1ad6626c6f69  0.0.0.0/0            172.21.0.2           tcp dpt:3141

Chain DOCKER-ISOLATION (1 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 DROP       all  --  br-1ad6626c6f69 docker0  0.0.0.0/0            0.0.0.0/0           
2        0     0 DROP       all  --  docker0 br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
3        0     0 DROP       all  --  br-ef87e1f9f7f5 docker0  0.0.0.0/0            0.0.0.0/0           
4        0     0 DROP       all  --  docker0 br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
5        0     0 DROP       all  --  br-ef87e1f9f7f5 br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
6        0     0 DROP       all  --  br-1ad6626c6f69 br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
7     1000 84000 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Nothing is dropped by iptables, but the ping does not work.
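
That reading of the counters can be made mechanical: list the FORWARD rules whose packet count moved during the test. The chain text below is abbreviated to the two rules that saw the 1000 flood-ping packets, captured as a sample so the sketch runs standalone; on a live host the numbers would come from `iptables -L FORWARD -v -n --line-numbers` after zeroing with `iptables -Z`:

```shell
# Abbreviated capture of the FORWARD chain after the VPN ping test: only
# rule 1 (DOCKER-ISOLATION, which RETURNs) and rule 4 (ACCEPT
# docker0 -> !docker0) counted packets, so the filter table is not where
# the traffic dies.
forward='1     1000 84000 DOCKER-ISOLATION
4     1000 84000 ACCEPT'
hits=$(printf '%s\n' "$forward" | awk '$2 > 0 {print $4}')
echo $hits   # targets of the rules that matched
```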

Pinging from the container to an internet address

The ping command:

root@b534761f685d:/# ping -f -c 1000 8.8.8.8   
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.

--- 8.8.8.8 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 9155ms
rtt min/avg/max/mdev = 7.201/9.439/32.915/2.619 ms, pipe 3, ipg/ewma 9.164/9.909 ms

The counters:

Chain INPUT (policy ACCEPT 571 packets, 73561 bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1     2000  168K DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
2     1000 84000 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
3        0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
4     1000 84000 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
5        0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
6        0     0 ACCEPT     all  --  *      br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
7        0     0 DOCKER     all  --  *      br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
8        0     0 ACCEPT     all  --  br-ef87e1f9f7f5 !br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
9        0     0 ACCEPT     all  --  br-ef87e1f9f7f5 br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
10       0     0 ACCEPT     all  --  *      br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
11       0     0 DOCKER     all  --  *      br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
12       0     0 ACCEPT     all  --  br-1ad6626c6f69 !br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
13       0     0 ACCEPT     all  --  br-1ad6626c6f69 br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 595 packets, 89773 bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (3 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:443
2        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:80
3        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:22
4        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.5           tcp dpt:443
5        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.5           tcp dpt:80
6        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.3           tcp dpt:5000
7        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.4           tcp dpt:3142
8        0     0 ACCEPT     tcp  --  !br-1ad6626c6f69 br-1ad6626c6f69  0.0.0.0/0            172.21.0.2           tcp dpt:3141

Chain DOCKER-ISOLATION (1 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 DROP       all  --  br-1ad6626c6f69 docker0  0.0.0.0/0            0.0.0.0/0           
2        0     0 DROP       all  --  docker0 br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
3        0     0 DROP       all  --  br-ef87e1f9f7f5 docker0  0.0.0.0/0            0.0.0.0/0           
4        0     0 DROP       all  --  docker0 br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
5        0     0 DROP       all  --  br-ef87e1f9f7f5 br-1ad6626c6f69  0.0.0.0/0            0.0.0.0/0           
6        0     0 DROP       all  --  br-1ad6626c6f69 br-ef87e1f9f7f5  0.0.0.0/0            0.0.0.0/0           
7     2000  168K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           
gonvaled commented 6 years ago

@cpuguy83 ping! :)

asgrim commented 5 years ago

@gonvaled did you manage to find a solution to this? We're seeing exactly the same behaviour on a corporate network with snxconnect too :cry:

synergiator commented 5 years ago

Same situation here with Docker version 19.03.1, build 74b1e89 (Ubuntu 18.04). The IP gets resolved, and other remote hosts are still reachable from the container despite the VPN; there are no conflicting routes.

karakays commented 4 years ago

Looks like the problem still remains, but there has been no update on it?

buehner commented 4 years ago

I can confirm: I'm having the same problem (Docker with snx) on a fresh Linux Mint 20, Docker 19.03.12. Connecting to another network with OpenVPN is no problem.

fdevibe commented 4 years ago

I am seeing what looks like the same thing. We are four people who recently upgraded to Fedora 32, and we are now all facing the same, or a similar, issue: Docker, a plain Docker bridge network, and the SNX VPN.

When trying a DNS request from a container to a server on the VPN, the request is transmitted on the docker0 interface (the Docker bridge network) to the host; we observe the UDP packets on this interface as expected. They are even routed correctly to the tunsnx (VPN) interface. However, when examining the traffic on tunsnx, we see that the packets' source IP address (in the IPv4 header) is now that of the physical network interface, not the address belonging to the tunsnx interface, meaning there is no way for the DNS server to send a response.

Details

159.XXX.XXX.XXX is a DNS server inside the VPN in question.

Command issued inside container:

$ dig @159.XXX.XXX.XXX github.com

Output from tcpdump -t -nn -n -i docker0 udp port 53:

IP 172.17.0.2.42994 > 159.XXX.XXX.XXX.53: 62059+ [1au] A? github.com. (51)

Output from tcpdump -t -nn -n -i tunsnx udp port 53:

IP 192.168.10.110.42994 > 159.XXX.XXX.XXX.53: 62059+ [1au] A? github.com. (51)

On the host, 192.168.10.110 is the address of the physical interface, the address of the tunsnx (VPN) interface is 159.XXX.XXX.YYY (same VPN as the DNS server, but not the same address), and the docker0 interface has 172.17.0.1.

fdevibe commented 4 years ago

After a significant amount of trial and error, we have narrowed the problem down to SNX in some environments. Looking at the tunnel interface configuration, the routing scope is set to 247, but only on our Fedora 32 systems; on Fedora 30 systems it is (correctly) set to global (which is 0):

$ ip -d address show tunsnx
25: tunsnx: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1200 qdisc fq_codel state UNKNOWN group default qlen 100
    link/none  promiscuity 0 minmtu 68 maxmtu 65535 
    tun type tun pi off vnet_hdr off persist off numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    inet XXX.XXX.XXX.XXX peer XXX.XXX.XXX.YYY/32 scope 247 tunsnx
       valid_lft forever preferred_lft forever

Now, interfaces with a non-global scope apparently won't be used when selecting the source address for masquerading. I believe (without having dug too deep into it) this is the code in question: https://code.woboq.org/linux/linux/net/netfilter/nf_nat_masquerade.c.html#44

We have come up with two alternative workarounds. One is to add an SNAT rule before the MASQUERADE rule:

iptables -t nat -I POSTROUTING 1 -s <docker_subnet> --out-interface tunsnx -j SNAT --to-source <IP address of tunsnx>
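
A sketch of automating this rule: pull the tunsnx address out of `ip -o -4 address show` output and feed it to `--to-source`. The `ip` output line below is a captured sample (with made-up addresses) so the extraction runs standalone; the `<docker_subnet>` placeholder is left as in the rule above:

```shell
# Sample `ip -o -4 address show tunsnx` output; field 4 is the local
# tunnel address that SNAT should use as the source.
sample='25: tunsnx    inet 159.20.30.40 peer 159.20.30.1/32 scope 247 tunsnx'
tun_ip=$(printf '%s\n' "$sample" | awk '{print $4}' | cut -d/ -f1)
echo "$tun_ip"

# On the real host (as root), the rule from above then becomes:
#   tun_ip=$(ip -o -4 address show tunsnx | awk '{print $4}' | cut -d/ -f1)
#   iptables -t nat -I POSTROUTING 1 -s <docker_subnet> -o tunsnx \
#       -j SNAT --to-source "$tun_ip"
```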

The other workaround is to change the scope of the interface. ip address change doesn't seem to be able to modify this, so we had to delete the interface address and re-create it. This means we also had to recreate the routing table, since deleting the tunnel address clears it. We did this with ip route save / ip route restore, and it seems to work.

To conclude: if you have this issue, I'd recommend checking the output of ip address show <device> and looking at the scope. If the scope is not global (0), masquerading probably won't work.
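
That check can be scripted; a small sketch that scans an `ip -o address show <device>` line for the scope field (the line below is a captured sample with made-up addresses, so the sketch runs standalone):

```shell
# Find the word following "scope" in an `ip -o address show` line.
line='65: tunsnx    inet 10.1.2.3 peer 10.1.2.1/32 scope 247 tunsnx'
scope=$(printf '%s\n' "$line" | awk '{for (i = 1; i < NF; i++) if ($i == "scope") print $(i + 1)}')
if [ "$scope" = "global" ]; then
    verdict="scope ok"
else
    verdict="non-global scope ($scope): masquerading will likely fail"
fi
echo "$verdict"
```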

kshcherban commented 3 years ago

@fdevibe thanks a lot, I have the same issue with SNX on Ubuntu 20.04.

gersondinis commented 3 years ago

Same issue here. (ip -d address show tunsnx, scope is set to 247) For some reason, right after booting Ubuntu 20.04, if I run docker first and then connect the VPN I get scope global, otherwise scope is set to 247. @fdevibe can you describe how you did the second workaround (ip route save / restore) to manually set tunsnx scope to global? Thanks in advance.

fdevibe commented 3 years ago

@gersondinis, this is what I currently use:

# Print "<local addr> <peer addr> <scope>" for the given interface.
function get_interface_data {
    interface=$1
    ip -o address show "$interface" | awk -F ' +' '{print $4 " " $6 " " $8}'
}

LOCAL_ADDRESS_INDEX=0
PEER_ADDRESS_INDEX=1
SCOPE_INDEX=2

function set_global_scope_if_required {
    data=($(get_interface_data tunsnx))
    [ "${data[$SCOPE_INDEX]}" == "global" ] && return

    echo "Setting global routing scope"
    # Deleting the address wipes the routes over tunsnx, so save the
    # routing table first and restore it after re-adding the address
    # with global scope.
    tmpfile=$(mktemp --suffix=snxwrapper-routes)
    sudo ip route save > "$tmpfile"
    sudo ip address del "${data[$LOCAL_ADDRESS_INDEX]}" peer "${data[$PEER_ADDRESS_INDEX]}" dev tunsnx
    sudo ip address add "${data[$LOCAL_ADDRESS_INDEX]}" dev tunsnx peer "${data[$PEER_ADDRESS_INDEX]}" scope global
    sudo ip route restore < "$tmpfile" 2>/dev/null
    rm "$tmpfile"
}

TBH, this way you might end up with a routing table that isn't 100% identical to the original; ip route restore does spit out a couple of warnings in my case, but I believe the resulting table is equivalent.

gersondinis commented 3 years ago

Thank you a lot @fdevibe! It works really well. And I didn't notice any routing issues.

andresgarita-dev commented 3 years ago

I have exactly the same error. I was on Kubuntu 18.04 with no errors (scope global); now I've moved to 20.04 and the scope changed to 247. I also tried manually copying the /usr/bin/snx file from 18.04, but I got SNX: Routing table configuration failed. Try to reconnect.

@fdevibe What should I do with that code?

fdevibe commented 3 years ago

@andresgaritaf I have it in a script: first I connect using snx, and then I call set_global_scope_if_required.

gersondinis commented 3 years ago

Here are my steps, maybe they help:

andresgarita-dev commented 3 years ago

@gersondinis thank you for the answer. I followed your steps and got the final message "Interface tunsnx is set to global scope. Done!". However, once I run the function the connection seems to drop: I can't reach anything anymore and have to snx -d and start over. :(

What build version are you using? I have build 800008074

gersondinis commented 3 years ago

Build 800010003

Fahl-Design commented 3 years ago

@gersondinis thank you! This worked for me on kernel 5.11.0-7614 with SNX build 800010003. I made a gist from your script and added dry_run and debug output (just for fun, nothing fancy): https://gist.github.com/Fahl-Design/ec1e066ec2ef8160d101dff96a9b56e8

hho commented 2 years ago

@gersondinis Thank you so much! This is awesome.

I only had to change the route save command to sudo ip route save dev ${interface} > ${tmpfile}, because Docker adds routes with the linkdown flag, and ip route restore doesn't work if dead- or linkdown-flagged routes are in the saved file. This might also be the problem @andresgaritaf hit (as the error message from ip route restore is suppressed by default).

hmtsoi commented 1 year ago

@gersondinis Thanks for writing up the function. I have modified it a bit so that it also works on zsh: https://gist.github.com/hmtsoi/83cc8be5358ef12d8a4029c25c954e19