lxc / incus

Powerful system container and virtual machine manager
https://linuxcontainers.org/incus
Apache License 2.0

Network forward routing breaks when used by container in target network #1175

Closed Linkster78 closed 1 month ago

Linkster78 commented 2 months ago

Required information

Issue description

Take two containers in the same network, say "client" and "server". If a network forward sends connections for the public IP to the internal "server" container, the "client" can connect to "server" directly but cannot connect to "server" through the public IP/network forward.

This might be expected behaviour; I must admit that I'm not extremely familiar with the Linux networking stack.

Steps to reproduce

1. Set up both containers and the forward.

    export PUBLIC_IP="..."

    incus admin init --minimal
    incus launch images:debian/bookworm server
    incus exec server -- apt install -y netcat-openbsd # used for testing
    incus copy server client && incus start client
    incus network forward create incusbr0 $PUBLIC_IP target_address=$(incus ls -c4 -fcsv server | awk '{print $1}')

2. Listen on a certain port on the "server" container, try to connect to it directly from the "client" container. Should work.
3. Listen on a certain port on the "server" container, try to connect to the $PUBLIC_IP from the outside. Should work.
4. Listen on a certain port on the "server" container, try to connect to the $PUBLIC_IP from the "client" container. Should not work.
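
Steps 2 and 4 can be sketched as a pair of shell helpers. This is a minimal sketch, not part of the original report: the helper names `start_listener` and `probe_from_client` and the choice of port 10 are assumptions for illustration, and the `nc` options are those of netcat-openbsd as installed in step 1.

```shell
#!/bin/sh
# Hypothetical helpers for steps 2 and 4; names and port are illustrative.
PORT=10

# Start a listener inside the "server" container (netcat-openbsd syntax).
# Runs in the background so the probe can be issued from the same shell.
start_listener() {
    incus exec server -- nc -l "$PORT" </dev/null &
}

# Attempt a TCP connect from inside the "client" container.
# -z only probes the port, -w 3 gives up after 3 seconds.
probe_from_client() {
    incus exec client -- nc -z -w 3 "$1" "$PORT"
}

# Usage (SERVER_IP is the "server" container address, an assumed variable):
#   start_listener; probe_from_client "$SERVER_IP"   # step 2
#   start_listener; probe_from_client "$PUBLIC_IP"   # step 4
```

The exit status of `probe_from_client` tells the two cases apart: zero when the handshake completes, non-zero when it is reset or times out.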

Information to attach

Captured on the host; port 10 could be replaced with any other port.

10.36.104.232 is the "client" container

10.36.104.154 is the "server" container

178.156.132.168 is the public IP

    $ tcpdump -ni any port 10
    tcpdump: data link type LINUX_SLL2
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
    05:39:18.875191 vethfa027476 P IP 10.36.104.232.60942 > 178.156.132.168.10: Flags [S], seq 2192346995, win 64240, options [mss 1460,sackOK,TS val 2531906438 ecr 0,nop,wscale 7], length 0
    05:39:18.875199 incusbr0 In IP 10.36.104.232.60942 > 178.156.132.168.10: Flags [S], seq 2192346995, win 64240, options [mss 1460,sackOK,TS val 2531906438 ecr 0,nop,wscale 7], length 0
    05:39:18.875237 incusbr0 Out IP 10.36.104.232.60942 > 10.36.104.154.10: Flags [S], seq 2192346995, win 64240, options [mss 1460,sackOK,TS val 2531906438 ecr 0,nop,wscale 7], length 0
    05:39:18.875242 vethb451bc43 Out IP 10.36.104.232.60942 > 10.36.104.154.10: Flags [S], seq 2192346995, win 64240, options [mss 1460,sackOK,TS val 2531906438 ecr 0,nop,wscale 7], length 0
    05:39:18.875261 vethb451bc43 P IP 10.36.104.154.10 > 10.36.104.232.60942: Flags [S.], seq 2012877585, ack 2192346996, win 65160, options [mss 1460,sackOK,TS val 1854899413 ecr 2531906438,nop,wscale 7], length 0
    05:39:18.875263 vethfa027476 Out IP 10.36.104.154.10 > 10.36.104.232.60942: Flags [S.], seq 2012877585, ack 2192346996, win 65160, options [mss 1460,sackOK,TS val 1854899413 ecr 2531906438,nop,wscale 7], length 0
    05:39:18.875270 vethfa027476 P IP 10.36.104.232.60942 > 10.36.104.154.10: Flags [R], seq 2192346996, win 0, length 0
    05:39:18.875272 vethb451bc43 Out IP 10.36.104.232.60942 > 10.36.104.154.10: Flags [R], seq 2192346996, win 0, length 0

Linkster78 commented 1 month ago

Revisiting this with a fresh look, it seems like this is due to the Incus network forward only performing DNAT and not SNAT. When the SYN packet reaches the "server" container, its source IP is still that of the "client", since no SNAT was applied. Because the "client" is in the same subnet, the "server" container tries to continue the handshake by talking directly to the "client".

The "client" container never attempted to initiate a connection with the "server" container's IP; it tried to initiate one with the public forwarded IP. The SYN-ACK arriving from the "server" address therefore matches no connection the "client" knows about, so the attempt is reset and the TCP handshake is never completed.
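
The "same subnet" condition above can be checked with a little shell arithmetic. This is a rough sketch only: the helper names `ip_to_int` and `same_subnet` are hypothetical, and the /24 prefix in the usage lines is an assumption, since the prefix length of the bridge is not stated here.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to an integer.
# The body runs in a subshell so the IFS change does not leak.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# same_subnet A B PREFIX: succeeds when A and B share the same /PREFIX network.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
}

# Usage with the addresses from the capture (the /24 is an assumption):
#   same_subnet 10.36.104.232 10.36.104.154 24   # client vs. server: same network
#   same_subnet 10.36.104.232 178.156.132.168 24 # client vs. public IP: different
```

When the check succeeds for the SYN's source and the "server" address, the reply has no reason to pass back through the host's NAT, which is the situation seen in the capture.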

Is there a reason why SNAT'ing is not performed? Or am I missing something entirely?

Never mind, I can see why that would be a problem: SNAT would obfuscate the source IP address. I imagine that this might just be an edge case that Incus shouldn't be expected to support, but I'm curious to hear your thoughts.

(This could probably be fixed by applying SNAT if the connection is coming from the same subnet as the target_address, but that seems a bit inconsistent to me)
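
For illustration, the conditional SNAT idea (often called hairpin NAT) could be expressed as a separate nftables table that masquerades only DNAT'ed connections coming from the bridge subnet. This is a sketch under assumptions, not something Incus provides: the `hairpin_ruleset` helper and the `hairpin_sketch` table name are hypothetical, the subnet has to be supplied by hand, and its interaction with the rules Incus itself installs is untested.

```shell
#!/bin/sh
# hairpin_ruleset SUBNET TARGET: print an nftables ruleset that masquerades
# connections which were DNAT'ed to TARGET and originate from SUBNET.
# Table and chain names are hypothetical.
hairpin_ruleset() {
    cat <<EOF
table ip hairpin_sketch {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        ip saddr $1 ip daddr $2 ct status dnat masquerade
    }
}
EOF
}

# Usage, as root (the /24 subnet is an assumption):
#   hairpin_ruleset 10.36.104.0/24 10.36.104.154 | nft -f -
```

With such a rule the "server" would see the connection as coming from the host's bridge address, so its reply returns through the host, where the translation can be reversed. The cost is the one noted above: the "server" no longer sees the real "client" address for these connections.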