NixOS / nixpkgs

Nix Packages collection & NixOS

Networking between docker containers fails unless firewall is disabled #298165

Open Ralith opened 3 months ago

Ralith commented 3 months ago

Describe the bug

Docker containers don't seem to be able to communicate amongst themselves unless networking.firewall.enable = false is set, which is not desirable for obvious reasons. Setting networking.firewall.trustedInterfaces = [ "docker0" ]; is not sufficient.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Clone e.g. https://github.com/quic-interop/quic-interop-runner
  2. Run any test, e.g. python3 run.py -d -s quic-go -c quic-go -t handshake
  3. Note errors in output: client | Downloading files failed: timeout: no recent network activity, Test: handshake took 18.309553s, status: TestResult.FAILED
  4. Repeat after switching to networking.firewall.enable = false. Note test success.

Expected behavior

Containers should be able to communicate with each other, allowing tests to pass.

Notify maintainers

@offlinehacker @vdemeester @periklis @amaxine

Metadata

 - system: `"x86_64-linux"`
 - host os: `Linux 6.1.63, NixOS, 23.11 (Tapir), 23.11.750.7c4c20509c43`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.1`
 - channels(root): `"nixos-23.11"`
 - channels(ralith): `""`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

SuperSandro2000 commented 3 months ago

Can you reproduce this problem only with docker run command and a base image and without a third party repo? What is the minimal reproducer?

Ralith commented 3 months ago

I've never had any luck trying to drive docker by hand, despite a few attempts. nc -l -u 0.0.0.0 1234 on one container, and nc -u <ip> 1234 on the other, then sending a few lines of input (i.e. attempting to exchange UDP packets) should be plenty to test, but cloning the cited repo is probably easier.

SuperSandro2000 commented 3 months ago

Did you try the example you have with nc? Can you reproduce the issue with that?

Ralith commented 3 months ago

How would I do that? I have no experience with docker, I'm just trying to use this software that works fine elsewhere.

squat commented 1 month ago

Hi, from my tests, the issue is not that container-to-container communication is broken, but rather that the firewall drops forwarded packets. This causes problems with kind (e.g. https://github.com/kubernetes-sigs/kind/issues/3443) and other projects that expect containers to act as gateways.

Here's a complete reproduction you can run in a single terminal:

# Set up a new Docker network so that we can resolve the container name to an IP address.
docker network create nixos-test
# Create a container named `one` and configure it to respond to the additional IP address 10.5.0.1.
docker run --rm -d --net nixos-test --cap-add NET_ADMIN --name one alpine sh -c 'ip a add dev eth0 10.5.0.1; tail -f /dev/null'
# Create a container named `two`, configure it to route packets to 10.5.0.1 via `one`, and ping 10.5.0.1.
docker run --rm -it --name two --net nixos-test --cap-add NET_ADMIN alpine sh -c 'ip route add 10.5.0.1 via $(nslookup one | tail -n 2 | head -n1 | cut -f2 -d" "); ping 10.5.0.1'

If the NixOS firewall is enabled, then ping will show no output. If you leave the command running and disable the firewall, then ping will produce regular output.

To clean up the test:

docker kill one two
docker network rm nixos-test
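
For anyone who wants a pass/fail answer instead of watching ping output, the reproduction above could be wrapped roughly as follows. This is only a sketch: the function names `check_forwarding` and `cleanup`, the variables `NET` and `TARGET`, and the `ping -c 3 -W 2` count and timeout are illustrative choices that do not come from this thread. The container names, network name, and address are the ones used in the reproduction. It assumes the `alpine` image's `ping` accepts `-c` and `-W`.

```shell
#!/bin/sh
# Sketch: non-interactive version of the forwarding check from the reproduction.
# NET/TARGET, the function names, and the ping count/timeout are assumptions
# made for illustration; they are not part of the original commands.

NET=nixos-test
TARGET=10.5.0.1

# Remove the helper container and the test network, ignoring errors.
cleanup() {
    docker kill one >/dev/null 2>&1
    docker network rm "$NET" >/dev/null 2>&1
}

check_forwarding() {
    docker network create "$NET" >/dev/null || return 2

    # Container `one` answers for the extra address 10.5.0.1.
    docker run --rm -d --net "$NET" --cap-add NET_ADMIN --name one alpine \
        sh -c "ip a add dev eth0 $TARGET; tail -f /dev/null" >/dev/null \
        || { cleanup; return 2; }

    # Container `two` routes 10.5.0.1 via `one`, sends three pings, then exits.
    docker run --rm --net "$NET" --cap-add NET_ADMIN --name two alpine \
        sh -c "ip route add $TARGET via \$(nslookup one | tail -n 2 | head -n1 | cut -f2 -d' '); ping -c 3 -W 2 $TARGET" \
        >/dev/null 2>&1
    status=$?

    cleanup

    if [ "$status" -eq 0 ]; then
        echo "forwarding OK"
    else
        echo "forwarding blocked or setup failed"
        return 1
    fi
}
```

After sourcing the file, run `check_forwarding` once with the firewall enabled and once with it disabled, then compare the two results. A failure can also mean the route setup inside `two` failed, so the interactive reproduction remains the more direct check.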