moby / libnetwork


docker: firewall: interdocker communication broken when using internal networks #2647

Open svzieg opened 2 years ago

svzieg commented 2 years ago

What happened:

When containers are connected through an internal network, communication between those containers does not work.

What you expected to happen:

It shouldn't matter whether an internal-only network or a network with internet access is used: communication between containers attached to the network should work.

How to reproduce it (as minimally and precisely as possible):

docker network create --internal test
docker run --network test --name nginx -d nginx
docker run --network test curlimages/curl nginx

Failed to connect to nginx port 80 after 1 ms: Host is unreachable

With a non-internal network, however, everything works as expected and the curl container receives the default nginx page.

Anything else we need to know?:

We noticed that Docker didn't add the bridge interface to the firewalld "docker" zone. After adding that interface to the "docker" or "internal" zone manually, communication works again. Outbound communication to the web remains blocked, which is expected when using "internal" networks.
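
For illustration, a minimal sketch of that manual workaround (br-xxxxxxxxxxxx is a placeholder; the actual bridge name can be found via ip link or docker network inspect):

sudo firewall-cmd --permanent --zone=docker --add-interface=br-xxxxxxxxxxxx
sudo firewall-cmd --reload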

It seems to be related to https://github.com/firewalld/firewalld/issues/844. But I don't know if it's more of a firewalld or a Docker problem, so I opened two bug reports. See https://github.com/firewalld/firewalld/issues/887

Environment:

Schuwi commented 2 years ago

This issue also occurs with firewalld 0.8.2, see https://github.com/docker/for-linux/issues/1353. From what I can tell, the problem is that the bridge interface does not get assigned to any zone when the network is internal. In fact, Docker does not seem to interact with firewalld at all for internal networks.
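
For what it's worth, zone assignment can be checked per interface (a sketch; br-xxxxxxxxxxxx is a placeholder for the internal network's bridge):

firewall-cmd --get-zone-of-interface=br-xxxxxxxxxxxx

For an internal bridge this should report "no zone", while the bridge of a regular network reports "docker".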

chkpnt commented 2 years ago

On a SLES 15 SP2 with docker 20.10.9 and firewalld 0.5.5, inter-container communication using an internal network works fine, although the interface isn't added to the docker-zone.

On a SLES 15 SP3 with docker 20.10.12 and firewalld 0.9.3, it doesn't work. Adding the interface to the zone fixes the issue.

But as the interface is created via docker-compose, it is not permanent and its name is not known in advance. So is there another workaround, besides refraining from using an internal network?

UPDATE: Ah, I see. I could use something like this to get a static interface name:

networks:
  backend:
    # Internal network needs to be added to firewalld's docker-zone, see https://github.com/moby/libnetwork/issues/2647#issuecomment-1070981820
    internal: yes
    driver_opts:
      com.docker.network.bridge.name: br-internal
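
For reference, the same driver option should also work when creating the network directly from the CLI (a sketch; the network name backend is just an example):

docker network create --internal -o com.docker.network.bridge.name=br-internal backend
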
svzieg commented 2 years ago

@chkpnt that's a great workaround to at least create reliably named bridge networks that can be added to the firewall. We had actually copied the auto-generated bridge name and configured the firewall with that. As we don't recreate the network this works, but it is difficult to document. Having a deterministic name is the better solution.

So far we have only seen this issue when running Docker on RHEL or CentOS. Also, on some systems the interface must be added to the "docker" zone, on others it works with the "internal" zone.

ftc2 commented 1 year ago

internal networks are broken for me too. :(

lsb_release -d
Description:    Debian GNU/Linux 11 (bullseye)

docker --version
Docker version 20.10.17, build 100c701

firewall-cmd --version
0.9.3

cat /etc/firewalld/firewalld.conf | grep FirewallBackend
# FirewallBackend
FirewallBackend=nftables

iptables --version
iptables v1.8.7 (nf_tables)

edit: it seems to be working after changing firewalld's FirewallBackend to iptables!
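
In case it helps others, that change boils down to the following (a sketch; firewalld has to be restarted for the backend switch to take effect):

# in /etc/firewalld/firewalld.conf
FirewallBackend=iptables

sudo systemctl restart firewalld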

msilveirabr commented 1 year ago

I've just hit this issue.

These versions are just to report that my test system is up to date (updated after I hit the issue).

I've made an extensive comparison of this fc37 box against an Ubuntu 22.04 box with ufw disabled, with ICC both enabled and disabled for the test network. In iptables I found just one differing rule (an ACCEPT); replicating the Ubuntu firewall state made no difference. The netfilter and IPv4 sysctls showed no visible differences either.

The "trick" was indeed to get firewalld config set to use iptables instead of nftables...

sed -i -e '/^FirewallBackend=nftables/s/^\(.*\)$/#\1\n#Docker fix\nFirewallBackend=iptables/' /etc/firewalld/firewalld.conf

But it looks like there's only a tiny bit of code missing somewhere...

BTW: Even with nftables, enabling Docker swarm mode allows us to use the overlay driver, and the overlay driver in internal mode works fine with firewalld and nftables.

I'll try to gather more info as soon as I come back to debug this issue (I've got some deadlines to handle now).

LinAGKar commented 1 year ago

I didn't see any difference between iptables and nftables. I needed to add the Docker networks to the docker zone in firewalld.

msilveirabr commented 1 year ago

> I didn't see any difference between iptables and nftables. I needed to add the Docker networks to the docker zone in firewalld.

Please, try this example:

---
  version: "3.8"
  ############################################## SERVICES ##############################################
  services:
    network-testing1:
      container_name: ${COMPOSE_PROJECT_NAME}_busybox1
      entrypoint: [ "/bin/sh", "-c", "while :; do sleep 10; done" ]
      image: busybox:latest
      hostname: busybox1
      stop_grace_period: 1s
      restart: "no"
      networks:
        - backend_network
      sysctls:
        - net.ipv6.conf.all.disable_ipv6=1

    network-testing2:
      container_name: ${COMPOSE_PROJECT_NAME}_busybox2
      image: busybox:latest
      hostname: busybox2
      stop_grace_period: 1s
      restart: "no"
      entrypoint: [ "/bin/sh", "-c", "while :; do sleep 10; done" ]
      depends_on:
        - network-testing1
      sysctls:
        - net.ipv6.conf.all.disable_ipv6=1
      networks:
        backend_network:

  ############################################## NETWORKS ##############################################
  networks:
    backend_network:
      name: test111be
      driver: bridge
      internal: true
      driver_opts:
        com.docker.network.bridge.name: testbe111
        com.docker.network.bridge.enable_icc: 1

    frontend_network:
      name: test111fe
      driver: bridge
      driver_opts:
        com.docker.network.bridge.name: testfe111

Then, exec:

docker exec -it networking-test_busybox1 sh -c "ping -c1 busybox2 ;  ping -c1 busybox1 ; ping -c1 8.8.8.8"
docker exec -it networking-test_busybox2 sh -c "ping -c1 busybox1 ;  ping -c1 busybox2 ; ping -c1 8.8.8.8"

With the nftables backend, the first ping command is not successful; with the iptables backend, it is.

I tried to add the test111be network to the docker zone:

[ansible@dockerhostfc37.local ~]$ sudo firewall-cmd --zone=docker --add-interface=generic_be --permanent
success

UPDATED ANSWER (keeping what I wrote before for posterity)

It seems I had hit some issues with bridges not being deleted, etc. (probably I missed a docker compose down before bringing the stack up again). Setting icc to 0 works too (it isolates the containers from each other).

FACT: adding the bridge interface to the docker zone fixes the behaviour of the internal flag + the com.docker.network.bridge.enable_icc driver option.
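
As a quick check, using the names from the compose file above (assuming the testbe111 bridge exists and the stack is up):

sudo firewall-cmd --zone=docker --add-interface=testbe111
docker exec -it networking-test_busybox1 ping -c1 busybox2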

@LinAGKar indeed, adding the interface to the docker zone fixes this. Do you know if there is any upstream fix for this, or something we can do to make Docker add the interface to the docker zone automatically?

msilveirabr commented 1 year ago

Here's a fix to make dockerd add the network bridge to the docker zone when internal + ICC are enabled:

diff -udpr src.org/engine/libnetwork/drivers/bridge/setup_ip_tables.go src/engine/libnetwork/drivers/bridge/setup_ip_tables.go
--- src.org/engine/libnetwork/drivers/bridge/setup_ip_tables.go 2023-02-09 16:44:54.000000000 -0300
+++ src/engine/libnetwork/drivers/bridge/setup_ip_tables.go 2023-03-08 21:13:28.185709798 -0300
@@ -414,6 +414,13 @@ func setupInternalNetworkRules(bridgeIfa
    if err := programChainRule(version, outDropRule, "DROP OUTGOING", insert); err != nil {
        return err
    }
+   // Add the internal bridge interface to the firewalld docker zone if ICC is enabled
+   if icc {
+       iiptable := iptables.GetIptable(version)
+       if err := iiptable.SetupInternalICC(bridgeIface, insert); err != nil {
+           return err
+       }
+   }
    // Set Inter Container Communication.
    return setIcc(version, bridgeIface, icc, insert)
 }
diff -udpr src.org/engine/libnetwork/iptables/iptables.go src/engine/libnetwork/iptables/iptables.go
--- src.org/engine/libnetwork/iptables/iptables.go  2023-02-09 16:44:54.000000000 -0300
+++ src/engine/libnetwork/iptables/iptables.go  2023-03-08 21:08:00.274026294 -0300
@@ -173,6 +173,23 @@ func (iptable IPTable) LoopbackByVersion
    return "127.0.0.0/8"
 }

+// SetupInternalICC adds or removes the internal bridge interface from the firewalld docker zone (ICC)
+func (iptable IPTable) SetupInternalICC(bridgeIface string, include bool) error {
+   // Either add or remove the interface from the firewalld zone
+   if firewalldRunning {
+       if include {
+           if err := AddInterfaceFirewalld(bridgeIface); err != nil {
+               return fmt.Errorf("Failed to add interface %s to firewalld zone: %s", bridgeIface, err.Error())
+           }
+       } else {
+           if err := DelInterfaceFirewalld(bridgeIface); err != nil {
+               return fmt.Errorf("Failed to remove interface %s from firewalld zone: %s", bridgeIface, err.Error())
+           }
+       }
+   }
+   return nil
+}
+
 // ProgramChain is used to add rules to a chain
 func (iptable IPTable) ProgramChain(c *ChainInfo, bridgeName string, hairpinMode, enable bool) error {
    if c.Name == "" {

I had to add the code inline because, for some reason, GitHub is not allowing me to attach the file.

This patch is working as intended, but it needs a look from a real programmer.
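
With a patched daemon, the behaviour can be verified by creating an internal network and checking the zone assignment (a sketch; br-test and internal-test are example names):

docker network create --internal -o com.docker.network.bridge.name=br-test internal-test
sudo firewall-cmd --zone=docker --list-interfaces
# the output should now include br-test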