docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

Containers with IPv6 not accessible from the Internet (NDP proxy not working) #928

Open pertoft opened 4 years ago

pertoft commented 4 years ago

Expected behavior

When Docker with IPv6 (sharing an IPv6 /64 prefix between hosts and containers) launches a container, I expect it to be reachable from the Internet and from neighbouring Docker hosts. This should be possible when NDP proxying is configured (adding a container's IPv6 address as a neighbour proxy entry on the upstream eth0 interface); otherwise the containers cannot be accessed from outside the Docker host.

08:51:10.160140 IP6 2a03:dc80:0:f44f:ff::2 > nonprod-web301.netic.dk: ICMP6, echo request, seq 9, length 64
08:51:10.160187 IP6 nonprod-web301.netic.dk > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2a03:dc80:0:f44f:ff::2, length 32
08:51:11.157148 IP6 nonprod-web301.netic.dk > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2a03:dc80:0:f44f:ff::2, length 32
08:51:11.157147 IP6 fe80::250:56ff:feb9:c6dd > fe80::250:56ff:feb9:c429: ICMP6, neighbor solicitation, who has fe80::250:56ff:feb9:c429, length 32
08:51:11.157367 IP6 fe80::250:56ff:feb9:c429 > fe80::250:56ff:feb9:c6dd: ICMP6, neighbor advertisement, tgt is fe80::250:56ff:feb9:c429, length 24
08:51:11.160486 IP6 2a03:dc80:0:f44f:ff::2 > nonprod-web301.netic.dk: ICMP6, echo request, seq 10, length 64

Actual behavior

Containers on a Docker host with a container IPv6 network (a /80 prefix carved from the same /64 subnet) are not reachable from other Docker hosts or from the Internet.

08:51:06.152450 IP6 fe80::250:56ff:feb9:c429 > nonprod-web301.netic.dk: ICMP6, neighbor solicitation, who has nonprod-web301.netic.dk, length 32
08:51:06.152492 IP6 nonprod-web301.netic.dk > fe80::250:56ff:feb9:c429: ICMP6, neighbor advertisement, tgt is nonprod-web301.netic.dk, length 24
08:51:06.153147 IP6 nonprod-web301.netic.dk > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2a03:dc80:0:f44f:ff::2, length 32
08:51:06.155722 IP6 2a03:dc80:0:f44f:ff::2 > nonprod-web301.netic.dk: ICMP6, echo request, seq 5, length 64
08:51:07.153144 IP6 nonprod-web301.netic.dk > nonprod-web301.netic.dk: ICMP6, destination unreachable, unreachable address 2a03:dc80:0:f44f:ff::2, length 112
08:51:07.153154 IP6 nonprod-web301.netic.dk > nonprod-web301.netic.dk: ICMP6, destination unreachable, unreachable address 2a03:dc80:0:f44f:ff::2, length 112
08:51:07.153161 IP6 nonprod-web301.netic.dk > nonprod-web301.netic.dk: ICMP6, destination unreachable, unreachable address 2a03:dc80:0:f44f:ff::2, length 112

The workaround to route to hosts on the /80 container network is to manually add each container to the NDP proxy table (a command sketch follows the output below):

ip -6 neigh show
...
...
2a03:dc80:0:f44f::7 dev ens192 lladdr 00:50:56:b9:35:90 REACHABLE
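
For reference, this is roughly what the manual workaround looks like (a sketch only; the container address is taken from the traces above, and ens192 is the upstream interface from the neighbour table - adjust for your environment):

# enable NDP proxying on the upstream interface
sysctl net.ipv6.conf.ens192.proxy_ndp=1
# answer neighbour solicitations for this one container address
ip -6 neigh add proxy 2a03:dc80:0:f44f:ff::2 dev ens192
# verify the proxy entry was added
ip -6 neigh show proxy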

The official Docker docs are inconsistent here: the old Docker v17.09 documentation (https://docs.docker.com/v17.09/engine/userguide/networking/default_network/ipv6/) describes how to work around the NDP proxy problem by using ndppd to automatically configure NDP proxying for containers. However, this solution is very unstable and not suited for production - we have experienced many hiccups where IPv6 connectivity is lost. The current Docker documentation (https://docs.docker.com/config/daemon/ipv6/) only describes how to enable IPv6 in Docker - nothing else. After a lot of searching on Google, I got the impression that docker-proxy should handle the NDP proxy configuration, but it is not working.
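
For context, the ndppd approach from the old docs boils down to a config along these lines (a sketch only; the /80 prefix matches the traces above, and depending on the setup the rule may need auto or iface instead of static):

# /etc/ndppd.conf - answer NDP for the container /80 on the upstream interface
proxy ens192 {
    rule 2a03:dc80:0:f44f:ff::/80 {
        static
    }
}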

Steps to reproduce the behavior

  1. Assign an IPv6 /64 prefix to two hosts running the latest docker-ce.
  2. Carve two /80 subnets out of the /64 prefix and assign one to each Docker host for its containers (e.g. via fixed-cidr-v6 in /etc/docker/daemon.json; see the sketch after this list).
  3. Provision a container on each host.
  4. Attempt to ping6 a container from the Internet, or ping6 the container on the other host.
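
A minimal daemon config sketch for step 2, assuming the /64 from the traces above (the exact /80 per host is illustrative; the second host would get a different /80):

/etc/docker/daemon.json (host 1):

{
  "ipv6": true,
  "fixed-cidr-v6": "2a03:dc80:0:f44f:1::/80"
}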

Output of docker version:

sudo docker version
Client: Docker Engine - Community
 Version:           19.03.5
 API version:       1.40
 Go version:        go1.12.12
 Git commit:        633a0ea838
 Built:             Wed Nov 13 07:50:12 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.5
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.12
  Git commit:       633a0ea838
  Built:            Wed Nov 13 07:48:43 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 12
  Running: 6
  Paused: 0
  Stopped: 6
 Images: 5
 Server Version: 19.03.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: journald
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.4.0-166-generic
 Operating System: Ubuntu 16.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.66GiB
 Name: nonprod-web301.x.netic.dk
 ID: XRBV:TEOI:PA2K:SWUL:XPSH:BZLR:AJDO:GWCY:X2T4:ZNP7:CBTH:QK62
 Docker Root Dir: /pack/docker/images
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.)

polarathene commented 1 year ago

However, this solution is very unstable and not suited for production - we have experienced many "hick-ups" where IPv6 connectivity is lost.

I have seen this reported a fair amount. In my own experience this happens when the external interface loses its proxy_ndp kernel setting, which would occur whenever a Docker container was started or stopped and the network was adjusted (adding/removing a veth interface).
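
A quick way to check whether the setting has been lost (enp1s0 is just my external interface; substitute your own):

sysctl net.ipv6.conf.enp1s0.proxy_ndp
# prints 1 while proxying is enabled, 0 after it has been reset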

On my VPS, this triggered a cloud-init "hotplug" hook (a udev rule) that re-renders the netplan config into a systemd-networkd interface config file and applies it again (even though no changes were made), resetting the interface settings.

In this case networkd has a setting that can keep proxy_ndp enabled, but netplan (which generates the networkd config) does not expose it. I needed to add my own hook script (for networkd this is done with the networkd-dispatcher package; other network managers have similar hook-script features) that simply checks whether the external interface (enp1s0 for me) triggered the hook and re-enables proxy_ndp. This keeps it enabled, although there is probably a small window while it is off that could cause a brief connectivity issue for requests:

/etc/networkd-dispatcher/configured.d/ipv6-ndp.sh:

#!/bin/bash
# Re-enable proxy_ndp on the external interface whenever
# networkd-dispatcher reports it as (re)configured.

TARGET_IFACE='enp1s0'

# IFACE is set by networkd-dispatcher for the interface that triggered the hook
if [[ ${IFACE} == "${TARGET_IFACE}" ]]
then
  sysctl "net.ipv6.conf.${TARGET_IFACE}.proxy_ndp=1"
fi

I think it is gotchas like the above, where there is a lot of behind-the-scenes configuration (through layers of network config) being triggered by something as simple as starting or stopping a container. It's not likely something Docker can easily fix on its end AFAIK, as there are many different ways a system may configure and manage its network.

A better fix, in my case at least, would be for Netplan to support configuring networkd with proxy_ndp=1, like it supports accept-ra: true for accept_ra=2 (an additional gotcha: networkd enables this but disables the kernel setting in favour of its own internal implementation, which can be confusing/unexpected at first glance).
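
For reference, the networkd setting mentioned above looks like this when set directly (a sketch; the drop-in path assumes a .network file named 10-enp1s0.network matching the external interface):

# /etc/systemd/network/10-enp1s0.network.d/proxy-ndp.conf
[Network]
IPv6ProxyNDP=true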


Alternatively use ULA instead of GUA

If you don't need containers to be assigned publicly routable IPv6 GUA addresses, and are instead OK with a single IPv6 GUA on your external interface that you NAT to containers as you do with IPv4, then it is much simpler to create a docker network with a ULA subnet and assign containers IPv6 ULA addresses. These are private (just like the IPv4 addresses in docker networks typically are, hence the NAT) and avoid the NDP issue entirely.
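
A sketch of the ULA approach (the fd00:cafe:1::/64 prefix is just an example; generate your own random ULA prefix per RFC 4193):

# create an IPv6-enabled bridge network with a private ULA subnet
docker network create --ipv6 --subnet fd00:cafe:1::/64 ipv6-ula

# attach a container and inspect its addresses
docker run --rm --network ipv6-ula alpine ip -6 addr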

If you have IPv4 in use, publishing ports will bypass any firewall config rules anyway, and containers will conflict when trying to bind the same port on the external interface's public IPv4 address. The main benefit of a public IPv6 address per container is then not as useful unless you're on an IPv6-only host, but that's less likely since many still need to support IPv4 client connections.

I think IPv6 ULA networks weren't supported that well when this issue was created, but we've had ip6tables support in /etc/docker/daemon.json for a while now, and it has received fixes through the 20.10.x series since it was introduced. As the info from the original report above shows, the docker host wasn't new enough at the time to support this, but it should be capable of it today, and this is what I would recommend most people adopt for IPv6 support.
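
A daemon.json sketch for that recommendation (the ULA subnet is an example, and on the 20.10.x series the ip6tables option was still gated behind the experimental flag):

{
  "ipv6": true,
  "fixed-cidr-v6": "fd00:cafe:1::/64",
  "ip6tables": true,
  "experimental": true
}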