containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

A container publishing/listening to IPv6 interface leads to all connections going into time-out #7415

Closed: ericzolf closed this issue 3 years ago

ericzolf commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

A container publishing/listening to IPv6 interface leads to all connections going into time-out.

Steps to reproduce the issue:

  1. create as root a container publishing to IPv6 e.g. podman container run --name proxy-cnt --pod ncpod --volume certs:/etc/nginx/certs:ro,z --volume proxy-vhost:/etc/nginx/vhost.d:z --volume proxy-html:/usr/share/nginx/html:z --volume /var/lib/containers/sources/proxy.d:/etc/nginx/conf.d:ro,Z --volume /var/lib/containers/sources/proxy_dhparam.pem:/etc/nginx/dhparam/dhparam.pem:ro,Z --publish 80:80 --publish 443:443 --publish [::]:80:80 --publish [::]:443:443 --env com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy=true --network proxy-nw --detach nginx:stable-alpine

  2. try to reach https://[deed:beaf:etc] (i.e. through the IPv6 address of the host)

Describe the results you received:

The HTTPS connection hangs until it goes into a timeout:

curl: (28) Operation timed out after 300507 milliseconds with 0 out of 0 bytes received

And nothing appears in the logs of the container.

Describe the results you expected:

That the IPv6 connection works as well as the IPv4 connection.

Additional information you deem important (e.g. issue happens only occasionally):

  1. After one night with the pod running, the IPv4 access seemed to have stopped working as well. But I'm not sure and haven't had the chance to reproduce this behaviour.
  2. On IRC #podman, I've described only that I am publishing [::]:443:443, and it was sufficient for @Luap99 to reproduce the issue, so it doesn't seem to be due to my particular setup.
  3. I have had a similar setup working with Docker for years. Only the external address is IPv6; the internal container networks are IPv4 (and the container started is a reverse proxy).
  4. ss -tulpen shows that the container is listening on the right IPs & ports.
  5. firewalld is on but the port is open, and stopping the firewalld service didn't make any difference.

Output of podman version:

Version:      2.0.0-rc7
API Version:  1
Go Version:   go1.13.4
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.15.0
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.18-1.module_el8.3.0+432+2e9cbcd8.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.18, commit: 4d7da6016270217928f56161842ad4367c88dbb6'
  cpus: 4
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: file
  hostname: pe140centos.home
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-227.el8.x86_64
  linkmode: dynamic
  memFree: 14331277312
  memTotal: 16439975936
  ociRuntime:
    name: runc
    package: runc-1.0.0-66.rc10.module_el8.3.0+401+88d810c7.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  rootless: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 25186791424
  swapTotal: 25186791424
  uptime: 16h 5m 16.77s (Approximately 0.67 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 6
    paused: 0
    running: 5
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 8
  runRoot: /var/run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 1
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.13.4
  OsArch: linux/amd64
  Version: 2.0.0-rc7

Package info (e.g. output of rpm -q podman or apt list podman):

podman-2.0.0-0.9.rc7.module_el8.3.0+432+2e9cbcd8.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes, Fedora 32, podman 2.0.4, httpd instead of nginx, 80 instead of 443 -> same result

podman run --publish 80:80 --publish '[::]:80:80' httpd
$ curl http://127.0.0.1
<html><body><h1>It works!</h1></body></html>
$ curl http://[::1]
(I didn't wait for the time-out, but a few minutes at least)

Additional environment details (AWS, VirtualBox, physical, etc.):

Physical box, CentOS 8 Stream

Luap99 commented 4 years ago

Just want to add that I can also reproduce this with a UDP connection.

  1. Window 1: sudo podman run --rm -p "[::1]:80:80/udp" alpine nc -l -u -p 80 -s ::
  2. Window 2: nc -u ::1 80 and type something; you won't see the output in window 1. Confirm that we are listening inside the container with sudo podman exec -it -l netstat -a, and confirm that we do get output in window 1 if we connect from inside the container: sudo podman exec -it -l nc -u ::1 80

Tested with current master 4828455055010a1376f1e83832bfa34787f3a1e7

mheon commented 4 years ago

This only reproduces as root, correct? Not rootless?

If so, can you verify you have no IPv6 firewall rules (ip6tables -nvL should be empty) when no container is running?

Luap99 commented 4 years ago

sudo ip6tables -nvL is empty when no container is running, and still empty after starting one.

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

rootless doesn't work at all:

$ podman run --rm -p "[::1]:80:80/udp" alpine nc -l -u -p 80 -s ::
Error: failed to expose ports via rootlessport: "address ::1:80: too many colons in address\n"
ericzolf commented 4 years ago

Same thing here with rootless:

$ podman run --publish 8080:80 --publish '[::]:8080:80' httpd
[...]
Error: failed to expose ports via rootlessport: "listen tcp: address :::8080: too many colons in address\n"

Also, no difference in ip6tables with and without a container running (isn't nftables the right thing under Fedora anyway?)

rhatdan commented 4 years ago

@mccv1r0 PTAL

mccv1r0 commented 4 years ago

nftables is used, but there is (or should be) compatibility with the iptables user-mode application.

Can you provide the output of the nat table:

sudo iptables -nvL -t nat | grep 80 and sudo ip6tables -nvL -t nat | grep 80 ?

Please adjust 80 to the port you actually --publish

Also, sudo ss -nltp | grep 80 (as before, adjust 80 for what you publish)

A copy of the conflist used will help too.

ericzolf commented 4 years ago

sudo iptables -nvL -t nat | grep 80 and sudo ip6tables -nvL -t nat | grep 80 ?

Result is empty for both.

Please adjust 80 to the port you actually --publish

As I use 8080, I didn't bother to adapt (grep 80 matches it anyway) :smirk:

Also, sudo ss -nltp | grep 80 (as before, adjust 80 for what you publish)

Same result, nothing.

A copy of the conflist used will help too.

Only the default conflist but here you are. 87-podman-bridge.conflist.txt

I tried all the tests again with podman 2.0.6 on F32, root and rootless; no change to the (negative) results.

ericzolf commented 4 years ago

Sorry, that result was taken while the pod wasn't running (because I tried rootless first, and it just doesn't start). If I run the same commands while the root pod is running (on port 80), the result is different:

sudo podman run --publish 80:80 --publish '[::]:80:80' httpd
# in another terminal:
$ sudo iptables -nvL -t nat | grep 80
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       10.88.0.0/16         0.0.0.0/0            tcp dpt:80
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       127.0.0.1            0.0.0.0/0            tcp dpt:80
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:10.88.0.20:80
    0     0 CNI-DN-34fb603b7e9d1e07f78ee  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* dnat name: "podman" id: "28101fa6bec4afe54bc0e2444f3421bd546298176c2720d6387f8f588a11f618" */ multiport dports 80,80
$ sudo ip6tables -nvL -t nat | grep 80
$ sudo ss -nltp | grep 80
LISTEN 0      4096         0.0.0.0:80        0.0.0.0:*    users:(("conmon",pid=6721,fd=5)) 
LISTEN 0      4096            [::]:80           [::]:*    users:(("conmon",pid=6721,fd=6)) 
mccv1r0 commented 4 years ago

I tested on fedora 31 and this works.

The 87-podman-bridge.conflist.txt you linked to does NOT have an IPv6 address configured. Can you add one and try to curl or nc to this address and the published port?

Using loopback in IPv6 is a known issue: https://github.com/containernetworking/plugins/tree/master/plugins/meta/portmap#known-issues

ericzolf commented 4 years ago

It's the standard default network, and the last time I tried to modify it, it went horribly wrong but I can try this.

Do you have the format documented somewhere? The man pages for podman-network (create or inspect) seem to know only version 0.3.0 of the format, whereas 0.4.0 is the current one; there is no explanation of what routes exactly are (inwards or outwards is my first question), and they definitely don't have any example of how to combine multiple ranges/subnets/routes, especially not mixing IPv4 and IPv6.

ericzolf commented 4 years ago

@mccv1r0 can you provide your configuration under F31?

mccv1r0 commented 4 years ago

My conflist is below. Here's what I just did on f31. My global IPv6 address was redacted (I hope; still on my 1st cup of coffee):

[mcc@node ~]$ ip addr show cni89
12: cni89: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether a2:c6:0b:a2:59:76 brd ff:ff:ff:ff:ff:ff
    inet 10.89.0.1/16 brd 10.89.255.255 scope global cni89
       valid_lft forever preferred_lft forever
    inet6 2600:xxxx:xxxx:xxxx::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::a0c6:bff:fea2:5976/64 scope link 
       valid_lft forever preferred_lft forever
[mcc@node ~]$ 

[mcc@node ~]$ sudo podman ps 
CONTAINER ID  IMAGE                           COMMAND  CREATED       STATUS         PORTS                   NAMES
b10930c303c6  localhost/socksink-sudo:latest           6 days ago    Up 6 days ago  0.0.0.0:8080->5222/tcp  socksink-v6-port
[mcc@node ~]$ 

[mcc@node ~]$ nc -v localhost 8080
Ncat: Version 7.80 ( https://nmap.org/ncat )
v4 works
^C
[mcc@node ~]$ nc -v 2600:3c03:e000:391::1 8080
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to 2600:xxxx:xxxx:xxxx::1:8080.
v6 works to bridge global addr
^C
[mcc@node ~]$ nc -v fe80::a0c6:bff:fea2:5976%cni89 8080
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to fe80::a0c6:bff:fea2:5976:8080.
v6 works to bridge LL addr
^C
[mcc@node ~]$ 

With the LL address, you don't even need to update the default conflist. Here is what I have added to the default:

         "ipam":{  
            "type":"host-local",
            "ranges":[  
               [  
                  {  
                     "subnet":"10.89.0.0/16",
                     "gateway":"10.89.0.1"
                  }
               ],
               [  
                  {  
                     "subnet":"2600:xxxx:xxxx:xxxx::/64"
                  }
               ]
            ],

That can be a global IPv6 address provided/delegated to you, or you can use a ULA.
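
For reference, a complete dual-stack conflist along those lines could look like the sketch below, e.g. as a replacement for the default 87-podman-bridge.conflist (just a sketch: the 10.89.0.0/16 range and the fd00:1:2:3::/64 ULA are placeholders, substitute your own prefixes):

{
  "cniVersion": "0.4.0",
  "name": "podman",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "routes": [{ "dst": "0.0.0.0/0" }, { "dst": "::/0" }],
        "ranges": [
          [
            {
              "subnet": "10.89.0.0/16",
              "gateway": "10.89.0.1"
            }
          ],
          [
            {
              "subnet": "fd00:1:2:3::/64",
              "gateway": "fd00:1:2:3::1"
            }
          ]
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    },
    {
      "type": "firewall"
    },
    {
      "type": "tuning"
    }
  ]
}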

ericzolf commented 4 years ago

Sorry for the delay. Anyway, first without changes to the default configuration:

$ sudo podman run --publish 80:80 --publish '[::]:80:80' httpd
# change console...
$ sudo firewall-cmd --add-service=http
$ sudo firewall-cmd --list-all  # looks good, service http is there...

$ ip a show cni-podman0
5: cni-podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether a6:e0:4f:f7:f3:79 brd ff:ff:ff:ff:ff:ff
    inet 10.88.0.1/16 brd 10.88.255.255 scope global cni-podman0
       valid_lft forever preferred_lft forever
    inet6 fe80::a4e0:4fff:fef7:f379/64 scope link 
       valid_lft forever preferred_lft forever

$ nc -v 10.88.0.1 80
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to 10.88.0.1:80.
GET /
HTTP/1.1 400 Bad Request
[... server swearing at me, that's OK ...]

$ nc -v fe80::a4e0:4fff:fef7:f379%cni-podman0 80
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to fe80::a4e0:4fff:fef7:f379:80.
GET /
[... no answer ...]

Could it simply be that the Apache server in the container isn't listening on IPv6 and hence nothing happens? That said, a very similar approach worked under Docker; could it be that they silently do a kind of redirection from IPv6 (outside) to IPv4 (inside) that podman isn't doing?

I haven't yet done the test with changing the CNI configuration because I'm struggling to understand which IPv6 range I could use, and I'd be more than happy to have a solution with IPv6 outside and IPv4 inside, like it used to work with Docker; I'm only using IPv6 because my provider forces me to...

ericzolf commented 3 years ago

I re-tried exactly the previous commands under Fedora 33 with podman-2.1.1-12.fc33.x86_64, and not even IPv4 is working now. I've tried it even from within the httpd pod itself, and there the connection also hangs. But even when not publishing to IPv6, the connection over IPv4 hangs... What has changed here? Note that the veth interface has no IPv4 address attached, if that's in any way relevant:

6: cni-podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 3a:96:c1:10:f9:90 brd ff:ff:ff:ff:ff:ff
    inet 10.88.0.1/16 brd 10.88.255.255 scope global cni-podman0
       valid_lft forever preferred_lft forever
    inet6 fe80::3896:c1ff:fe10:f990/64 scope link 
       valid_lft forever preferred_lft forever
8: vethdaaa7a31@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni-podman0 state UP group default 
    link/ether 4e:d9:54:d3:35:d4 brd ff:ff:ff:ff:ff:ff link-netns cni-24b88fb9-27bf-eea6-6ae9-721bc9435ce3
    inet6 fe80::b03c:40ff:fec5:b70c/64 scope link 
       valid_lft forever preferred_lft forever

I still have the default cni configuration, and I am utterly confused...

rhatdan commented 3 years ago

@mccv1r0 Thoughts?

mccv1r0 commented 3 years ago

On fedora 30 I get a syntax error:

[mcambria@mcambria ipv6-rfcs]$ sudo podman run --publish 80:80 --publish '[::]:80:80' httpd 
Error: cannot resolve the TCP address: address :::80: too many colons in address
[mcambria@mcambria ipv6-rfcs]$

Trying on fedora 32:

A bit better, but I can't exec anything useful from that container, e.g. podman exec ... ip addr show doesn't work. Can you get all the plumbing working with a debug image that has useful tools like iproute2 (or can dnf install iproute)?

As you thought, you need to ensure that httpd is actually binding to IPv6 addresses. When I try that image on fedora 32, I get connection refused (using nc -6 to be sure).

I will also need sudo ip6tables -nvL and sudo ip6tables -nvL -t nat to see what's going on.

mccv1r0 commented 3 years ago

fedora 32, I just started a pod:

sudo podman run -d --network="ipv6test" --name=v6-port-5555 -p 5555:5222 socksink-sudo

I'm mapping 5555 on the host to 5222 in the pod. Note the network isn't the default; this conflist has dual stack configured.

$ sudo podman ps 
[sudo] password for mcc: 
CONTAINER ID  IMAGE                           COMMAND  CREATED        STATUS             PORTS                   NAMES
79f7c89d90aa  localhost/socksink-sudo:latest           6 minutes ago  Up 6 minutes ago   0.0.0.0:5555->5222/tcp  v6-port-5555

iptables is being used on this node currently; I can't change to firewalld.

I see the rules I expect:

[mcc@snark]$ sudo iptables -nvL -t nat | grep 5555
    1    60 CNI-HOSTPORT-SETMARK  tcp  --  *      *       10.66.0.0/16         0.0.0.0/0            tcp dpt:5555
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       127.0.0.1            0.0.0.0/0            tcp dpt:5555
    1    60 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:5555 to:10.66.0.2:5222
    1    60 CNI-DN-e08817a426397c7d48abc  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* dnat name: "ipv6test" id: "79f7c89d90aabb2f27a9e566569c0c65d799984b1bee3d2612bfe02c117cb6fb" */ multiport dports 5555
[mcc@snark socksink]$ sudo ip6tables -nvL -t nat | grep 5555
    1    80 CNI-HOSTPORT-SETMARK  tcp      *      *       fd00::1:8:0/112      ::/0                 tcp dpt:5555
    2   160 DNAT       tcp      *      *       ::/0                 ::/0                 tcp dpt:5555 to:[fd00::1:8:5]:5222
    2   160 CNI-DN-e08817a426397c7d48abc  tcp      *      *       ::/0                 ::/0                 /* dnat name: "ipv6test" id: "79f7c89d90aabb2f27a9e566569c0c65d799984b1bee3d2612bfe02c117cb6fb" */ multiport dports 5555
[mcc@snark]$ 

Here is what Linux has for device:

[mcc@snark ~]$ ip addr show v6test0
18: v6test0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d2:f6:a4:79:eb:ba brd ff:ff:ff:ff:ff:ff
    inet 10.66.0.1/16 brd 10.66.255.255 scope global v6test0
       valid_lft forever preferred_lft forever
    inet6 fd00::1:8:1/112 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::d0f6:a4ff:fe79:ebba/64 scope link 
       valid_lft forever preferred_lft forever
[mcc@snark ~]$ 

Things look like they behave:

[mcc@snark]$ nc -v -6 fd00::1:8:1  5555
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to fd00::1:8:1:5555.
v6 port mapping works
^C
[mcc@snark]$ nc -v -6 fe80::d0f6:a4ff:fe79:ebba%v6test0 5555
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to fe80::d0f6:a4ff:fe79:ebba:5555.
v6 portmapping on LL works
^C
[mcc@snark]$ nc -v -4 10.66.0.1 5555
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to 10.66.0.1:5555.
v4 port mapping works
^C
[mcc@snark]$ 

Not shown are the tests to ensure that the host can reach the pod via the container IP:port. Obviously that works too.

When I can track down a fedora 33 node I'll try; same for firewalld. But this should give you an idea of what I have that works. Note that the .conflist explicitly configures an IPv6 prefix. Loopback isn't supported for CNI: https://github.com/containernetworking/plugins/blob/master/plugins/meta/portmap/README.md#known-issues. In your last comment, @ericzolf, I don't see an IPv6 address assigned to cni-podman0.

ericzolf commented 3 years ago

I didn't attach an IPv6 address: one, because I didn't need to care with Docker; two, because I don't have one to spend (OK, the private fdxx range should work); and three, because someone wrote that link-local should be sufficient. Anyway, I'm currently trying to re-create a clean environment because I've lost track of what I've done (and something might be borked in the meantime), and then I'll share my results again. Thanks for your patience; networking isn't really my strength, and IPv6 doesn't make it easier...

ericzolf commented 3 years ago

OK, I've created a Vagrant environment to reproduce the issue: https://gitlab.com/EricPublic/miscericlaneous/-/tree/master/podman_playground (it's free to access, you just need to be logged in).

There are two VMs, one with Fedora 32, one with Fedora 33, but I didn't notice any difference.

$ ss -tulpen | grep 80
tcp   LISTEN 0      4096                            0.0.0.0:80        0.0.0.0:*    users:(("conmon",pid=1870,fd=5)) ino:33436 sk:a cgroup:/user.slice/user-1000.slice/session-5.scope <->                     
tcp   LISTEN 0      4096                               [::]:80           [::]:*    users:(("conmon",pid=1870,fd=6)) ino:33438 sk:d cgroup:/user.slice/user-1000.slice/session-5.scope v6only:1 <->            
$ sudo podman exec -it priceless_pasteur bash
apache2# apt update
apache2# apt install iproute2
apache2# ss -tulpen
Netid   State    Recv-Q   Send-Q     Local Address:Port     Peer Address:Port   
tcp     LISTEN   0        0                      *:80                  *:*       users:(("httpd",pid=1,fd=4)) ino:33561 sk:ae0c196f
apache2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether ce:27:9f:ef:bf:5c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.4/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd01:2345:6789:88::2/96 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::cc27:9fff:feef:bf5c/64 scope link 
       valid_lft forever preferred_lft forever

Is your test container image available somewhere, so that I can start from there? I'm not sure what Debian means by *:80 compared to 0.0.0.0:80 or [::]:80, but it sounds like each image would need to be adapted to work under podman with IPv6. That would be a major pain...

ericzolf commented 3 years ago

OK, the Debian in the container is of the opinion that Apache is listening on IPv6:

root@eaf11aa99b9d:/usr/local/apache2# ss -tulpen -f inet6
Netid   State    Recv-Q   Send-Q     Local Address:Port     Peer Address:Port   
tcp     LISTEN   0        0                      *:80                  *:*       users:(("httpd",pid=1,fd=4)) ino:47452 sk:56245e3c
root@eaf11aa99b9d:/usr/local/apache2# ss -tulpen -f inet 
Netid   State   Recv-Q   Send-Q     Local Address:Port      Peer Address:Port   

At least, iptables shows something:

# sudo iptables -nvL -t nat | grep
Usage: grep [OPTION]... PATTERNS [FILE]...
Try 'grep --help' for more information.
[root@oldfedo ~]# sudo iptables -nvL -t nat | grep 80
    8   485 CNI-0d13419054d80ee2cadd69ae  all  --  *      *       10.88.0.8            0.0.0.0/0            /* name: "podman" id: "eaf11aa99b9ddb10bbda114dab2ef54a953a8d8dc4a66c45e4b65d55fc157a64" */
Chain CNI-0d13419054d80ee2cadd69ae (1 references)
Chain CNI-DN-0d13419054d80ee2cadd6 (1 references)
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       10.88.0.0/16         0.0.0.0/0            tcp dpt:80
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       127.0.0.1            0.0.0.0/0            tcp dpt:80
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:10.88.0.8:80
    0     0 CNI-DN-0d13419054d80ee2cadd6  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* dnat name: "podman" id: "eaf11aa99b9ddb10bbda114dab2ef54a953a8d8dc4a66c45e4b65d55fc157a64" */ multiport dports 80,80
# sudo ip6tables -nvL -t nat | grep 80
    0     0 CNI-0d13419054d80ee2cadd69ae  all      *      *       fd01:2345:6789:88::5  ::/0                 /* name: "podman" id: "eaf11aa99b9ddb10bbda114dab2ef54a953a8d8dc4a66c45e4b65d55fc157a64" */
Chain CNI-0d13419054d80ee2cadd69ae (1 references)
Chain CNI-DN-0d13419054d80ee2cadd6 (1 references)
    0     0 CNI-HOSTPORT-SETMARK  tcp      *      *       fd01:2345:6789:88::/96  ::/0                 tcp dpt:80
    0     0 DNAT       tcp      *      *       ::/0                 ::/0                 tcp dpt:80 to:[fd01:2345:6789:88::5]:80
    0     0 CNI-DN-0d13419054d80ee2cadd6  tcp      *      *       ::/0                 ::/0                 /* dnat name: "podman" id: "eaf11aa99b9ddb10bbda114dab2ef54a953a8d8dc4a66c45e4b65d55fc157a64" */ multiport dports 80,80

But curl continues to work for IPv4 but not IPv6... I have no clue what I'm still missing.

ericzolf commented 3 years ago

OK, I used nginx instead of httpd and the situation is clear: podman run --publish 80:80 --publish '[::]:80:80' nginx explicitly reports support for IPv6:

/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf

And within the container I can see (it looks like Debian uses * instead of [::] for IPv6):

# ss -tulpen
Netid State  Recv-Q Send-Q Local Address:Port   Peer Address:Port                                                     
tcp   LISTEN 0      0            0.0.0.0:80          0.0.0.0:*     users:(("nginx",pid=1,fd=7)) ino:23725 sk:4cfc61ff 
tcp   LISTEN 0      0                  *:80                *:*     users:(("nginx",pid=1,fd=8)) ino:23726 sk:30682767 
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 9e:b9:d5:58:86:7a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.3/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd01:2345:6789:88::2/96 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::9cb9:d5ff:fe58:867a/64 scope link 
       valid_lft forever preferred_lft forever

Outside of it, in the VM:

# iptables -nvL -t nat | grep 80
    6   365 CNI-1ad6dd5c50ba46231ffd5830  all  --  *      *       10.88.0.3            0.0.0.0/0            /* name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            10.88.0.0/16         /* name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */
    6   365 MASQUERADE  all  --  *      *       0.0.0.0/0           !224.0.0.0/4          /* name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */
    2   120 CNI-HOSTPORT-SETMARK  tcp  --  *      *       10.88.0.0/16         0.0.0.0/0            tcp dpt:80
    3   180 CNI-HOSTPORT-SETMARK  tcp  --  *      *       127.0.0.1            0.0.0.0/0            tcp dpt:80
    8   480 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:10.88.0.3:80
    8   480 CNI-DN-1ad6dd5c50ba46231ffd5  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* dnat name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */ multiport dports 80,80
# ip6tables -nvL -t nat | grep 80
    0     0 CNI-1ad6dd5c50ba46231ffd5830  all      *      *       fd01:2345:6789:88::2  ::/0                 /* name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */
    0     0 ACCEPT     all      *      *       ::/0                 fd01:2345:6789:88::/96  /* name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */
    0     0 MASQUERADE  all      *      *       ::/0                !ff00::/8             /* name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */
    0     0 CNI-HOSTPORT-SETMARK  tcp      *      *       fd01:2345:6789:88::/96  ::/0                 tcp dpt:80
    1    80 DNAT       tcp      *      *       ::/0                 ::/0                 tcp dpt:80 to:[fd01:2345:6789:88::2]:80
    1    80 CNI-DN-1ad6dd5c50ba46231ffd5  tcp      *      *       ::/0                 ::/0                 /* dnat name: "podman" id: "fa496787c804df63cfc1b31bb872a8aed9562266f245e941f252d50ff16896fc" */ multiport dports 80,80

Further from the VM:

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:ec:53:17 brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname ens5
    inet 192.168.121.101/24 brd 192.168.121.255 scope global dynamic noprefixroute eth0
       valid_lft 2864sec preferred_lft 2864sec
    inet6 fd01:2345:6789:121::472d/128 scope global dynamic noprefixroute 
       valid_lft 85667sec preferred_lft 85667sec
    inet6 fe80::85f5:5dc4:a5b5:87c7/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: cni-podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 62:67:ee:d3:f5:54 brd ff:ff:ff:ff:ff:ff
    inet 10.88.0.1/16 brd 10.88.255.255 scope global cni-podman0
       valid_lft forever preferred_lft forever
    inet6 fd01:2345:6789:88::1/96 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::6067:eeff:fed3:f554/64 scope link 
       valid_lft forever preferred_lft forever
4: veth4ca0ee2c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni-podman0 state UP group default 
    link/ether 2e:42:67:61:bd:81 brd ff:ff:ff:ff:ff:ff link-netns cni-579ae6b6-9e98-683b-7874-8bf584d76d36
    inet6 fe80::f8:c7ff:fe6f:204a/64 scope link 
       valid_lft forever preferred_lft forever

$ curl localhost  # works
$ curl 127.0.0.1  # works
$ curl [::1]  # hangs (expected behaviour, I know)
$ curl 10.88.0.3  # works
$ curl [fd01:2345:6789:88::2]  # works!
$ curl 10.88.0.1 # works
$ curl [fd01:2345:6789:88::1]  # works!
$ curl 192.168.121.101  # works!
$ curl [fd01:2345:6789:121::472d]  # hangs!

So what am I overlooking?

ericzolf commented 3 years ago

As a side remark: it seems that httpd was only listening on IPv6, and nevertheless IPv4 worked and IPv6 didn't on the public interface. It's strange, but it also seems to hint that the inner life of the container should be irrelevant (which would be aligned with how Docker behaves). As another side remark: curl on a non-existing IPv6 address quickly gives a No route to host error message.

ericzolf commented 3 years ago

I tried with a different image, fedora/nginx, just to be sure I wasn't hitting some strange Debian incompatibility, but the result is exactly the same: everything works except the public IPv6.

mccv1r0 commented 3 years ago

$ curl localhost  # works
$ curl 127.0.0.1  # works
$ curl [::1]  # hangs (expected behaviour, I know)
$ curl 10.88.0.3  # works
$ curl [fd01:2345:6789:88::2]  # works!
$ curl 10.88.0.1 # works
$ curl [fd01:2345:6789:88::1]  # works!
$ curl 192.168.121.101  # works!
$ curl [fd01:2345:6789:121::472d]  # hangs!

Same as the prior expected behavior. CNI only set up IPv6 port forwarding on fd01:2345:6789:88::1, the cni-podman0 interface. The address that hangs, fd01:2345:6789:121::472d, is on the eth0 interface. Everything is working as expected.

For IPv4 there is a rule:

    3   180 CNI-HOSTPORT-SETMARK  tcp  --  *      *       127.0.0.1            0.0.0.0/0            tcp dpt:80

for loopback. IIRC the networking stack treats any packet sent to the local node (regardless of address) as LOCAL.

This isn't possible for IPv6 as shown earlier.

ericzolf commented 3 years ago

So that means it's not possible under Podman to publish a containerized application over IPv6, or am I misunderstanding something? And it even sounds like it's definitely not possible, which seems very wrong, as it's possible with Docker out of the box.

mheon commented 3 years ago

This should work - I remember adding tests for it when I initially added support for the [::] syntax for forcing us to bind to IPv6 ports

mccv1r0 commented 3 years ago

This is using the host's cni-podman0 interface, published port -p 80

$ curl [fd01:2345:6789:88::1] # works!

ericzolf commented 3 years ago

But this is rather useless, the CNI interface isn't reachable from outside the host?!

mheon commented 3 years ago

That syntax should be instructing us to bind to all IPv6 addresses on the host, not just cni-podman0.

mheon commented 3 years ago

Sorry, not bind to - create port forwarding rules for.

(Technically, we ARE binding to the addresses too, to ensure that other apps trying to bind to them get an error; the ip/ip6tables rules take precedence so the traffic is forwarded, but we don't want to have port 80 forwarded to a container but let you accidentally start a server on port 80 on the host and wonder why you can't reach it).

mccv1r0 commented 3 years ago

But this is rather useless, the CNI interface isn't reachable from outside the host?!

Routing on your node/network is set up such that the CNI interface/subnets (v4 or v6) are not reachable from outside the host.

mccv1r0 commented 3 years ago

That syntax should be instructing us to bind to all IPv6 addresses on the host, not just cni-podman0.

@mheon CNI has this limitation; see the text just above the known-issues link for the reasoning. Open an issue if needed, maybe this will be revisited for CNI... but notice the lack of a route_localnet sysctl for net.ipv6. The lack of kernel support for route_localnet is the cause of the limitation referred to.
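
You can see the asymmetry directly on the host with a quick check (illustrative only; sysctl key paths as on current kernels):

# exists for IPv4 (0 by default); it has to be enabled for DNAT of traffic addressed to 127.0.0.1 to be routable
sudo sysctl net.ipv4.conf.all.route_localnet
# fails: there is no such sysctl, the kernel has no IPv6 counterpart
sudo sysctl net.ipv6.conf.all.route_localnet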

IPv6 isn't IPv4 with larger addresses. Things like RFC 1918 and especially NAT were mistakes/workarounds that were never supposed to carry over to IPv6.

That said, Linux does support NAT for IPv6 (sadly); k8s with IPv6 certainly needs it. I suggest opening an issue against the portmap plugin, stressing that "docker does it", and seeing what happens.

mccv1r0 commented 3 years ago

Ouch:

Using --publish 5555:5222 on Docker shows they solved the problem using docker-proxy:

$ sudo ss -tnlape  | grep 5555
LISTEN    0         4096                     *:5555                   *:*        users:(("docker-proxy",pid=408439,fd=4)) ino:5059073 sk:9 v6only:0 <->         
$ ps ax | grep 5555
 408439 ?        Sl     0:00 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 5555 -container-ip 172.17.0.2 -container-port 5222
 408602 pts/0    S+     0:00 grep --color=auto 5555
$ 

I doubt that CNI will be getting into the port mapping proxy business for IPv6.
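
If someone really needs the Docker behaviour today, a crude userspace stand-in can be run on the host, e.g. with socat (a sketch only; 10.88.0.2:80 stands in for your container's IPv4 address and port, and you'd want a systemd unit or similar to keep it running):

# accept IPv6 connections on host port 80 and forward them to the container's IPv4 address
sudo socat TCP6-LISTEN:80,fork,reuseaddr TCP4:10.88.0.2:80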

ericzolf commented 3 years ago

Yep, my current docker-based setup has the same: one docker-proxy process for port 80 and one for port 443.

So it sounds like my only option is to install a reverse proxy server on the host, to expose the ports I need to be public...

ericzolf commented 3 years ago

Could a macvlan network also be an alternative? (BTW, the podman-network-create man page still states that only the 'bridge' driver is supported.)

mheon commented 3 years ago

We implemented macvlan in a bit of a wacky way: it's not a driver, but a separate flag. AFAIK it should work, though, since you have a "real" public interface to play with?

mccv1r0 commented 3 years ago

So it sounds like my only option is to install a reverse proxy server on the host, to expose the ports I need to be public...

And every time a pod changes IP address, you change the reverse proxy config... every time more "backends" are added, you change the reverse proxy config...

Is routing IPv6 a non-starter? One example of something that is expected to happen: I have cloud-based VMs running podman that get an IPv6 address for eth0 and supply a /48 for all my podman networks. I only need a few, but with IPv6, addresses are not the problem. Being assigned e.g. a /48, /56 or /64 (depending on provider) is typical.

ericzolf commented 3 years ago

I'm open to alternatives; I just have no clue how to realize what you described. Do you have an example documented somewhere? I've always run containers un-routed.

ericzolf commented 3 years ago

I tried macvlan according to https://www.redhat.com/sysadmin/leasing-ips-podman (new code uploaded to my Git repo). It works for IPv4, but /usr/libexec/cni/dhcp doesn't seem able to pull an IPv6 address (at least not from my reading of https://github.com/containernetworking/plugins/tree/master/plugins/ipam/dhcp).

ericzolf commented 3 years ago

podman run --network host nginx would work, but the man page for podman run states:

the host mode gives the container full access to local system services such as D-bus and is therefore considered insecure;

which doesn't sound optimal for a public facing container.

The last option I see is to use macvlan with its own dhclient in the container, IPv6-enabled, or a statically assigned IPv6 address, or perhaps what you proposed, @mccv1r0, but I have no clue what you meant, and searching for "routed container network" didn't bring me further. Thanks, and sorry for being so clueless.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

ericzolf commented 3 years ago

OK, let me summarize the situation:

I'm a bit at a loss what to make of this issue...

zem commented 3 years ago

I ran into the same problem. Forwarding IPv6 as root via CNI was no problem 6-8 months ago.

You had to add a podman6 ipv6 network like so:

[root@cloudgit gitlab.pod]# cat /etc/cni/net.d/88-podman6-bridge.conflist 
{
  "cniVersion": "0.4.0",
  "name": "podman6",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "routes": [{ "dst": "::/0" }],
        "ranges": [
          [
            {
              "subnet": "fdc2:4ba9:85d4:f3c1::/64",
              "gateway": "fdc2:4ba9:85d4:f3c1::1"
            }
          ]
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    },
    {
      "type": "firewall"
    },
    {
      "type": "tuning"
    }
  ]
}

And publish port 80: I remember definitely that it did IPv6 (dual stack) by using --network podman,podman6 --publish 80. It took me a while to figure out the CNI config part: since the Linux network stack is not able to do NAT64 on its own, the container needs to get an IPv6 address, and CNI is not able to run dual stack on the same network ID (it can on the same bridge, though).
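
Spelled out, the invocation would have looked roughly like this (reconstructed from the flags above; the nginx image and the explicit 80:80 mapping are just examples):

sudo podman run -d --network podman,podman6 --publish 80:80 nginx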

However, it seems to have stopped working with recent podman/CNI versions, and none of my IPv6 containers publishes an IPv6 address now.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

mccv1r0 commented 3 years ago

CNI doesn't use NAT64, never did, and I doubt it ever will. I'm confused why NAT64 was brought up at all.

I took the exact config you used above and changed a few names to xxx so as not to conflict with what I already have running. Note this config is not dual stack, just IPv6.

$ sudo cat /etc/cni/net.d/xxx.conflist
{
  "cniVersion": "0.4.0",
  "name": "xxx",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "xxx",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "routes": [{ "dst": "::/0" }],
        "ranges": [
          [
            {
              "subnet": "fdc2:4ba9:85d4:f3c1::/64",
              "gateway": "fdc2:4ba9:85d4:f3c1::1"
            }
          ]
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    },
    {
      "type": "firewall"
    },
    {
      "type": "tuning"
    }
  ]
}

$

I don't have any problem. I used port 8080 since this node already has a web server on 80, and I map to 5222 in the container since that's what this image happens to use.

$ sudo podman run -d --network="xxx" --name=xxx8080 -p 8080:5222 socksink-sudo
b9b6a5e3b7dfbfa577c807ab228e436d02eb060c738549ec9c80478d02d224a4
$
$ sudo ip6tables -nvL -t nat | grep 8080
    1    80 CNI-HOSTPORT-SETMARK  tcp      *      *       fdc2:4ba9:85d4:f3c1::/64  ::/0                 tcp dpt:8080
    1    80 DNAT       tcp      *      *       ::/0                 ::/0                 tcp dpt:8080 to:[fdc2:4ba9:85d4:f3c1::3]:5222
    1    80 CNI-DN-f2ca50ec51741a0f7e634  tcp      *      *       ::/0                 ::/0                 /* dnat name: "xxx" id: "b9b6a5e3b7dfbfa577c807ab228e436d02eb060c738549ec9c80478d02d224a4" */ multiport dports 8080
$
$ nc -v fdc2:4ba9:85d4:f3c1::1 8080
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Connected to fdc2:4ba9:85d4:f3c1::1:8080.
works for me?
Ncat: 14 bytes sent, 0 bytes received in 6.81 seconds.
[mcc@snark ~]$ sudo podman logs xxx8080
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Listening on :::5222
Ncat: Listening on 0.0.0.0:5222
Ncat: Connection from fdc2:4ba9:85d4:f3c1::1.
Ncat: Connection from fdc2:4ba9:85d4:f3c1::1:35128.
works for me?
$

In the above example I'm using fedora 32 and iptables as the backend.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 3 years ago

I believe from reading @mccv1r0 that this is not an issue and am closing. Reopen if I am mistaken.

tdashton commented 2 years ago

@ericzolf did you solve this issue? I am having the same problem (I also cannot connect to any podman IPv6-forwarded ports) and believe it to be an issue with the ip6tables configuration when forwarding IPv6 ports via podman. I noticed the following behavior; see details below.

@rhatdan not sure this issue is cleared up, maybe we could re-open / visit it?

It appears that podman doesn't set up the correct ip6tables rules when doing a forwarding such as

podman run --security-opt seccomp=unconfined --rm -p "10080:10080/tcp" alpine nc -v -l -p 10080 -s 0.0.0.0

opens an ipv4 port

ss -nltp
LISTEN      0      128                            *:10080                                      *:*                   
users:(("conmon",pid=26595,fd=5))

and ends up with NAT rules in the iptables -nvL -t nat output.

[root@thinkcentre ~]# iptables -nvL -t nat
...truncated
Chain CNI-DN-ce80177e875d9b9172ecf (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       10.88.0.0/16         0.0.0.0/0            tcp dpt:10080
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *      *       127.0.0.1            0.0.0.0/0            tcp dpt:10080
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:10080 to:10.88.0.55:10080

Chain CNI-HOSTPORT-DNAT (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 CNI-DN-ce80177e875d9b9172ecf  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* dnat name: "podman" id: "5c50c038383ebc53664751e438a55c00489b50399ff5c1e603bc94304473fb33" */ multiport dports 10080
... truncated

for comparison

podman run --security-opt seccomp=unconfined --rm -p "[::]:10080:10080/tcp" alpine nc -v -l -p 10080 -s ::

opens an ipv6 port

ss -nltp
LISTEN      0      128                         [::]:10080                                   [::]:*                   
users:(("conmon",pid=26351,fd=5),("podman",pid=26257,fd=17),("podman",pid=26257,fd=16))

and one iptables reference

Chain CNI-HOSTPORT-DNAT (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 CNI-DN-3192b43e38b82a13ee2b3  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* dnat name: "podman" id: "e09814c98e029341f08f8c03e544abc1b84b6decfcfbc455d6195cfbfd6cf110" */ multiport dports 10080

and no related ip6tables entries

[root@thinkcentre ~]# ip6tables -nvL -t nat | grep 10080
[root@thinkcentre ~]#

Could it be that the only missing piece is that podman needs to configure the NAT rules in ip6tables as well?

ericzolf commented 2 years ago

@ericzolf did you solve this issue?

No, I gave up and decided to use the host network. Not the most secure approach, and it keeps me from making the service internet-facing, but it's the best I could reach so far.

Luap99 commented 2 years ago

@tdashton You cannot forward IPv6 connections if your container has no IPv6 network. If you need IPv6 support, you need to add an IPv6 subnet to your network config.
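
For anyone hitting this later: with newer Podman releases the IPv6 subnet can be added when the network is created, roughly as sketched below (Podman 4-style flags; older versions differ, so check podman network create --help, and the fd00:dead:beef::/64 ULA is only an example):

# create a dual-stack network
sudo podman network create --ipv6 --subnet 10.89.1.0/24 --subnet fd00:dead:beef::/64 dualstack

# publish the port on the host's IPv4 and IPv6 addresses
sudo podman run --rm --network dualstack -p 8080:80 -p '[::]:8080:80' nginx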