containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman run sometimes fails to open ports #5541

Closed: cognition9144 closed this issue 4 years ago

cognition9144 commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

  1. podman run -d -p 7005:8080 quay.io/keycloak/keycloak:latest

  2. curl 127.0.0.1:7005 -v

Describe the results you received: Very often it gives the following result. Sometimes it behaves normally until the container is restarted, or removed and run again.

❯ podman run -d -p 7005:8080  quay.io/keycloak/keycloak:latest
108ec4724600c8e18f442b27b6b8f77a4bd14fff00bbb04d72bae7480b875c55

❯ curl 127.0.0.1:7005 -v
*   Trying 127.0.0.1:7005...
* Connected to 127.0.0.1 (127.0.0.1) port 7005 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:7005
> User-Agent: curl/7.69.1
> Accept: */*
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

❯ curl 127.0.0.1:7005 -v
*   Trying 127.0.0.1:7005...
* connect to 127.0.0.1 port 7005 failed: Connection refused
* Failed to connect to 127.0.0.1 port 7005: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 127.0.0.1 port 7005: Connection refused

Describe the results you expected: curl returns the HTML of that page

Additional information you deem important (e.g. issue happens only occasionally): This issue happens only occasionally. In particular, it tends to affect the following images:

However, it never affects the following (probably because they run in host network mode):

Notably, both podman and sudo podman trigger this issue. After it fails in rootless mode, running the same image again (even under a different container name) still reproduces it.

However, it doesn't affect other images/containers.
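A rough way to correlate the failure with the host-side forwarder process (a sketch, not part of the original report; containers-rootlessport is the process Podman spawns for rootless published ports, and 7005 matches the reproducer above):

```bash
# When curl starts failing, check whether the rootless port forwarder
# for this user is still alive.
pgrep -af containers-rootlessport || echo "rootlessport process is gone"

# And check whether anything on the host still listens on the published port.
ss -tnlp | grep 7005 || echo "nothing listening on 7005"
```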

Output of podman version:

Version:            1.8.2-dev
RemoteAPI Version:  1
Go Version:         go1.14rc1
Git Commit:         9e7648552cfa128b2195ccf33df718bf4dcb1ef8-dirty
Built:              Sat Aug  4 08:00:00 2018
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: 9e7648552cfa128b2195ccf33df718bf4dcb1ef8-dirty
  go version: go1.14rc1
  podman version: 1.8.2-dev
host:
  BuildahVersion: 1.15.0-dev
  CgroupVersion: v2
  Conmon:
    package: podman-1.8.2-1584388149.git2b2996d0.fc32.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.7, commit: 9e7648552cfa128b2195ccf33df718bf4dcb1ef8-dirty'
  Distribution:
    distribution: fedora
    version: "32"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 196767744
  MemTotal: 4020461568
  OCIRuntime:
    name: crun
    package: crun-0.13-1.fc32.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.13
      commit: e79e4de4ac16da0ce48777afb72c6241de870525
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 4148948992
  SwapTotal: 4164939776
  arch: amd64
  cpus: 4
  eventlogger: file
  hostname: linux
  kernel: 5.6.0-0.rc5.git0.2.fc32.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.3-6.0.dev.gita8414d1.fc32.x86_64
    Version: |-
      slirp4netns version 0.4.3+dev
      commit: a8414d1d1629f6f7a93b60b55e183a93d10d9a1c
  uptime: 27h 50m 55.06s (Approximately 1.12 days)
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/fedora/.config/containers/storage.conf
  ContainerStore:
    number: 27
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.6-2.0.dev.gitce61c7e.fc32.x86_64
      Version: |-
        fusermount3 version: 3.9.0
        fuse-overlayfs: version 0.7.6
        FUSE library version 3.9.0
        using FUSE kernel interface version 7.31
  GraphRoot: /home/fedora/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 25
  RunRoot: /run/user/1000
  VolumePath: /home/fedora/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.8.2-1584388149.git2b2996d0.fc32.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

mheon commented 4 years ago

Does normal network traffic (not connecting to it via the forwarded port, but e.g. curling a file from within the container) work?

AkihiroSuda commented 4 years ago

containers-rootlessport seems to die when a connection occurs from the host before the port is created inside the container.

Minimal reproducer:

A weird thing is that no debug log is printed during the crash. Also, this is not reproducible with rootless Docker, even though the implementation is almost entirely shared with Podman apart from the nsenter handling.
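The race can be exercised roughly like below (a sketch of the scenario just described, not the original minimal reproducer; it reuses the keycloak image and port mapping from the report):

```bash
# Start the container and immediately begin hitting the published port,
# before keycloak has started listening inside the container.
podman run -d -p 7005:8080 quay.io/keycloak/keycloak:latest
for i in $(seq 1 60); do
  curl -s -o /dev/null 127.0.0.1:7005
  sleep 0.5
done
# If containers-rootlessport died, this keeps failing with
# "Connection refused" even after keycloak is up.
curl -v 127.0.0.1:7005
```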

cognition9144 commented 4 years ago

Does normal network traffic (not connecting to it via the forwarded port, but e.g. curling a file from within the container) work?

Yes. From inside such a container it can successfully curl the outside world:

bash-4.4$ curl 1.1.1.1
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>cloudflare-lb</center>
</body>
</html>
cognition9144 commented 4 years ago

Another thing to note: when this problem occurs, the network interface cni-podman0 is not present. But today, when I recreated the pod, port mapping magically worked, and I later found a network interface called cni-podman0 on the host.
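For reference, a simple way to check for the interface (sketch; ip link is a generic host tool, nothing Podman-specific):

```bash
# cni-podman0 is the CNI bridge; with rootless slirp4netns networking it
# normally should not exist at all.
ip link show cni-podman0 2>/dev/null || echo "cni-podman0 not present"
```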

mheon commented 4 years ago

Are you certain you're running without root? Can you add --log-level=debug to a podman run command that fails and pastebin the whole output? We should not have any dependencies on CNI when running without root.

giuseppe commented 4 years ago

PR here: https://github.com/containers/libpod/pull/5552

aleks-mariusz commented 4 years ago

Are you certain you're running without root? Can you add --log-level=debug to a podman run command that fails and pastebin the whole output? We should not have any dependencies on CNI when running without root.


I too have this issue. Here's what my --log-level=debug output looks like:

full debug output of my podman run.. ``` pi@raspberrypi:~ $ podman run -p 8080:80 --log-level=debug -d --name octoprint14 --device /dev/ttyUSB0:/dev/ttyUSB0 -v ./octoprint-data:/data docker.io/aleksmariusz/octoprint:1.4.0-20200315 DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/pi/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver vfs DEBU[0000] Using graph root /home/pi/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000/containers DEBU[0000] Using static dir /home/pi/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/pi/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] No store required. Not opening container store. DEBU[0000] Initializing event backend file DEBU[0000] using runtime "/usr/lib/cri-o-runc/sbin/runc" WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument DEBU[0000] Failed to add podman to systemd sandbox cgroup: exec: "dbus-launch": executable file not found in $PATH INFO[0000] running as rootless DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/pi/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver vfs DEBU[0000] Using graph root /home/pi/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000/containers DEBU[0000] Using static dir /home/pi/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/pi/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] [graphdriver] trying provided driver "vfs" DEBU[0000] Initializing event backend file DEBU[0000] using runtime "/usr/lib/cri-o-runc/sbin/runc" WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument DEBU[0000] parsed reference into "[vfs@/home/pi/.local/share/containers/storage+/run/user/1000/containers]docker.io/aleksmariusz/octoprint:1.4.0-20200315" DEBU[0000] parsed reference into "[vfs@/home/pi/.local/share/containers/storage+/run/user/1000/containers]@ed54200e51ae0bada984869bf989ce15b9da688b483b2aaad76d389f68afe6e3" DEBU[0001] exporting opaque data as blob "sha256:ed54200e51ae0bada984869bf989ce15b9da688b483b2aaad76d389f68afe6e3" INFO[0001] Using "amd64" (architecture) on "arm" host DEBU[0001] Adding image volume at /data DEBU[0001] User mount ./octoprint-data:/data options [] DEBU[0001] Using slirp4netns netmode DEBU[0001] No hostname set; container's hostname will default to runtime default DEBU[0001] Loading seccomp profile from "/usr/share/containers/seccomp.json" DEBU[0001] Adding mount /proc DEBU[0001] Adding mount /dev DEBU[0001] Adding mount /dev/pts DEBU[0001] Adding mount /dev/mqueue DEBU[0001] Adding mount /sys DEBU[0001] Adding mount /sys/fs/cgroup DEBU[0001] Adding mount /dev/ttyUSB0 DEBU[0001] setting container name octoprint14 DEBU[0001] created OCI spec and options for new container DEBU[0001] parsed reference into "[vfs@/home/pi/.local/share/containers/storage+/run/user/1000/containers]@ed54200e51ae0bada984869bf989ce15b9da688b483b2aaad76d389f68afe6e3" DEBU[0001] exporting opaque data as blob "sha256:ed54200e51ae0bada984869bf989ce15b9da688b483b2aaad76d389f68afe6e3" DEBU[0269] created container "8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" 
DEBU[0269] container "8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" has work directory "/home/pi/.local/share/containers/storage/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata" DEBU[0269] container "8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" has run directory "/run/user/1000/containers/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata" DEBU[0270] New container created "8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" DEBU[0270] container "8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" has CgroupParent "/libpod_parent/libpod-8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" DEBU[0270] mounted container "8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134" at "/home/pi/.local/share/containers/storage/vfs/dir/7c1761570617b7af18233bf1c32bc973ffc82a9ead56dd7e699e89b9183e3a1d" DEBU[0270] Created root filesystem for container 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 at /home/pi/.local/share/containers/storage/vfs/dir/7c1761570617b7af18233bf1c32bc973ffc82a9ead56dd7e699e89b9183e3a1d DEBU[0270] Made network namespace at /run/user/1000/netns/cni-f90a8d46-63ff-c746-30d9-b986047843d7 for container 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 DEBU[0270] slirp4netns command: /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox -c -e 3 -r 4 --netns-type=path /run/user/1000/netns/cni-f90a8d46-63ff-c746-30d9-b986047843d7 tap0 DEBU[0272] rootlessport: time="2020-03-17T13:07:24Z" level=info msg="starting parent driver" DEBU[0272] rootlessport: time="2020-03-17T13:07:24Z" level=info msg="opaque=map[builtin.socketpath:/run/user/1000/libpod/tmp/rootlessport281369407/.bp.sock builtin.readypipepath:/run/user/1000/libpod/tmp/rootlessport281369407/.bp-ready.pipe]" DEBU[0272] rootlessport: time="2020-03-17T13:07:24Z" level=info msg="starting child driver in child netns (\"/proc/self/exe\" [containers-rootlessport-child])" DEBU[0272] rootlessport: time="2020-03-17T13:07:24Z" level=info msg="waiting for initComplete" DEBU[0272] rootlessport: time="2020-03-17T13:07:24Z" level=info msg="initComplete is closed; parent and child established the communication channel" DEBU[0272] rootlessport: time="2020-03-17T13:07:24Z" level=info msg="exposing ports [{8080 80 tcp }]" DEBU[0273] rootlessport is ready DEBU[0273] rootlessport: time="2020-03-17T13:07:25Z" level=info msg=ready time="2020-03-17T13:07:25Z" level=info msg="waiting for exitfd to be closed" DEBU[0273] skipping loading default AppArmor profile (rootless mode) DEBU[0273] /etc/system-fips does not exist on host, not mounting FIPS mode secret DEBU[0273] set root propagation to "rslave" DEBU[0273] Created OCI spec for container 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 at /home/pi/.local/share/containers/storage/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata/config.json DEBU[0273] /usr/libexec/podman/conmon messages will be logged to syslog DEBU[0273] running conmon: /usr/libexec/podman/conmon args="[--api-version 1 -c 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 -u 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 -r /usr/lib/cri-o-runc/sbin/runc -b /home/pi/.local/share/containers/storage/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata -p 
/run/user/1000/containers/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata/pidfile -l k8s-file:/home/pi/.local/share/containers/storage/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog --conmon-pidfile /run/user/1000/containers/vfs-containers/8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/pi/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg file --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134]" WARN[0273] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for cpuset: mkdir /sys/fs/cgroup/cpuset: read-only file system [conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied DEBU[0275] Received: 4104 INFO[0275] Got Conmon PID as 4093 DEBU[0275] Created container 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 in OCI runtime DEBU[0275] Starting container 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 with command [/usr/bin/entry.sh /usr/local/bin/python /usr/local/bin/supervisord -c /etc/supervisor/supervisord.conf] DEBU[0275] Started container 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 ```

I am trying to work out what might have changed to get it into this state, but I'm running out of ideas (I've tried reverting every change since it last worked). I could really use some help figuring out what might have changed and what to check next.

pi@raspberrypi:~ $ podman ps -a
CONTAINER ID  IMAGE                                            COMMAND               CREATED         STATUS             PORTS                 NAMES
8abe143bceb5  docker.io/aleksmariusz/octoprint:1.4.0-20200315  /usr/local/bin/py...  22 minutes ago  Up 18 minutes ago  0.0.0.0:8080->80/tcp  octoprint14

I also checked which ports are being bound inside the container:

pi@raspberrypi:~ $ podman exec octoprint14 netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5000            0.0.0.0:*               LISTEN      13/python
tcp        0      0 127.0.0.1:9001          0.0.0.0:*               LISTEN      1/python
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      -

and that the port inside the container is listening properly..

pi@raspberrypi:~ $ podman exec octoprint14 nc -vz localhost 80
localhost (127.0.0.1:80) open

but outside the container, on the host, the port is not open :-(

pi@raspberrypi:~ $ nc -vz localhost 8080
nc: connect to localhost port 8080 (tcp) failed: Connection refused

Also, I made sure slirp4netns is running (I'm not sure whether there's a difference between using that and rootlesskit for port forwarding, or how to change which implementation is used):

  PID USER      CPU% MEM% S  NI TTY       VIRT   RES   SHR   OOM NLWP   DISK READ  DISK WRITE  UTIME+   STIME+  START Command                                                                                                                                    
    1 root       0.0  1.7 S   0 ?        14952  4008  2644     0    1     no perm     no perm  0:02.95  0:05.31 Jan01 /sbin/init                                                                                                                                 
 4093 pi         0.0  0.7 S   0 ?        14928  1652  1408     7    2    0.00 B/s    0.00 B/s  0:00.00  0:00.06 13:07 `- /usr/libexec/podman/conmon --api-version 1 -c 8abe143bceb537673e5d6ccb2fa0548c2b7c7770efd7715cd5ca885b4972e134 -u 8abe143bceb537673e5d6c
 4104 pi         0.0  5.6 S   0 ?        15804 12964  4500    55    1    0.00 B/s    0.00 B/s  0:12.54  0:01.39 13:07 |  `- /usr/local/bin/python /usr/local/bin/supervisord -c /etc/supervisor/supervisord.conf
 4129 pi         6.2 23.3 S   0 ?        70700 54136  5660   233   20    0.00 B/s    0.00 B/s  4:45.24  0:39.58 13:26 |     `- /usr/local/bin/python /usr/local/bin/octoprint --iknowwhatimdoing --basedir /data
 4128 100100     0.0  1.6 S   0 ?         6140  3788  2772    16    1    0.00 B/s    0.00 B/s  0:00.36  0:00.15 13:26 |     `- /usr/sbin/haproxy -db -f /etc/haproxy/haproxy.cfg
 4071 pi         0.0  1.1 S   0 pts/1     5780  2524  1596    10    1    0.00 B/s    0.00 B/s  0:00.21  0:00.89 13:26 `- /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox -c -e 3 -r 4 --netns-type=path /run/user/1000/netns/cni-f90a8d
  747 pi         0.0  0.5 S   0 ?        41232  1064   888     4    1    0.00 B/s    0.00 B/s  0:00.01  0:00.01 11:35 `- podman
  354 pi         0.0  1.3 S   0 ?        14512  3108  2204    13    1    0.00 B/s    0.00 B/s  0:00.53  0:00.19 11:34 `- /lib/systemd/systemd --user
  369 pi         0.0  1.1 S   0 ?        16420  2492  1032    10    1     no perm     no perm  0:00.00  0:00.00 11:34 |  `- (sd-pam)
I'm using podman 1.8.1:

```
Version:            1.8.1
RemoteAPI Version:  1
Go Version:         go1.11.6
OS/Arch:            linux/arm
```
aleks-mariusz commented 4 years ago
So here's what it looks like when rolled back my rootfs to when it did work ``` pi@raspberrypi:~ $ podman run -p 8080:80 -d --log-level=debug --name octoprint14 --device /dev/ttyUSB0:/dev/ttyUSB0 -v ./octoprint-data:/data docker.io/aleksmariusz/octoprint:1.4.0-202003062053 DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/pi/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver vfs DEBU[0000] Using graph root /home/pi/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000/containers DEBU[0000] Using static dir /home/pi/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/pi/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] No store required. Not opening container store. DEBU[0000] Initializing event backend file DEBU[0000] using runtime "/usr/lib/cri-o-runc/sbin/runc" WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument DEBU[0000] Failed to add podman to systemd sandbox cgroup: dial unix /run/user/1000/bus: connect: no such file or directory INFO[0000] running as rootless DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/pi/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver vfs DEBU[0000] Using graph root /home/pi/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000/containers DEBU[0000] Using static dir /home/pi/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/pi/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] [graphdriver] trying provided driver "vfs" DEBU[0000] Initializing event backend file DEBU[0000] using runtime "/usr/lib/cri-o-runc/sbin/runc" WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument DEBU[0000] parsed reference into "[vfs@/home/pi/.local/share/containers/storage+/run/user/1000/containers]docker.io/aleksmariusz/octoprint:1.4.0-202003062053" DEBU[0000] parsed reference into "[vfs@/home/pi/.local/share/containers/storage+/run/user/1000/containers]@0660822def88823aa898bdf60b0b9f1efcc34fcb4b0135ee79295bca05352ff9" DEBU[0000] exporting opaque data as blob "sha256:0660822def88823aa898bdf60b0b9f1efcc34fcb4b0135ee79295bca05352ff9" INFO[0000] Using "amd64" (architecture) on "arm" host DEBU[0000] Adding image volume at /data DEBU[0000] User mount ./octoprint-data:/data options [] DEBU[0001] Using slirp4netns netmode DEBU[0001] No hostname set; container's hostname will default to runtime default DEBU[0001] Loading seccomp profile from "/usr/share/containers/seccomp.json" DEBU[0001] Adding mount /proc DEBU[0001] Adding mount /dev DEBU[0001] Adding mount /dev/pts DEBU[0001] Adding mount /dev/mqueue DEBU[0001] Adding mount /sys DEBU[0001] Adding mount /sys/fs/cgroup DEBU[0001] Adding mount /dev/ttyUSB0 DEBU[0001] setting container name octoprint14 DEBU[0001] created OCI spec and options for new container DEBU[0001] Allocated lock 0 for container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd DEBU[0001] parsed reference into "[vfs@/home/pi/.local/share/containers/storage+/run/user/1000/containers]@0660822def88823aa898bdf60b0b9f1efcc34fcb4b0135ee79295bca05352ff9" DEBU[0001] exporting opaque data as blob 
"sha256:0660822def88823aa898bdf60b0b9f1efcc34fcb4b0135ee79295bca05352ff9" DEBU[0246] created container "96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" DEBU[0246] container "96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" has work directory "/home/pi/.local/share/containers/storage/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata" DEBU[0246] container "96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" has run directory "/run/user/1000/containers/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata" DEBU[0247] New container created "96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" DEBU[0247] container "96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" has CgroupParent "/libpod_parent/libpod-96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" DEBU[0247] mounted container "96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd" at "/home/pi/.local/share/containers/storage/vfs/dir/a7cdbaff650274d3e767dc863a32897c0442a521ac63d84d53822ad0ee175736" DEBU[0247] Created root filesystem for container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd at /home/pi/.local/share/containers/storage/vfs/dir/a7cdbaff650274d3e767dc863a32897c0442a521ac63d84d53822ad0ee175736 DEBU[0247] Made network namespace at /run/user/1000/netns/cni-4fd6eadc-f81f-e1c6-d4ee-0de3561eb1bd for container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd DEBU[0247] slirp4netns command: /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox -c -e 3 -r 4 --netns-type=path /run/user/1000/netns/cni-4fd6eadc-f81f-e1c6-d4ee-0de3561eb1bd tap0 DEBU[0248] rootlessport: time="2020-03-17T17:53:26Z" level=info msg="starting parent driver" DEBU[0248] rootlessport: time="2020-03-17T17:53:26Z" level=info msg="opaque=map[builtin.socketpath:/run/user/1000/libpod/tmp/rootlessport279464613/.bp.sock builtin.readypipepath:/run/user/1000/libpod/tmp/rootlessport279464613/.bp-ready.pipe]" DEBU[0248] rootlessport: time="2020-03-17T17:53:26Z" level=info msg="starting child driver in child netns (\"/proc/self/exe\" [containers-rootlessport-child])" DEBU[0248] rootlessport: time="2020-03-17T17:53:26Z" level=info msg="waiting for initComplete" DEBU[0249] rootlessport: time="2020-03-17T17:53:27Z" level=info msg="initComplete is closed; parent and child established the communication channel" DEBU[0249] rootlessport: time="2020-03-17T17:53:27Z" level=info msg="exposing ports [{8080 80 tcp }]" DEBU[0250] rootlessport is ready DEBU[0250] rootlessport: time="2020-03-17T17:53:28Z" level=info msg=ready time="2020-03-17T17:53:28Z" level=info msg="waiting for exitfd to be closed" DEBU[0251] skipping loading default AppArmor profile (rootless mode) DEBU[0252] /etc/system-fips does not exist on host, not mounting FIPS mode secret DEBU[0252] set root propagation to "rslave" DEBU[0252] Created OCI spec for container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd at /home/pi/.local/share/containers/storage/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata/config.json DEBU[0252] /usr/libexec/podman/conmon messages will be logged to syslog DEBU[0252] running conmon: /usr/libexec/podman/conmon args="[--api-version 1 -c 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd -u 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd -r /usr/lib/cri-o-runc/sbin/runc -b 
/home/pi/.local/share/containers/storage/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata -p /run/user/1000/containers/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata/pidfile -l k8s-file:/home/pi/.local/share/containers/storage/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog --conmon-pidfile /run/user/1000/containers/vfs-containers/96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/pi/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg file --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd]" WARN[0252] Failed to add conmon to cgroupfs sandbox cgroup: error creating cgroup for blkio: mkdir /sys/fs/cgroup/blkio/libpod_parent: permission denied [conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied DEBU[0254] Received: 1883 INFO[0254] Got Conmon PID as 1872 DEBU[0254] Created container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd in OCI runtime DEBU[0254] Starting container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd with command [/usr/bin/entry.sh /usr/local/bin/python /usr/local/bin/supervisord -c /etc/supervisor/supervisord.conf] DEBU[0255] Started container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd ```

I've looked through the differences in the debug-level logs, and the only thing I noticed is that this line is missing when it's not working:

DEBU[0001] Allocated lock 0 for container 96fc8003e676cc1d1e7a6c4f094635448dddbf454679f09cabb2cda0080157bd

Is that just a red herring, or does it mean something?

When it's working, querying on the host shows the expected listener:

$ sudo ss -tnlp
State                      Recv-Q                     Send-Q                                          Local Address:Port                                            Peer Address:Port                                                                            
LISTEN                     0                          128                                                 127.0.0.1:8125                                                 0.0.0.0:*                         users:(("netdata",pid=246,fd=26))                     
LISTEN                     0                          128                                                   0.0.0.0:19999                                                0.0.0.0:*                         users:(("netdata",pid=246,fd=4))                      
LISTEN                     0                          128                                                   0.0.0.0:8080                                                 0.0.0.0:*                         users:(("exe",pid=1016,fd=10))                        
LISTEN                     0                          128                                                   0.0.0.0:22                                                   0.0.0.0:*                         users:(("sshd",pid=277,fd=3))

Pid 1016 is:

  PID USER      CPU% MEM% S  NI TTY       VIRT   RES   SHR   OOM NLWP   DISK READ  DISK WRITE  UTIME+   STIME+  START Command                                                                                                                                    
 1016 pi         0.0  3.7 S   0 pts/0     881M  8472  3496    36    9    0.00 B/s    0.00 B/s  0:00.70  0:00.20 17:36 `- containers-rootlessport
 1023 pi         0.0  3.6 S   0 pts/0     873M  8456  3452    36    8    0.00 B/s    0.00 B/s  0:00.59  0:00.23 17:36 |  `- containers-rootlessport-child

Any ideas on how to debug why containers-rootlessport is not even running when the issue happens?
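One crude way to watch for this (untested sketch, polling generic host tools while the container restarts):

```bash
# Poll once a second and note when the forwarder process and the host
# listener on the published port (8080 here) appear or disappear.
while true; do
  date +%T
  pgrep -af containers-rootlessport || echo "  forwarder not running"
  ss -tnl | grep -w 8080            || echo "  nothing listening on 8080"
  sleep 1
done
```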

AkihiroSuda commented 4 years ago

@aleks-mariusz Could you confirm whether https://github.com/containers/libpod/pull/5552 solves the issue?

cognition9144 commented 4 years ago

Are you certain you're running without root? Can you add --log-level=debug to a podman run command that fails and pastebin the whole output? We should not have any dependencies on CNI when running without root.

Yes. I realize that the two are not actually correlated. Maybe the interface is simply created by sudo podman run at some point.

And the debug log is here ```bash ❯ podman pod create --name=calibre-web --share net -p 8083:8083 --log-level=debug DEBU[0000] Reading configuration file "/home/fedora/.config/containers/libpod.conf" DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/fedora/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver overlay DEBU[0000] Using graph root /home/fedora/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000 DEBU[0000] Using static dir /home/fedora/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/fedora/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] No store required. Not opening container store. DEBU[0000] Initializing event backend journald DEBU[0000] using runtime "/usr/bin/runc" DEBU[0000] using runtime "/usr/bin/crun" WARN[0000] Error initializing configured OCI runtime kata-runtime: no valid executable found for OCI runtime kata-runtime: invalid argument WARN[0000] Error initializing configured OCI runtime kata-qemu: no valid executable found for OCI runtime kata-qemu: invalid argument WARN[0000] Error initializing configured OCI runtime kata-fc: no valid executable found for OCI runtime kata-fc: invalid argument DEBU[0000] using runtime "/usr/bin/crun" INFO[0000] running as rootless DEBU[0000] Reading configuration file "/home/fedora/.config/containers/libpod.conf" DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/fedora/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver overlay DEBU[0000] Using graph root /home/fedora/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000 DEBU[0000] Using static dir /home/fedora/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/fedora/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] [graphdriver] trying provided driver "overlay" DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false DEBU[0000] Initializing event backend journald DEBU[0000] using runtime "/usr/bin/runc" DEBU[0000] using runtime "/usr/bin/crun" WARN[0000] Error initializing configured OCI runtime kata-runtime: no valid executable found for OCI runtime kata-runtime: invalid argument WARN[0000] Error initializing configured OCI runtime kata-qemu: no valid executable found for OCI runtime kata-qemu: invalid argument WARN[0000] Error initializing configured OCI runtime kata-fc: no valid executable found for OCI runtime kata-fc: invalid argument DEBU[0000] using runtime "/usr/bin/crun" DEBU[0000] Created cgroup path user.slice/user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice for parent user.slice and name libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826 DEBU[0000] Created cgroup user.slice/user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice DEBU[0000] Got pod cgroup as user.slice/user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice DEBU[0000] parsed reference into 
"[overlay@/home/fedora/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]k8s.gcr.io/pause:3.1" DEBU[0000] parsed reference into "[overlay@/home/fedora/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]@da86e6ba6ca197bf6bc5e9d900febd906b133eaa4750e6bed647b0fbe50ed43e" DEBU[0000] exporting opaque data as blob "sha256:da86e6ba6ca197bf6bc5e9d900febd906b133eaa4750e6bed647b0fbe50ed43e" DEBU[0000] Using ["/pause"] as infra container entrypoint DEBU[0000] Allocated lock 5 for container 8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8 DEBU[0000] parsed reference into "[overlay@/home/fedora/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]@da86e6ba6ca197bf6bc5e9d900febd906b133eaa4750e6bed647b0fbe50ed43e" DEBU[0000] exporting opaque data as blob "sha256:da86e6ba6ca197bf6bc5e9d900febd906b133eaa4750e6bed647b0fbe50ed43e" DEBU[0000] created container "8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8" DEBU[0000] container "8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8" has work directory "/home/fedora/.local/share/containers/storage/overlay-containers/8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8/userdata" DEBU[0000] container "8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8" has run directory "/run/user/1000/overlay-containers/8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8/userdata" 99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826 ❯ podman run --name=calibre-web_1 --log-level=debug -d --pod=calibre-web technosoft2000/calibre-web DEBU[0000] Reading configuration file "/home/fedora/.config/containers/libpod.conf" DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/fedora/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver overlay DEBU[0000] Using graph root /home/fedora/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000 DEBU[0000] Using static dir /home/fedora/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/fedora/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] No store required. Not opening container store. 
DEBU[0000] Initializing event backend journald DEBU[0000] using runtime "/usr/bin/runc" DEBU[0000] using runtime "/usr/bin/crun" WARN[0000] Error initializing configured OCI runtime kata-runtime: no valid executable found for OCI runtime kata-runtime: invalid argument WARN[0000] Error initializing configured OCI runtime kata-qemu: no valid executable found for OCI runtime kata-qemu: invalid argument WARN[0000] Error initializing configured OCI runtime kata-fc: no valid executable found for OCI runtime kata-fc: invalid argument DEBU[0000] using runtime "/usr/bin/crun" INFO[0000] running as rootless DEBU[0000] Reading configuration file "/home/fedora/.config/containers/libpod.conf" DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/fedora/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver overlay DEBU[0000] Using graph root /home/fedora/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000 DEBU[0000] Using static dir /home/fedora/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/fedora/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] [graphdriver] trying provided driver "overlay" DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false DEBU[0000] Initializing event backend journald DEBU[0000] using runtime "/usr/bin/runc" DEBU[0000] using runtime "/usr/bin/crun" WARN[0000] Error initializing configured OCI runtime kata-runtime: no valid executable found for OCI runtime kata-runtime: invalid argument WARN[0000] Error initializing configured OCI runtime kata-qemu: no valid executable found for OCI runtime kata-qemu: invalid argument WARN[0000] Error initializing configured OCI runtime kata-fc: no valid executable found for OCI runtime kata-fc: invalid argument DEBU[0000] using runtime "/usr/bin/crun" DEBU[0000] parsed reference into "[overlay@/home/fedora/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]docker.io/technosoft2000/calibre-web:latest" DEBU[0000] parsed reference into "[overlay@/home/fedora/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]@39d7ea6665fef95bc98da6bef15f423d87351295c7841c477d483536e161f471" DEBU[0000] exporting opaque data as blob "sha256:39d7ea6665fef95bc98da6bef15f423d87351295c7841c477d483536e161f471" DEBU[0000] unable to find flag net DEBU[0000] Adding image volume at /books DEBU[0000] Adding image volume at /calibre-web/app DEBU[0000] Adding image volume at /calibre-web/config DEBU[0000] Adding image volume at /calibre-web/kindlegen DEBU[0000] using container 8db3f7fc46b1c81558bbfd1b14098558f666c707fe502b08b62770666d70a8e8 netmode DEBU[0000] No hostname set; container's hostname will default to runtime default DEBU[0000] Loading seccomp profile from "/etc/crio/seccomp.json" DEBU[0000] setting container name calibre-web_1 DEBU[0000] adding container to pod calibre-web DEBU[0000] created OCI spec and options for new container DEBU[0000] Allocated lock 6 for container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 DEBU[0000] parsed reference into 
"[overlay@/home/fedora/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs,overlay.mount_program=/usr/bin/fuse-overlayfs]@39d7ea6665fef95bc98da6bef15f423d87351295c7841c477d483536e161f471" DEBU[0000] exporting opaque data as blob "sha256:39d7ea6665fef95bc98da6bef15f423d87351295c7841c477d483536e161f471" DEBU[0000] created container "e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005" DEBU[0000] container "e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005" has work directory "/home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata" DEBU[0000] container "e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005" has run directory "/run/user/1000/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata" DEBU[0000] Creating new volume f79544ef1de6f3a6c5aad86a299b524a1923202a6c49f2904dd5353109171364 for container DEBU[0000] Validating options for local driver DEBU[0000] Creating new volume 2836875d817908b7f6a05fa14e37e612b210409154cd9794d86bdfa0201b2a3a for container DEBU[0000] Validating options for local driver DEBU[0000] Creating new volume bd9be61fac41d5cb80ed8e743007873ebabc2b14228523b280327367b35fd851 for container DEBU[0000] Validating options for local driver DEBU[0000] Creating new volume 988c1ef905a06942cecf81dcc7c8451cbf1d5ba514f5764b566be2c7f306ac34 for container DEBU[0000] Validating options for local driver DEBU[0000] New container created "e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005" DEBU[0000] container "e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005" has CgroupParent "user.slice/user-1000.slice/user@1000.service/user.slice/user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice/libpod-e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005.scope" DEBU[0000] overlay: mount_data=lowerdir=/home/fedora/.local/share/containers/storage/overlay/l/HKG7RJJ2ZFI44S5GOAWU43SCIQ:/home/fedora/.local/share/containers/storage/overlay/l/RNKSUQ4W7BDRAQUJG7DCX34MXE:/home/fedora/.local/share/containers/storage/overlay/l/4T4UDDAMVMH7EPZNDIWCR4HDUO:/home/fedora/.local/share/containers/storage/overlay/l/T4HZH7GNZFIIQA6LWKR2E3RP67:/home/fedora/.local/share/containers/storage/overlay/l/IPTACJVRCUIOPFE5GXMFRCDXX6:/home/fedora/.local/share/containers/storage/overlay/l/FPKSE3ELV7WHWXFBFJAS7VGMJN:/home/fedora/.local/share/containers/storage/overlay/l/ZO76QSJLMQQKUZ3B2QMAU6HMBB:/home/fedora/.local/share/containers/storage/overlay/l/HGSJ2MTF6NNUAF5OZWBHLRFYLY:/home/fedora/.local/share/containers/storage/overlay/l/LKAPF7L2THYBLSVPVDEHJT6BOD:/home/fedora/.local/share/containers/storage/overlay/l/2OAN7SBMFGTEICYHG2CZVFHDQG:/home/fedora/.local/share/containers/storage/overlay/l/MPCJQ7QM6LLK7B5LDL6REDZO6B:/home/fedora/.local/share/containers/storage/overlay/l/JAKDKEYL6Y27MJB365BFFFEG4A:/home/fedora/.local/share/containers/storage/overlay/l/4KBRTKDMKOYPUWFKGL7UD2JHQT:/home/fedora/.local/share/containers/storage/overlay/l/OCVT64XIW3WDAIYXB5KQDPEK43:/home/fedora/.local/share/containers/storage/overlay/l/FUP4ZYFFQQMLUNY3NCTFVPU2AA,upperdir=/home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/diff,workdir=/home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/work DEBU[0000] mounted container 
"e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005" at "/home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/merged" DEBU[0000] Copying up contents from container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 to volume f79544ef1de6f3a6c5aad86a299b524a1923202a6c49f2904dd5353109171364 DEBU[0000] Copying up contents from container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 to volume 2836875d817908b7f6a05fa14e37e612b210409154cd9794d86bdfa0201b2a3a DEBU[0000] Creating dest directory: /home/fedora/.local/share/containers/storage/volumes/2836875d817908b7f6a05fa14e37e612b210409154cd9794d86bdfa0201b2a3a/_data DEBU[0000] Calling TarUntar(/home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/merged/calibre-web/app, /home/fedora/.local/share/containers/storage/volumes/2836875d817908b7f6a05fa14e37e612b210409154cd9794d86bdfa0201b2a3a/_data) DEBU[0000] TarUntar(/home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/merged/calibre-web/app /home/fedora/.local/share/containers/storage/volumes/2836875d817908b7f6a05fa14e37e612b210409154cd9794d86bdfa0201b2a3a/_data) DEBU[0000] Copying up contents from container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 to volume bd9be61fac41d5cb80ed8e743007873ebabc2b14228523b280327367b35fd851 DEBU[0000] Copying up contents from container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 to volume 988c1ef905a06942cecf81dcc7c8451cbf1d5ba514f5764b566be2c7f306ac34 DEBU[0000] Created root filesystem for container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 at /home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/merged DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode secret DEBU[0000] Setting CGroups for container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 to user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice:libpod:e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 DEBU[0000] Created OCI spec for container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 at /home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/config.json DEBU[0000] /usr/libexec/podman/conmon messages will be logged to syslog DEBU[0000] running conmon: /usr/libexec/podman/conmon args="[--api-version 1 -s -c e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 -u e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 -r /usr/bin/crun -b /home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata -p /run/user/1000/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/pidfile -l k8s-file:/home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog --conmon-pidfile /run/user/1000/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/conmon.pid --exit-command /usr/bin/podman 
--exit-command-arg --root --exit-command-arg /home/fedora/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg /usr/bin/crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005]" INFO[0000] Running conmon under slice user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice and unitName libpod-conmon-e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005.scope [conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied DEBU[0000] Received: 316252 INFO[0000] Got Conmon PID as 316248 DEBU[0000] Created container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 in OCI runtime DEBU[0000] Starting container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 with command [/bin/bash -c /init/start.sh] DEBU[0000] Started container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 ❯ podman restart calibre-web_1 --log-level=debug DEBU[0000] Reading configuration file "/home/fedora/.config/containers/libpod.conf" DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/fedora/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver overlay DEBU[0000] Using graph root /home/fedora/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000 DEBU[0000] Using static dir /home/fedora/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/fedora/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] No store required. Not opening container store. 
DEBU[0000] Initializing event backend journald DEBU[0000] using runtime "/usr/bin/runc" DEBU[0000] using runtime "/usr/bin/crun" WARN[0000] Error initializing configured OCI runtime kata-qemu: no valid executable found for OCI runtime kata-qemu: invalid argument WARN[0000] Error initializing configured OCI runtime kata-fc: no valid executable found for OCI runtime kata-fc: invalid argument WARN[0000] Error initializing configured OCI runtime kata-runtime: no valid executable found for OCI runtime kata-runtime: invalid argument DEBU[0000] using runtime "/usr/bin/crun" INFO[0000] running as rootless DEBU[0000] Reading configuration file "/home/fedora/.config/containers/libpod.conf" DEBU[0000] Using conmon: "/usr/libexec/podman/conmon" DEBU[0000] Initializing boltdb state at /home/fedora/.local/share/containers/storage/libpod/bolt_state.db DEBU[0000] Using graph driver overlay DEBU[0000] Using graph root /home/fedora/.local/share/containers/storage DEBU[0000] Using run root /run/user/1000 DEBU[0000] Using static dir /home/fedora/.local/share/containers/storage/libpod DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp DEBU[0000] Using volume path /home/fedora/.local/share/containers/storage/volumes DEBU[0000] Set libpod namespace to "" DEBU[0000] [graphdriver] trying provided driver "overlay" DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false DEBU[0000] Initializing event backend journald DEBU[0000] using runtime "/usr/bin/runc" DEBU[0000] using runtime "/usr/bin/crun" WARN[0000] Error initializing configured OCI runtime kata-runtime: no valid executable found for OCI runtime kata-runtime: invalid argument WARN[0000] Error initializing configured OCI runtime kata-qemu: no valid executable found for OCI runtime kata-qemu: invalid argument WARN[0000] Error initializing configured OCI runtime kata-fc: no valid executable found for OCI runtime kata-fc: invalid argument DEBU[0000] using runtime "/usr/bin/crun" DEBU[0000] Setting maximum workers to 8 DEBU[0000] Stopping ctr e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 (timeout 10) DEBU[0000] Stopping container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 (PID 316252) DEBU[0000] Sending signal 15 to container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 WARN[0010] Timed out stopping container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005, resorting to SIGKILL DEBU[0010] Sending signal 9 to container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 DEBU[0010] Created root filesystem for container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 at /home/fedora/.local/share/containers/storage/overlay/adb5bcdc56d9a10e90b8807a863a4eeab6dafafe5810cb6f03c6fc517af18707/merged DEBU[0010] Recreating container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 in OCI runtime DEBU[0010] Successfully cleaned up container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 DEBU[0010] /etc/system-fips does not exist on host, not mounting FIPS mode secret DEBU[0010] Setting CGroups for container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 to user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice:libpod:e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 DEBU[0010] Created OCI spec for container 
e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 at /home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/config.json DEBU[0010] /usr/libexec/podman/conmon messages will be logged to syslog DEBU[0010] running conmon: /usr/libexec/podman/conmon args="[--api-version 1 -s -c e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 -u e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 -r /usr/bin/crun -b /home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata -p /run/user/1000/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/pidfile -l k8s-file:/home/fedora/.local/share/containers/storage/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket --log-level debug --syslog --conmon-pidfile /run/user/1000/overlay-containers/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/fedora/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000 --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg /usr/bin/crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005]" INFO[0010] Running conmon under slice user-libpod_pod_99eef75daa6bfb5e2ad615947d4831692efd1996470f740e84a28dcc5d924826.slice and unitName libpod-conmon-e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005.scope [conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied WARN[0010] Failed to add conmon to systemd sandbox cgroup: Unit libpod-conmon-e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005.scope already exists. DEBU[0010] Received: 316608 INFO[0010] Got Conmon PID as 316604 DEBU[0010] Created container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 in OCI runtime DEBU[0010] Starting container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 with command [/bin/bash -c /init/start.sh] DEBU[0010] Started container e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 DEBU[0010] Worker#0 finished job [(*LocalRuntime) Restart func1]/e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 () e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005 ```

Very likely, restarting certain images (e.g. the calibre-web image I referred to earlier in the issue) triggers this.

cognition9144 commented 4 years ago

@aleks-mariusz Could you confirm whether #5552 solves the issue?

Even after podman system migrate, the problem still exists.

The warning WARN[0010] Failed to add conmon to systemd sandbox cgroup: Unit libpod-conmon-e83572eff474c4b2f54e4ccf71bc20ce949016aa0e65c6c1128cbf636c373005.scope already exists. might indicate the problem.

Update: weirdly, if I run with --systemd=false, this warning disappears, but the problem persists.
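For reference, the sequence described above amounts to roughly this (a sketch; the image and port are the ones from the original report):

```
# migrate existing containers after the runtime/config change (what was tried above)
podman system migrate

# re-run the container with the systemd integration disabled;
# the conmon scope warning goes away, but the port still misbehaves
podman run -d --systemd=false -p 7005:8080 quay.io/keycloak/keycloak:latest

# probe the published port
curl -v 127.0.0.1:7005
```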

rhatdan commented 4 years ago

The problem is most likely in iptables and CNI.
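If that is the suspicion, a quick place to look is the NAT rules the CNI portmap plugin installs for rootful containers, and the rootless port forwarder for rootless ones (a sketch; the chain name and helper process name are my assumptions about a typical setup, not taken from this report):

```
# rootful: published ports go through CNI's portmap plugin and iptables
sudo iptables -t nat -nL CNI-HOSTPORT-DNAT 2>/dev/null

# rootless: no host iptables rules are involved; forwarding is done by the
# rootlessport helper, so check whether it is still alive
pgrep -af rootlessport

# either mode: what podman itself thinks the mapping is
podman port <container>
```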

aleks-mariusz commented 4 years ago

@aleks-mariusz Could you confirm whether #5552 solves the issue?

tl;dr - it didn't seem to? not sure I tested it correctly, though (see details below)


so yesterday evening i saw that 1.8.2 was just released, but disappointingly it didn't seem to contain this PR, so it was a bit of a burden to have to (cross-)compile master again (something i'd hoped to put behind me, since i found a proper apt source for my distro/arch that is fairly up to date; 1.8.2 already shows up in its packages). i get that you don't want to include the fix until the issue is confirmed fixed, but if it passes tests it would have been good to include it anyway, since it sounds like more proper handling of SIGPIPE.

so i found some time today just now to give this another test, just want to review my testing methodology first:

  1. run the container (with no connection attempts); the container came up, port forwarding into the container worked, and containers-rootlessport was seen running.
  2. start an app on another system that repeatedly connects to the containerized service from step 1.
  3. podman restart the container.

as the container takes roughly 5 minutes to restart (hey, it's a single-core <1 GHz processor and this is mainly python), the app from step 2 keeps trying to connect before the port-forwarded service is available (confirmed with tcpdump and a ton of SYN/RST back-and-forth traffic during the container restart)
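the client in step 2 doesn't need to be anything special; a loop like this reproduces the constant connection attempts during the restart window (a sketch; host and port are placeholders):

```
# hammer the published port while the container restarts; each refused or
# reset attempt shows up as the SYN/RST traffic seen in tcpdump
while true; do
  curl -sS --max-time 2 http://<host>:<published-port>/ >/dev/null
  sleep 1
done
```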

results? the containers-rootlessport process is not running by the time the service in the container finishes restarting, and the port forwarding does not work. the tcpdump output is a good early indication that the forwarding will not be in place when the container finishes starting, and the missing process confirms it
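one way to confirm that outcome is to compare the forwarder process and the reported mapping before and after the restart (a sketch; the container name is a placeholder):

```
# before the restart: forwarder present, mapping reported
pgrep -af containers-rootlessport
podman port <container>

# after restarting under load: if the forwarder is gone, the published port
# refuses or resets connections even though the container itself is running
pgrep -af containers-rootlessport || echo "rootlessport is not running"
```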

anyway, after several hours i finally got podman compiled off master, and my podman version on the test system is:

pi@raspberrypi:~ $ podman version
Version:            1.8.3-dev
RemoteAPI Version:  1
Go Version:         go1.14
Git Commit:         d927b43350a079c05f54e984838b851bcc2e5931-dirty
Built:              Fri Mar 20 02:36:18 2020
OS/Arch:            linux/arm

However, please note: i only swapped the bin/podman and bin/podman-remote binaries onto my testing system; i am not sure if there's another part of #5552 i should have swapped in as well?

aleks-mariusz commented 4 years ago

oh and p.s. if i restart the container and make sure there are no connection attempts to the forwarded port, the port forwarding does work after the container finishes restarting (i confirm with tcpdump that no connection attempts are being made). i'm glad i at least have a workaround i can use to ensure port forwarding will work.

something about trying to connect to the port-forwarded service before it's listening in the container is causing containers-rootlessport to die (which is why i had hoped that #5552 would fix it).
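that workaround can be automated by gating clients on the port actually accepting connections (a sketch; host, port, and the poll interval are placeholders):

```
# wait until the forwarded service accepts TCP connections
# before pointing any client at it
until nc -z <host> <published-port>; do
  sleep 5
done
echo "port is listening, clients may connect now"
```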

cognition9144 commented 4 years ago

Another interesting observation: pods in good condition have an infrastructure container with an empty MountLabel and an empty ProcessLabel, whereas the infrastructure containers of pods with networking errors are labeled (both kinds of pods bind-mount some directories).

❯ sdiff good bad 
.....
        "MountLabel": "",                     |         "MountLabel": "system_u:object_r:container_file_t:s0:
        "ProcessLabel": "",                   |         "ProcessLabel": "system_u:system_r:container_t:s0:c18

According

rhatdan commented 4 years ago

The ones without SELinux labels will run unconfined. Do you see AVC messages when this happens? grep -i avc /var/log/audit/audit.log
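If auditd is running, ausearch gives the same information with a bit less noise (equivalent checks, assuming the audit tooling is installed):

```
# recent AVC denials only
sudo ausearch -m AVC -ts recent

# or grep the raw log, as suggested above
sudo grep -i avc /var/log/audit/audit.log
```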

cognition9144 commented 4 years ago

Yeah, there is a flood of AVC messages, although I've set SELinux to permissive.

All of them look like this:

type=AVC msg=audit(1584870600.294:1370): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584870900.326:1388): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584871318.497:1409): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584871529.428:1422): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584873600.367:1503): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584873738.695:1515): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584874343.400:1547): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584874948.078:1575): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584875432.620:1598): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1584876763.478:1661): avc:  denied  { sendto } for  pid=1624 comm="auditd" path=007573657264622D3965626163386339636538363763646233313862626233626531326637663664 scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=unix_dgram_socket permissive=1

cognition9144 commented 4 years ago

Pretty sure SELinux accounts for the problem (unix_dgram_socket). Running an otherwise unreachable pod via a generated systemd service automatically makes the pod reachable again (due to different cgroup handling?). The AVCs also disappear.

I don't quite understand why SELinux still blocks things in permissive mode. In addition, I have container-selinux-2:2.125.0-1.fc32.noarch installed.
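Permissive mode should only log denials rather than block anything, so it is worth double-checking which mode is actually in effect, and whether dontaudit rules are hiding the relevant denials (a sketch using standard SELinux tooling, nothing specific to this report):

```
# confirm the current mode (Enforcing vs Permissive)
getenforce
sestatus

# temporarily disable dontaudit rules so hidden denials become visible,
# reproduce the problem, then rebuild the policy with dontaudit re-enabled
sudo semodule -DB
# ... reproduce and inspect the AVCs ...
sudo semodule -B
```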

cognition9144 commented 4 years ago

@rhatdan well it's not really related to #5552 AFAIK

rhatdan commented 4 years ago

These AVCs are about the auditd daemon sending packets? Is your container running as auditd_t?
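One way to check is to look at the SELinux context of the container's main process on the host (a sketch; looking up the PID via podman inspect is just one convenient way to find it):

```
# find the container's PID, then show the SELinux context it runs with
pid=$(podman inspect --format '{{ .State.Pid }}' <container>)
ps -Z -p "$pid"
```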

cognition9144 commented 4 years ago

These AVCs are about the auditd daemon sending packets? Is your container running as auditd_t?

Well, at least I didn't set that explicitly. But yes, all of the unreachable containers for some reason have that label. Maybe it's a bug in container-selinux?

aleks-mariusz commented 4 years ago

May I ask for a review of whether this issue should have been closed?

As I mentioned earlier, #5552 did not seem to address this. I've worked around it by making sure no connections are made to the container while it's still starting (on my slow system that takes several minutes, so there is a wide window for a stray connection to cause rootlessport to end up terminating). @xcffl, I tried to go back through your notes but see the discussion went off on an SELinux tangent (did you mean to imply this issue is related to that?). Could you at least comment on whether #5552 had an effect for you?

mheon commented 4 years ago

Hm. If #5552 did not resolve this, I can reopen.

cognition9144 commented 4 years ago

@aleks-mariusz well, honestly, #5552 has no correlation to my issue AFAIK. My issue, I think, is just due to some random SELinux bugs that container-selinux didn't take care of. Normally I would expect the containers to be labeled but not blocked; however, what I observe is that only unlabeled containers (i.e. ones that run unconfined by SELinux) work.

Do you see SELinux complaints as well?

Currently my workaround is to let systemd manage it. Magically it works. Probably related to the different cgroup handling for conmon that I described in #5605.
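For anyone who wants to try the same workaround, it amounts to generating a unit for the container (or pod) and letting systemd start it (a sketch; the container name and unit path are placeholders, and this assumes a rootless user session):

```
# generate a user unit for the existing container and install it
podman generate systemd --name <container> > ~/.config/systemd/user/<container>.service

# let systemd manage it from now on
systemctl --user daemon-reload
systemctl --user enable --now <container>.service
```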

github-actions[bot] commented 4 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 4 years ago

@mheon @xcffl Has this issue been resolved?

mheon commented 4 years ago

I don't really know - this is AFAIK an issue with rootlesskit. I haven't seen bugs to this effect in a few weeks, which is encouraging, but I was never able to reproduce it myself, so I can't confirm.

aleks-mariusz commented 4 years ago

FWIW, i am now also unable to reproduce on my original host, today running podman 1.9.0, so either it's been fixed or worked around.

the test case was basically that podman run -d took long enough (on the order of minutes) before returning to the prompt to leave time for an errant client to try to connect while networking was still being set up; any connection attempted before the command returned is what caused the port not to listen once the container finally did launch

bottom line: my raspberry pi is as slow as ever :-) (returning from podman run -d still takes well over two minutes, though for unrelated reasons - iscsi over a 100 Mbps interface hanging off a USB 2 bus - so the window is big enough), and i can no longer reproduce with 1.9.0 when trying to establish a connection during that time

rhatdan commented 4 years ago

OK, reopen if it happens again.