containers / podman-compose

a script to run docker-compose.yml using podman
GNU General Public License v2.0

podman-compose down results in panic: runtime error: invalid memory address or nil pointer dereference #616

Open jukefr opened 1 year ago

jukefr commented 1 year ago

This is a repost from https://github.com/containers/podman/issues/17069; it turns out it might be compose-specific.

Issue Description

I am trying to use Traefik with podman (more specifically podman-compose). I am not sure what exactly the issue is, or whether it is even related to Traefik.

Here is what the compose file looks like (abridged):

```yaml
version: "3.3"
services:
  # IMPORTANT
  # Run commands with keep-id to make volume permissions correct and all truly rootless
  # podman-compose --podman-run-args="--userns=keep-id" [...]
  #
  # Forward traffic to right port with
  # iptables -A PREROUTING -t nat -p tcp --dport 80 -j REDIRECT --to-port 1024
  # iptables -A OUTPUT -t nat -p tcp --dport 80 -j REDIRECT --to-port 1024

  ###########################################################################
  # PROXY
  ###########################################################################
  traefik:
    user: "1000:1001"
    image: "docker.io/library/traefik"
    labels:
      - "io.containers.autoupdate=registry"
    restart: always
    command:
      #- "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:1024" # HTTP
      - "--entrypoints.ssh.address=:1025" # GIT SSH
    ports:
      - "1024:1024"
      - "1025:1025"
      - "1026:8080"
    volumes:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
      - "/run/user/1000/podman/podman.sock:/var/run/docker.sock:ro"
    # NOTE
    # Sometimes when shutting down the rootlessport process will hang for some reason
    # sudo lsof -i -P -n | grep $port
    # sudo kill $process_number

  # whoami:
  #   user: "1000:1001"
  #   image: "docker.io/traefik/whoami"
  #   labels:
  #     - "io.containers.autoupdate=registry"
  #     - "traefik.enable=true"
  #     - "traefik.http.routers.whoami.rule=Host(`whoami.localhost`)"
  #     - "traefik.http.routers.whoami.entrypoints=web"
  #     - "traefik.http.services.whoami-juke.loadbalancer.server.port=1024"
  #   command:
  #     - "--port=1024"
  #   restart: always

  ###########################################################################
  # NEXTCLOUD
  ###########################################################################
  # user
  # password
  # database
  # cloud.localhost
  nextcloud_database:
    user: "1000:1001"
    image: "docker.io/library/postgres:alpine"
    labels:
      - "io.containers.autoupdate=registry"
    restart: always
    volumes:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
      - ./resources/postgres_alpine_passwd:/etc/passwd:ro
      - ./volumes/nextcloud_database:/var/lib/postgresql/data:Z
    environment:
      - POSTGRES_DB=database
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password

  nextcloud_server:
    user: "1000:1001"
    depends_on:
      - traefik
      - nextcloud_database
    image: "docker.io/library/nextcloud"
    labels:
      - "io.containers.autoupdate=registry"
      - "traefik.enable=true"
      - "traefik.http.routers.nextcloud_server.rule=Host(`cloud.localhost`)"
      - "traefik.http.routers.nextcloud_server.entrypoints=web"
      - "traefik.http.services.nextcloud_server-juke.loadbalancer.server.port=1024"
    restart: always
    volumes:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
      - ./resources/nextcloud_server_passwd:/etc/passwd:ro
      - ./resources/nextcloud_server_ports.conf:/etc/apache2/ports.conf:ro
      - ./volumes/nextcloud_server:/var/www/html:Z
    hostname: cloud.localhost
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=database
      - POSTGRES_USER=user
      - POSTGRES_HOST=nextcloud_database
      - NEXTCLOUD_TRUSTED_DOMAINS=cloud.localhost
  [...]
```
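
For readers skimming past the YAML: the comments at the top of that file assume a host-side port redirect so the rootless Traefik entrypoint on 1024 can still serve plain HTTP on port 80. Pulled out of the file's comments, those commands are (run them as root; the port numbers are exactly the ones used in this setup):

```bash
# Redirect incoming and locally generated traffic for port 80 to the
# unprivileged Traefik entrypoint on port 1024 (taken verbatim from the
# comments in the compose file above).
iptables -A PREROUTING -t nat -p tcp --dport 80 -j REDIRECT --to-port 1024
iptables -A OUTPUT -t nat -p tcp --dport 80 -j REDIRECT --to-port 1024
```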

Everything seems to work fine when I

podman-compose --podman-run-args="--userns=keep-id" up -d

However when I

podman-compose --podman-run-args="--userns=keep-id" down -v

I get the following error

['podman', '--version', '']
using podman version: 4.3.1
** excluding:  set()
podman stop -t 10 juke_uptime_kuma_server_1
juke_uptime_kuma_server_1
exit code: 0
podman stop -t 10 juke_gitea_server_1
juke_gitea_server_1
exit code: 0
podman stop -t 10 juke_nextcloud_server_1
juke_nextcloud_server_1
exit code: 0
podman stop -t 10 juke_element_server_1
juke_element_server_1
exit code: 0
podman stop -t 10 juke_gitea_database_1
juke_gitea_database_1
exit code: 0
podman stop -t 10 juke_nextcloud_database_1
juke_nextcloud_database_1
exit code: 0
podman stop -t 10 juke_traefik_1
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5564a2c4495b]

goroutine 50 [running]:
os.(*File).Name(...)
    os/file.go:56
github.com/containers/podman/v4/pkg/errorhandling.CloseQuiet(0xc0002bc9a0?)
    github.com/containers/podman/v4/pkg/errorhandling/errorhandling.go:74 +0x5b
github.com/containers/podman/v4/libpod.(*Runtime).setupRootlessPortMappingViaRLK(0xc00027e380, 0xc0003d3b00, {0xc00060f700, 0x3f}, 0xc000604f01?)
    github.com/containers/podman/v4/libpod/networking_slirp4netns.go:581 +0x105d
github.com/containers/podman/v4/libpod.(*Container).setupRootlessNetwork(0xc0003d3b00)
    github.com/containers/podman/v4/libpod/container_internal_linux.go:414 +0x13c
github.com/containers/podman/v4/libpod.(*Container).handleRestartPolicy(0xc0003d3b00, {0x5564a3cc7510, 0xc0000460d0})
    github.com/containers/podman/v4/libpod/container_internal.go:296 +0x445
github.com/containers/podman/v4/libpod.(*Container).Cleanup(0xc0003d3b00, {0x5564a3cc7510, 0xc0000460d0})
    github.com/containers/podman/v4/libpod/container_api.go:726 +0x3dd
github.com/containers/podman/v4/pkg/domain/infra/abi.(*ContainerEngine).ContainerStop.func1(0xc0003d3b00)
    github.com/containers/podman/v4/pkg/domain/infra/abi/containers.go:248 +0x24e
github.com/containers/podman/v4/pkg/parallel/ctr.ContainerOp.func1()
    github.com/containers/podman/v4/pkg/parallel/ctr/ctr.go:28 +0x22
github.com/containers/podman/v4/pkg/parallel.Enqueue.func1()
    github.com/containers/podman/v4/pkg/parallel/parallel.go:67 +0x1ac
created by github.com/containers/podman/v4/pkg/parallel.Enqueue
    github.com/containers/podman/v4/pkg/parallel/parallel.go:56 +0xbe
exit code: 2
podman rm juke_uptime_kuma_server_1
juke_uptime_kuma_server_1
exit code: 0
podman rm juke_gitea_server_1
juke_gitea_server_1
exit code: 0
podman rm juke_nextcloud_server_1
juke_nextcloud_server_1
exit code: 0
podman rm juke_element_server_1
juke_element_server_1
exit code: 0
podman rm juke_gitea_database_1
juke_gitea_database_1
exit code: 0
podman rm juke_nextcloud_database_1
juke_nextcloud_database_1
exit code: 0
podman rm juke_traefik_1
ERRO[0000] Unable to clean up network for container 56582b9b79c1eaf581d92fde3ae67d1e02e2ab8b894cd83c708f6c0820ad4bfc: "unmounting network namespace for container 56582b9b79c1eaf581d92fde3ae67d1e02e2ab8b894cd83c708f6c0820ad4bfc: failed to remove ns path /run/user/1000/netns/netns-8528016b-18be-bf00-27e5-7c4f38f8fe90: remove /run/user/1000/netns/netns-8528016b-18be-bf00-27e5-7c4f38f8fe90: device or resource busy" 
juke_traefik_1
exit code: 0
['podman', 'volume', 'inspect', '--all']

Sometimes, however, it will look like this:

[user@demovm juke]$ podman-compose --podman-run-args="--userns=keep-id" down -v
['podman', '--version', '']
using podman version: 4.3.1
** excluding:  set()
podman stop -t 10 juke_uptime_kuma_server_1
juke_uptime_kuma_server_1
exit code: 0
podman stop -t 10 juke_gitea_server_1
juke_gitea_server_1
exit code: 0
podman stop -t 10 juke_nextcloud_server_1
juke_nextcloud_server_1
exit code: 0
podman stop -t 10 juke_gitea_database_1
juke_gitea_database_1
exit code: 0
podman stop -t 10 juke_nextcloud_database_1
juke_nextcloud_database_1
exit code: 0
podman stop -t 10 juke_traefik_1
ERRO[0001] Unable to clean up network for container 9520a833f7e2230bfa4eafe785d0320efb80f593bdeaf8e62ab78424dbd7e8b3: "unmounting network namespace for container 9520a833f7e2230bfa4eafe785d0320efb80f593bdeaf8e62ab78424dbd7e8b3: failed to remove ns path /run/user/1000/netns/netns-ac16b2fe-694e-4255-bbc6-5810f553715c: remove /run/user/1000/netns/netns-ac16b2fe-694e-4255-bbc6-5810f553715c: device or resource busy" 
juke_traefik_1
exit code: 0
podman rm juke_uptime_kuma_server_1
juke_uptime_kuma_server_1
exit code: 0
podman rm juke_gitea_server_1
juke_gitea_server_1
exit code: 0
podman rm juke_nextcloud_server_1
juke_nextcloud_server_1
exit code: 0
podman rm juke_gitea_database_1
juke_gitea_database_1
exit code: 0
podman rm juke_nextcloud_database_1
juke_nextcloud_database_1
exit code: 0
podman rm juke_traefik_1
juke_traefik_1
exit code: 0
['podman', 'volume', 'inspect', '--all']

At this point, if I run lsof, I see a process that I can kill:

 ~  sudo lsof -i -P -n | grep 1024
rootlessp 36533  user   10u  IPv6 137699      0t0  TCP *:1024 (LISTEN)
 ~  sudo kill 36533
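
For convenience, the manual cleanup above can be wrapped in a small helper script. This is only a sketch around the same lsof/kill calls shown above, nothing Podman-specific, and the script name and usage are made up for illustration:

```bash
#!/usr/bin/env bash
# kill-rootlessport.sh (hypothetical helper): find and kill a lingering
# rootlessport listener on the given port.
#   usage: ./kill-rootlessport.sh 1024
set -euo pipefail

port="${1:?usage: $0 <port>}"

# -t prints PIDs only, -c rootlessp matches the truncated "rootlessp" command
# name seen in the lsof output above, -sTCP:LISTEN limits to listening sockets,
# and -a ANDs the selections together instead of ORing them.
pids="$(sudo lsof -a -t -c rootlessp -i TCP:"$port" -sTCP:LISTEN || true)"

if [ -z "$pids" ]; then
    echo "nothing from rootlessport is listening on port $port"
    exit 0
fi

echo "killing: $pids"
# shellcheck disable=SC2086  # word splitting over the PID list is intended
sudo kill $pids
```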

But doing so still apparently leaves the system thinking that IP addresses are allocated when they shouldn't be, because trying to spin up the services again with

podman-compose --podman-run-args="--userns=keep-id" up -d

results in the following

['podman', '--version', '']
using podman version: 4.3.1
** excluding:  set()
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_traefik_1 -d --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=traefik -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /run/user/1000/podman/podman.sock:/var/run/docker.sock:ro --net juke_default --network-alias traefik -p 1024:1024 -p 1025:1025 -p 1026:8080 -u 1000:1001 --restart always docker.io/library/traefik --api.insecure=true --providers.docker=true --providers.docker.exposedbydefault=false --entrypoints.web.address=:1024 --entrypoints.ssh.address=:1025
9e257843121f1294bfbeec12548cf198c93f8cb032f491d1aec79d7843117b7d
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_nextcloud_database_1 -d --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=nextcloud_database -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/Documents/juke/resources/postgres_alpine_passwd:/etc/passwd:ro -v /home/user/Documents/juke/volumes/nextcloud_database:/var/lib/postgresql/data:Z --net juke_default --network-alias nextcloud_database -u 1000:1001 --restart always docker.io/library/postgres:alpine
f0b6ec0f62e2d492a404ffc90f5ce0ff5a31aaa7fb2e139e79caf4399f177e0d
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_gitea_database_1 -d --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=gitea_database -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/Documents/juke/resources/postgres_alpine_passwd:/etc/passwd:ro -v /home/user/Documents/juke/volumes/gitea_database:/var/lib/postgresql/data:Z --net juke_default --network-alias gitea_database -u 1000:1001 --restart always docker.io/library/postgres:alpine
72354b4dcf75071a511f109e98ea954954e83622a9881a14a6aa29224caf0810
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_element_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.element_server.rule=Host(`chat.localhost`) --label traefik.http.routers.element_server.entrypoints=web --label traefik.http.services.element_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=element_server -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/Documents/juke/resources/element_server_passwd:/etc/passwd:ro -v /home/user/Documents/juke/resources/element_server_nginx.conf:/etc/nginx/conf.d/default.conf:ro -v /home/user/Documents/juke/resources/element_server_config.json:/app/config.json:ro --net juke_default --network-alias element_server -u 1000:1001 --restart always docker.io/vectorim/element-web
7e074a3a9ae23003fdfa6642a48e05507585f6b1fcf45f3e92c91e543ced7e45
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_nextcloud_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.nextcloud_server.rule=Host(`cloud.localhost`) --label traefik.http.routers.nextcloud_server.entrypoints=web --label traefik.http.services.nextcloud_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=nextcloud_server -e POSTGRES_PASSWORD=password -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_HOST=nextcloud_database -e NEXTCLOUD_TRUSTED_DOMAINS=cloud.localhost -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/Documents/juke/resources/nextcloud_server_passwd:/etc/passwd:ro -v /home/user/Documents/juke/resources/nextcloud_server_ports.conf:/etc/apache2/ports.conf:ro -v /home/user/Documents/juke/volumes/nextcloud_server:/var/www/html:Z --net juke_default --network-alias nextcloud_server -u 1000:1001 --hostname cloud.localhost --restart always docker.io/library/nextcloud
687ad8936a6aa6427bcdccd1d02c05acb2a0733229b564ebadcbe1cb0ee0bc5f
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_gitea_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.gitea_server.rule=Host(`code.localhost`) --label traefik.http.routers.gitea_server.entrypoints=web --label traefik.http.services.gitea_server-juke.loadbalancer.server.port=1024 --label traefik.tcp.routers.gitea_server_ssh.rule=HostSNI(`*`) --label traefik.tcp.routers.gitea_server_ssh.entrypoints=ssh --label traefik.tcp.services.girea_server_ssh-juke.loadbalancer.server.port=1025 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=gitea_server -e HTTP_PORT=1024 -e DEFAULT_BRANCH=main -e RUN_MODE=prod -e DISABLE_SSH=false -e START_SSH_SERVER=true -e SSH_PORT=1025 -e SSH_LISTEN_PORT=1025 -e ROOT_URL=http://code.localhost -e GITEA__database__DB_TYPE=postgres -e GITEA__database__HOST=gitea_database:5432 -e GITEA__database__NAME=database -e GITEA__database__USER=user -e GITEA__database__PASSWD=password -e GITEA__service__DISABLE_REGISTRATION=true -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/Documents/juke/resources/gitea_server_passwd:/etc/passwd:ro -v /home/user/Documents/juke/volumes/gitea_server:/data:Z --net juke_default --network-alias gitea_server -u 1000:1001 --restart always docker.io/gitea/gitea:latest-rootless
ERRO[0000] IPAM error: failed to get ips for container ID 49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024 on network juke_default 
[ERROR netavark::network::bridge] failed to parse ipam options: no static ips provided
ERRO[0000] IPAM error: failed to find ip for subnet 10.89.0.0/24 on network juke_default 
ERRO[0000] Unable to clean up network for container 49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024: "tearing down network namespace configuration for container 49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024: netavark: failed to delete container veth eth0: Netlink error: No such device (os error 19)" 
Error: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 126
podman start juke_gitea_server_1
ERRO[0000] IPAM error: failed to get ips for container ID 49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024 on network juke_default 
[ERROR netavark::network::bridge] failed to parse ipam options: no static ips provided
ERRO[0000] IPAM error: failed to find ip for subnet 10.89.0.0/24 on network juke_default 
ERRO[0000] Unable to clean up network for container 49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024: "tearing down network namespace configuration for container 49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024: netavark: failed to delete container veth eth0: Netlink error: No such device (os error 19)" 
Error: unable to start container "49493f4c0bc56fe1bc48ab5cdc113ca5d56a5fb369835d144e0a413b041e5024": IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 125
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_uptime_kuma_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.uptime_kuma_server.rule=Host(`status.localhost`) --label traefik.http.routers.uptime_kuma_server.entrypoints=web --label traefik.http.services.uptime_kuma_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/Documents/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=uptime_kuma_server -e PUID=1000 -e PGID=1001 -e PORT=1024 -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/Documents/juke/resources/uptime_kuma_server_passwd:/etc/passwd:ro -v /home/user/Documents/juke/volumes/uptime_kuma_server:/app/data:Z --net juke_default --network-alias uptime_kuma_server -u 1000:1001 --restart always --entrypoint ["node", "/app/server/server.js"] docker.io/louislam/uptime-kuma
ERRO[0000] IPAM error: failed to get ips for container ID 1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6 on network juke_default 
[ERROR netavark::network::bridge] failed to parse ipam options: no static ips provided
ERRO[0000] IPAM error: failed to find ip for subnet 10.89.0.0/24 on network juke_default 
ERRO[0000] Unable to clean up network for container 1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6: "tearing down network namespace configuration for container 1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6: netavark: failed to delete container veth eth0: Netlink error: No such device (os error 19)" 
Error: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 126
podman start juke_uptime_kuma_server_1
ERRO[0000] IPAM error: failed to get ips for container ID 1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6 on network juke_default 
[ERROR netavark::network::bridge] failed to parse ipam options: no static ips provided
ERRO[0000] IPAM error: failed to find ip for subnet 10.89.0.0/24 on network juke_default 
ERRO[0000] Unable to clean up network for container 1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6: "tearing down network namespace configuration for container 1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6: netavark: failed to delete container veth eth0: Netlink error: No such device (os error 19)" 
Error: unable to start container "1c7c25f5e7c6816dc644b93402010e5c85bbed88d38e74b97f5f347e125a2ab6": IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 125

It says that it is failing to find any free IPs.
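
When the network gets into this state, these are the standard Podman commands I would use to see what still claims an address before resorting to a reboot. They are ordinary podman subcommands, not a confirmed fix; whether anything short of a reboot actually releases the leaked addresses is exactly what this issue is about:

```bash
# Anything still listed here (even Exited) may still hold an IPAM lease.
podman ps -a

# Show the network definition, including the 10.89.0.0/24 subnet that IPAM allocates from.
podman network inspect juke_default

# If no container references the network any more, removing it (it is recreated
# on the next `up`) or pruning unused networks is worth trying before a reboot.
podman network rm juke_default
podman network prune
```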

Steps to reproduce the issue

1. Clean Arch install inside of a VM (EndeavourOS in this specific case)
2. yay -S podman podman-compose cni-plugins aardvark-dns wget
3. systemctl --user enable --now podman
4. wget "https://github.com/containers/podman/files/10389570/files.zip"
5. unzip files.zip
6. mv files juke
7. cd juke
8. ./create_dirs.sh
9. podman-compose --podman-run-args="--userns=keep-id" up

Press CTRL-C; the error happens.

Alternatively, start the services in daemon mode (up -d) and destroy them and their volumes in a separate step (down -v); the same error happens.

Describe the results you received

Stopping is not clean: hung processes are left behind and IP addresses stay unavailable. The only way I found to fix it properly is to reboot the entire host.

Describe the results you expected

A clean shutdown: no hung processes that make it impossible to restart the pods because no free IPs are available, and no need to reboot to get it working again.

podman info output

```shell
 ~ podman version
Client:       Podman Engine
Version:      4.3.1
API Version:  4.3.1
Go Version:   go1.19.3
Git Commit:   814b7b003cc630bf6ab188274706c383f9fb9915-dirty
Built:        Sun Nov 20 23:32:45 2022
OS/Arch:      linux/amd64

 ~ podman info
host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.5-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: c9f7f19eb82d5b8151fc3ba7fbbccf03fdcd0325'
  cpuUtilization:
    idlePercent: 90.28
    systemPercent: 1.51
    userPercent: 8.2
  cpus: 8
  distribution:
    distribution: endeavouros
    version: unknown
  eventLogger: journald
  hostname: user-standardpcq35ich92009
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.4-arch1-1
  linkmode: dynamic
  logDriver: journald
  memFree: 4336590848
  memTotal: 8333340672
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.7.2-1
    path: /usr/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 0h 20m 17.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 31523282944
  graphRootUsed: 12035514368
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 6
  runRoot: /run/user/1000/containers
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 1668983565
  BuiltTime: Sun Nov 20 23:32:45 2022
  GitCommit: 814b7b003cc630bf6ab188274706c383f9fb9915-dirty
  GoVersion: go1.19.3
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1
```

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Happens both locally and inside a fresh VM.

Additional information

Ask if anything is unclear. None of the info in here is sensitive; the passwords and such are mostly placeholders, so no need to worry.


Speaking of the fresh VM I mention in my latest reply, there is another variant of this issue I forgot to mention:

  1. I just rebooted the VM
  2. Starting the services goes fine:
[user@demovm juke]$ podman-compose --podman-run-args="--userns=keep-id" up -d
['podman', '--version', '']
using podman version: 4.3.1
** excluding:  set()
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_traefik_1 -d --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=traefik -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /run/user/1000/podman/podman.sock:/var/run/docker.sock:ro --net juke_default --network-alias traefik -p 1024:1024 -p 1025:1025 -p 1026:8080 -u 1000:1001 --restart always docker.io/library/traefik --api.insecure=true --providers.docker=true --providers.docker.exposedbydefault=false --entrypoints.web.address=:1024 --entrypoints.ssh.address=:1025
9520a833f7e2230bfa4eafe785d0320efb80f593bdeaf8e62ab78424dbd7e8b3
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_nextcloud_database_1 -d --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=nextcloud_database -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/postgres_alpine_passwd:/etc/passwd:ro -v /home/user/juke/volumes/nextcloud_database:/var/lib/postgresql/data:Z --net juke_default --network-alias nextcloud_database -u 1000:1001 --restart always docker.io/library/postgres:alpine
37883aa11e7700e5b38232f4dd6660d1c0133d7fca0bc2a5f4394c40ed96e924
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_gitea_database_1 -d --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=gitea_database -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/postgres_alpine_passwd:/etc/passwd:ro -v /home/user/juke/volumes/gitea_database:/var/lib/postgresql/data:Z --net juke_default --network-alias gitea_database -u 1000:1001 --restart always docker.io/library/postgres:alpine
caa368d7fa8a3d5c6f806210f95c86b4efa6d893f288e23fdedc9c883350af6d
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_nextcloud_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.nextcloud_server.rule=Host(`cloud.localhost`) --label traefik.http.routers.nextcloud_server.entrypoints=web --label traefik.http.services.nextcloud_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=nextcloud_server -e POSTGRES_PASSWORD=password -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_HOST=nextcloud_database -e NEXTCLOUD_TRUSTED_DOMAINS=cloud.localhost -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/nextcloud_server_passwd:/etc/passwd:ro -v /home/user/juke/resources/nextcloud_server_ports.conf:/etc/apache2/ports.conf:ro -v /home/user/juke/volumes/nextcloud_server:/var/www/html:Z --net juke_default --network-alias nextcloud_server -u 1000:1001 --hostname cloud.localhost --restart always docker.io/library/nextcloud
5e43479147278511fcf530f5962a20fbad4cd05ae13249d341f78a8e86a8b876
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_gitea_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.gitea_server.rule=Host(`code.localhost`) --label traefik.http.routers.gitea_server.entrypoints=web --label traefik.http.services.gitea_server-juke.loadbalancer.server.port=1024 --label traefik.tcp.routers.gitea_server_ssh.rule=HostSNI(`*`) --label traefik.tcp.routers.gitea_server_ssh.entrypoints=ssh --label traefik.tcp.services.girea_server_ssh-juke.loadbalancer.server.port=1025 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=gitea_server -e HTTP_PORT=1024 -e DEFAULT_BRANCH=main -e RUN_MODE=prod -e DISABLE_SSH=false -e START_SSH_SERVER=true -e SSH_PORT=1025 -e SSH_LISTEN_PORT=1025 -e ROOT_URL=http://code.localhost -e GITEA__database__DB_TYPE=postgres -e GITEA__database__HOST=gitea_database:5432 -e GITEA__database__NAME=database -e GITEA__database__USER=user -e GITEA__database__PASSWD=password -e GITEA__service__DISABLE_REGISTRATION=true -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/gitea_server_passwd:/etc/passwd:ro -v /home/user/juke/volumes/gitea_server:/data:Z --net juke_default --network-alias gitea_server -u 1000:1001 --restart always docker.io/gitea/gitea:latest-rootless
07ccf42e939421663f64e2ecbe31598d6a884c26e8faad9a02959a4a09eaae33
exit code: 0
['podman', 'network', 'exists', 'juke_default']
podman run --userns=keep-id --name=juke_uptime_kuma_server_1 -d --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.uptime_kuma_server.rule=Host(`status.localhost`) --label traefik.http.routers.uptime_kuma_server.entrypoints=web --label traefik.http.services.uptime_kuma_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=uptime_kuma_server -e PUID=1000 -e PGID=1001 -e PORT=1024 -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/uptime_kuma_server_passwd:/etc/passwd:ro -v /home/user/juke/volumes/uptime_kuma_server:/app/data:Z --net juke_default --network-alias uptime_kuma_server -u 1000:1001 --restart always --entrypoint ["node", "/app/server/server.js"] docker.io/louislam/uptime-kuma
725f56c4f78a94c7a39547567e7e4bd87475bf018080e9c04137e1c8c98c0333
exit code: 0
  3. Try to stop them:
    [user@demovm juke]$ podman-compose --podman-run-args="--userns=keep-id" down -v
    ['podman', '--version', '']
    using podman version: 4.3.1
    ** excluding:  set()
    podman stop -t 10 juke_uptime_kuma_server_1
    juke_uptime_kuma_server_1
    exit code: 0
    podman stop -t 10 juke_gitea_server_1
    juke_gitea_server_1
    exit code: 0
    podman stop -t 10 juke_nextcloud_server_1
    juke_nextcloud_server_1
    exit code: 0
    podman stop -t 10 juke_gitea_database_1
    juke_gitea_database_1
    exit code: 0
    podman stop -t 10 juke_nextcloud_database_1
    juke_nextcloud_database_1
    exit code: 0
    podman stop -t 10 juke_traefik_1
    ERRO[0001] Unable to clean up network for container 9520a833f7e2230bfa4eafe785d0320efb80f593bdeaf8e62ab78424dbd7e8b3: "unmounting network namespace for container 9520a833f7e2230bfa4eafe785d0320efb80f593bdeaf8e62ab78424dbd7e8b3: failed to remove ns path /run/user/1000/netns/netns-ac16b2fe-694e-4255-bbc6-5810f553715c: remove /run/user/1000/netns/netns-ac16b2fe-694e-4255-bbc6-5810f553715c: device or resource busy" 
    juke_traefik_1
    exit code: 0
    podman rm juke_uptime_kuma_server_1
    juke_uptime_kuma_server_1
    exit code: 0
    podman rm juke_gitea_server_1
    juke_gitea_server_1
    exit code: 0
    podman rm juke_nextcloud_server_1
    juke_nextcloud_server_1
    exit code: 0
    podman rm juke_gitea_database_1
    juke_gitea_database_1
    exit code: 0
    podman rm juke_nextcloud_database_1
    juke_nextcloud_database_1
    exit code: 0
    podman rm juke_traefik_1
    juke_traefik_1
    exit code: 0
    ['podman', 'volume', 'inspect', '--all']

I have no clue what causes one issue or the other (the panic from the first post) to happen, or why there seem to be two different possible outcomes.

I have updated my original post to include the other error message.


Okay, still on the VM, after another fresh reboot:

If I start the services with

podman-compose --podman-run-args="--userns=keep-id" up

and force a graceful shutdown with a single CTRL-C, it seems to go through (sometimes it errors, sometimes it goes through; I have no clue why):

^CTraceback (most recent call last):
  File "/usr/bin/podman-compose", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.10/site-packages/podman_compose.py", line 1775, in main
    podman_compose.run()
  File "/usr/lib/python3.10/site-packages/podman_compose.py", line 1024, in run
    cmd(self, args)
  File "/usr/lib/python3.10/site-packages/podman_compose.py", line 1248, in wrapped
    return func(*args, **kw)
  File "/usr/lib/python3.10/site-packages/podman_compose.py", line 1442, in compose_up
    thread.join(timeout=1.0)
  File "/usr/lib/python3.10/threading.py", line 1100, in join
    self._wait_for_tstate_lock(timeout=max(timeout, 0))
  File "/usr/lib/python3.10/threading.py", line 1116, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
KeyboardInterrupt
2023-01-11T15:06:17Z [SERVER] INFO: Shutdown requested
2023-01-11T15:06:17Z [SERVER] INFO: Called signal: SIGINT
2023-01-11T15:06:17Z [SERVER] INFO: Stopping all monitors

[user@demovm juke]$ 2023/01/11 16:06:17 ...eful/manager_unix.go:147:handleSignals() [W] [63bed03e-4] PID 2. Received SIGINT. Shutting down...
2023/01/11 16:06:17 cmd/web.go:271:listen() [I] [63bed03e-6] HTTP Listener: 0.0.0.0:1024 Closed
2023/01/11 16:06:17 ...eful/server_hooks.go:47:doShutdown() [I] [63bed03e-6] PID: 2 Listener ([::]:1024) closed.
2023/01/11 16:06:17 .../graceful/manager.go:206:doHammerTime() [W] Setting Hammer condition
2023-01-11 16:06:17.434 CET [1] LOG:  received fast shutdown request
2023-01-11 16:06:17.442 CET [1] LOG:  aborting any active transactions
2023-01-11 16:06:17.444 CET [1] LOG:  background worker "logical replication launcher" (PID 14) exited with exit code 1
2023-01-11 16:06:17.444 CET [9] LOG:  shutting down
[Wed Jan 11 16:06:17.449373 2023] [mpm_prefork:notice] [pid 1] AH00169: caught SIGTERM, shutting down
2023-01-11 16:06:17.450 CET [9] LOG:  checkpoint starting: shutdown immediate
2023-01-11 16:06:17.513 CET [9] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.015 s, sync=0.006 s, total=0.070 s; sync files=2, longest=0.004 s, average=0.003 s; distance=0 kB, estimate=0 kB
2023-01-11 16:06:17.516 CET [1] LOG:  database system is shut down
time="2023-01-11T16:06:17+01:00" level=error msg="accept tcp [::]:8080: use of closed network connection" entryPointName=traefik
time="2023-01-11T16:06:17+01:00" level=error msg="accept tcp [::]:1024: use of closed network connection" entryPointName=web
time="2023-01-11T16:06:17+01:00" level=error msg="accept tcp [::]:1025: use of closed network connection" entryPointName=ssh
time="2023-01-11T16:06:17+01:00" level=error msg="close tcp [::]:8080: use of closed network connection" entryPointName=traefik
time="2023-01-11T16:06:17+01:00" level=error msg="close tcp [::]:1025: use of closed network connection" entryPointName=ssh
time="2023-01-11T16:06:17+01:00" level=error msg="close tcp [::]:1024: use of closed network connection" entryPointName=web
2023-01-11 16:06:17.734 CET [1] LOG:  received fast shutdown request
2023-01-11 16:06:17.745 CET [1] LOG:  aborting any active transactions
2023-01-11 16:06:17.746 CET [1] LOG:  background worker "logical replication launcher" (PID 14) exited with exit code 1
2023-01-11 16:06:17.746 CET [9] LOG:  shutting down
2023-01-11 16:06:17.755 CET [9] LOG:  checkpoint starting: shutdown immediate
2023-01-11 16:06:17.830 CET [9] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.018 s, sync=0.010 s, total=0.084 s; sync files=2, longest=0.006 s, average=0.005 s; distance=0 kB, estimate=0 kB
2023-01-11 16:06:17.833 CET [1] LOG:  database system is shut down
2023/01/11 16:06:18 .../graceful/manager.go:225:doTerminate() [W] Terminating
2023/01/11 16:06:18 ...eful/manager_unix.go:158:handleSignals() [W] PID: 2. Background context for manager closed - context canceled - Shutting down...
2023/01/11 16:06:18 cmd/web.go:140:runWeb() [I] PID: 2 Gitea Web Finished
2023-01-11T15:06:19Z [DB] INFO: Closing the database
2023-01-11T15:06:21Z [DB] INFO: SQLite closed
2023-01-11T15:06:21Z [CLOUDFLARED] INFO: Stop cloudflared
2023-01-11T15:06:21Z [SERVER] INFO: Graceful shutdown successful!

However, when I check with lsof:

[user@demovm juke]$ sudo lsof -i -P -n | grep 1024
rootlessp 3127  user   10u  IPv6  27132      0t0  TCP *:1024 (LISTEN)

The rootlessport process still seems to be there.


Okay, so after trying some more things:

Running all of the commands that podman-compose runs, but manually, with a restart policy, always seems to work (as in, no error).

Here are the commands I used:

```bash
podman run --userns=keep-id --name=juke_traefik_1 --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=traefik -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /run/user/1000/podman/podman.sock:/var/run/docker.sock:ro --net juke_default --network-alias traefik -p 1024:1024 -p 1025:1025 -p 1026:8080 -u 1000:1001 --restart always docker.io/library/traefik --api.insecure=true --providers.docker=true --providers.docker.exposedbydefault=false --entrypoints.web.address=:1024 --entrypoints.ssh.address=:1025 &\
podman run --userns=keep-id --name=juke_nextcloud_database_1 --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=nextcloud_database -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/postgres_alpine_passwd:/etc/passwd:ro -v /home/user/juke/volumes/nextcloud_database:/var/lib/postgresql/data:Z --net juke_default --network-alias nextcloud_database -u 1000:1001 --restart always docker.io/library/postgres:alpine &\
podman run --userns=keep-id --name=juke_gitea_database_1 --label io.containers.autoupdate=registry --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=gitea_database -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/postgres_alpine_passwd:/etc/passwd:ro -v /home/user/juke/volumes/gitea_database:/var/lib/postgresql/data:Z --net juke_default --network-alias gitea_database -u 1000:1001 --restart always docker.io/library/postgres:alpine &\
podman run --userns=keep-id --name=juke_nextcloud_server_1 --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.nextcloud_server.rule="Host(\`cloud.localhost\`)" --label traefik.http.routers.nextcloud_server.entrypoints=web --label traefik.http.services.nextcloud_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=nextcloud_server -e POSTGRES_PASSWORD=password -e POSTGRES_DB=database -e POSTGRES_USER=user -e POSTGRES_HOST=nextcloud_database -e NEXTCLOUD_TRUSTED_DOMAINS=cloud.localhost -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/nextcloud_server_passwd:/etc/passwd:ro -v /home/user/juke/resources/nextcloud_server_ports.conf:/etc/apache2/ports.conf:ro -v /home/user/juke/volumes/nextcloud_server:/var/www/html:Z --net juke_default --network-alias nextcloud_server -u 1000:1001 --restart always --hostname cloud.localhost docker.io/library/nextcloud &\
podman run --userns=keep-id --name=juke_gitea_server_1 --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.gitea_server.rule="Host(\`code.localhost\`)" --label traefik.http.routers.gitea_server.entrypoints=web --label traefik.http.services.gitea_server-juke.loadbalancer.server.port=1024 --label traefik.tcp.routers.gitea_server_ssh.rule="HostSNI(\`*\`)" --label traefik.tcp.routers.gitea_server_ssh.entrypoints=ssh --label traefik.tcp.services.girea_server_ssh-juke.loadbalancer.server.port=1025 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=gitea_server -e HTTP_PORT=1024 -e DEFAULT_BRANCH=main -e RUN_MODE=prod -e DISABLE_SSH=false -e START_SSH_SERVER=true -e SSH_PORT=1025 -e SSH_LISTEN_PORT=1025 -e ROOT_URL=http://code.localhost -e GITEA__database__DB_TYPE=postgres -e GITEA__database__HOST=gitea_database:5432 -e GITEA__database__NAME=database -e GITEA__database__USER=user -e GITEA__database__PASSWD=password -e GITEA__service__DISABLE_REGISTRATION=true -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/gitea_server_passwd:/etc/passwd:ro -v /home/user/juke/volumes/gitea_server:/data:Z --net juke_default --network-alias gitea_server -u 1000:1001 --restart always docker.io/gitea/gitea:latest-rootless &\
podman run --userns=keep-id --name=juke_uptime_kuma_server_1 --label io.containers.autoupdate=registry --label traefik.enable=true --label traefik.http.routers.uptime_kuma_server.rule="Host(\`status.localhost\`)" --label traefik.http.routers.uptime_kuma_server.entrypoints=web --label traefik.http.services.uptime_kuma_server-juke.loadbalancer.server.port=1024 --label io.podman.compose.config-hash=123 --label io.podman.compose.project=juke --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=juke --label com.docker.compose.project.working_dir=/home/user/juke --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=uptime_kuma_server -e PUID=1000 -e PGID=1001 -e PORT=1024 -v /etc/timezone:/etc/timezone:ro -v /usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro -v /home/user/juke/resources/uptime_kuma_server_passwd:/etc/passwd:ro -v /home/user/juke/volumes/uptime_kuma_server:/app/data:Z --net juke_default --network-alias uptime_kuma_server -u 1000:1001 --restart always --entrypoint '["node", "/app/server/server.js"]' docker.io/louislam/uptime-kuma &

podman stop -t 10 juke_uptime_kuma_server_1
podman stop -t 10 juke_gitea_server_1
podman stop -t 10 juke_nextcloud_server_1
podman stop -t 10 juke_gitea_database_1
podman stop -t 10 juke_nextcloud_database_1
podman stop -t 10 juke_traefik_1
podman rm juke_uptime_kuma_server_1
podman rm juke_gitea_server_1
podman rm juke_nextcloud_server_1
podman rm juke_gitea_database_1
podman rm juke_nextcloud_database_1
podman rm juke_traefik_1
```

However, as soon as I do a

podman-compose --podman-run-args="--userns=keep-id" up
CTRL-C
podman-compose --podman-run-args="--userns=keep-id" down -v
['podman', '--version', '']
using podman version: 4.3.1
** excluding:  set()
podman stop -t 10 juke_uptime_kuma_server_1
juke_uptime_kuma_server_1
exit code: 0
podman stop -t 10 juke_gitea_server_1
juke_gitea_server_1
exit code: 0
podman stop -t 10 juke_nextcloud_server_1
juke_nextcloud_server_1
exit code: 0
podman stop -t 10 juke_gitea_database_1
juke_gitea_database_1
exit code: 0
podman stop -t 10 juke_nextcloud_database_1
juke_nextcloud_database_1
exit code: 0
podman stop -t 10 juke_traefik_1
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x55c1faa0695b]

goroutine 13 [running]:
os.(*File).Name(...)
    os/file.go:56
github.com/containers/podman/v4/pkg/errorhandling.CloseQuiet(0xc00028a420?)
    github.com/containers/podman/v4/pkg/errorhandling/errorhandling.go:74 +0x5b
github.com/containers/podman/v4/libpod.(*Runtime).setupRootlessPortMappingViaRLK(0xc00026a540, 0xc000441200, {0xc00021df00, 0x3f}, 0xc0003e1601?)
    github.com/containers/podman/v4/libpod/networking_slirp4netns.go:581 +0x105d
github.com/containers/podman/v4/libpod.(*Container).setupRootlessNetwork(0xc000441200)
    github.com/containers/podman/v4/libpod/container_internal_linux.go:414 +0x13c
github.com/containers/podman/v4/libpod.(*Container).handleRestartPolicy(0xc000441200, {0x55c1fba89510, 0xc000130020})
    github.com/containers/podman/v4/libpod/container_internal.go:296 +0x445
github.com/containers/podman/v4/libpod.(*Container).Cleanup(0xc000441200, {0x55c1fba89510, 0xc000130020})
    github.com/containers/podman/v4/libpod/container_api.go:726 +0x3dd
github.com/containers/podman/v4/pkg/domain/infra/abi.(*ContainerEngine).ContainerStop.func1(0xc000441200)
    github.com/containers/podman/v4/pkg/domain/infra/abi/containers.go:248 +0x24e
github.com/containers/podman/v4/pkg/parallel/ctr.ContainerOp.func1()
    github.com/containers/podman/v4/pkg/parallel/ctr/ctr.go:28 +0x22
github.com/containers/podman/v4/pkg/parallel.Enqueue.func1()
    github.com/containers/podman/v4/pkg/parallel/parallel.go:67 +0x1ac
created by github.com/containers/podman/v4/pkg/parallel.Enqueue
    github.com/containers/podman/v4/pkg/parallel/parallel.go:56 +0xbe
exit code: 2

So this might actually be caused by podman-compose, or I am very unlucky and it only happens when using the podman-compose commands (the error does not occur 100% of the time in the first place).

Should this maybe be moved to the podman-compose repo?

jukefr commented 1 year ago

Anecdotally, removing all of the restart: always entries from the compose file seems to make the issue go away (or at least I was not able to reproduce it; the issue does not happen 100% of the time).
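
That observation lines up with the stack trace, which goes through handleRestartPolicy right before the crash. A quick way to check which containers are actually carrying a restart policy is a plain podman inspect query (nothing specific to this bug; the container name is just the one from this report):

```bash
# Prints e.g. "always", or an empty string if no restart policy is set.
podman container inspect --format '{{.HostConfig.RestartPolicy.Name}}' juke_traefik_1
```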

francoism90 commented 1 year ago

podman start -a mailpit
Error: unable to start container: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 125
podman start -a nginx
Error: unable to start container: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 125

It doesn't seem that podman-compose down stops the instance:

rootlessp 774430        me   11u  IPv6 3344113      0t0  TCP *:80 (LISTEN)

muayyad-alsadi commented 1 year ago

@francoism90 You seem to have many containers; there are no free IPs.

For example, you run a container, then stop it without removing it.

To confirm this, type:

podman ps -a

Just remove those stopped containers.
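
For reference, one way to do that cleanup (review first, then prune; both are standard podman commands):

```bash
# List stopped containers that may still hold an IP lease.
podman ps -a --filter status=exited

# Remove all stopped containers in one go (asks for confirmation).
podman container prune
```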

francoism90 commented 1 year ago

@muayyad-alsadi Thanks for your help; unfortunately, I'm unable to stop all containers:

podman stop $(podman ps -q)

It keeps giving the same error when running podman-compose up again.

To confirm no instances are running:

$ podman ps -a

CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES

The only way to get it working again is to reboot.

francoism90 commented 1 year ago

@muayyad-alsadi More debug:

ERRO[0000] IPAM error: failed to get ips for container ID 09755daf17f0f85b5399e7c4bb264fdfca70c0a82a01494554e6547b6b39efa0 on network bridge 
[ERROR netavark::network::bridge] failed to parse ipam options: no static ips provided
ERRO[0000] IPAM error: failed to find ip for subnet 10.89.0.0/24 on network hub_bridge 
ERRO[0000] Unable to clean up network for container 09755daf17f0f85b5399e7c4bb264fdfca70c0a82a01494554e6547b6b39efa0: "tearing down network namespace configuration for container 09755daf17f0f85b5399e7c4bb264fdfca70c0a82a01494554e6547b6b39efa0: netavark: netavark encountered multiple errors:\n\t- IO error: aardvark pid not found\n\t- failed to delete container veth eth0: Netlink error: No such device (os error 19)" 
Error: unable to start container 09755daf17f0f85b5399e7c4bb264fdfca70c0a82a01494554e6547b6b39efa0: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
exit code: 12

I'm using this:

networks:
  bridge:
    driver: bridge

muayyad-alsadi commented 1 year ago

podman ps -a | wc -l

francoism90 commented 1 year ago

@muayyad-alsadi

$ podman ps -a | wc -l
1

It seems to happen because I'm doing podman-compose up and podman-compose down a lot.

d0b3rm4n commented 1 year ago

I also hit the issue that containers suddenly didn't start anymore (while doing many podman-compose up and podman-compose down cycles), with errors like the ones above all over the place: Error: unable to start container "fd9a50aff4025ada009883a6b67a3229611236191be7d0343d026dc24ab1d8e8": IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254

Deleting the problematic network also didn't help, but what did help was setting it up by hand with a different IP range (not sure how long it will keep working, but at least for the moment it does):

podman-compose --version
['podman', '--version', '']
using podman version: 4.5.0
podman-composer version  1.0.3
podman --version 
podman version 4.5.0

podman network rm reportportal_default
podman network create --driver bridge --subnet 192.168.33.0/24 --gateway 192.168.33.1  reportportal_default
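
To double-check that the hand-created network is the one compose will attach to (same name, new subnet), a plain listing and inspect before bringing the stack back up should show the 192.168.33.0/24 subnet:

```bash
podman network ls
podman network inspect reportportal_default
```
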
francoism90 commented 1 year ago

@d0b3rm4n Yeah, the only workaround seems to be rebooting.

I don't know if any command exists to reset the network stack. It seems to be caused by containers not starting when using depends_on, and/or crashing due to other issues.
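
For what it's worth, two things heavier than pruning but lighter than a full host reboot do exist; both are real podman/systemd commands, though whether they actually clear the leaked IPAM state described in this thread is untested:

```bash
# Restart the rootless Podman API service for the current user
# (the one enabled with `systemctl --user enable --now podman` above).
systemctl --user restart podman.service

# Nuclear option: wipe all containers, images, networks and volumes for this user.
podman system reset
```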