containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

static ip address is already allocated, but that container is already deleted #21194

Closed: khain0 closed this issue 5 months ago

khain0 commented 8 months ago

Issue Description

After rebooting the server, a rootless container is started by a systemd user service and fails with: IP address is already allocated.

Steps to reproduce the issue

  1. Prepare the user that will run the container via a systemd user service (a quick verification sketch follows after these steps):
     a) Enable linger for the user: loginctl enable-linger user_name
     b) Delegate a user ID range for child namespaces by adding the line below to both files (vi -o /etc/subuid /etc/subgid):
        user_name:362144:65536

  2. Create an nginx container:
     /usr/bin/podman run \
       --rm \
       --replace \
       -d \
       --name nginx \
       -h nginx \
       -p 443:443 \
       -p 80:80 \
       --network=wind:ip=10.89.0.12 \
       -v /home/windserviceuser/nginx/wind:/var/www/wind:Z \
       -v /home/windserviceuser/nginx/conf:/etc/nginx/conf.d:Z \
       -v /home/windserviceuser/nginx/certs:/etc/ssl/certs:Z \
       nginx

  3. Generate the systemd unit files:
     podman generate systemd --name windserver --new --files

  4. Create the folder and move the unit file into the user's home directory:
     mkdir -p ~/.config/systemd/user/
     mv /home/windserviceuser/container-windserver.service ~/.config/systemd/user/

  5. SSH to the server as the user that will run the container (switching users will not work; you need to establish an ssh session as the user that runs the container via the systemd service).

  6. Start the systemd service:
     systemctl --user enable --now container-windserver.service

  7. Reboot the server
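
A quick verification sketch for steps 1 and 6, assuming the same user_name and unit name as above:

# confirm linger is enabled for the user
loginctl show-user user_name --property=Linger
# confirm the subordinate ID ranges were delegated
grep user_name /etc/subuid /etc/subgid
# confirm the user unit is enabled and see its last state (run inside the user's ssh session)
systemctl --user status container-windserver.service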

Describe the results you received

podman ps -a
CONTAINER ID  IMAGE                           COMMAND               CREATED  STATUS   PORTS                                     NAMES
f35da1d2c2ae  docker.io/library/nginx:latest  nginx -g daemon o...           Created  0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp  nginx

podman start nginx
Error: IPAM error: requested ip address 10.89.0.12 is already allocated to container ID 1e967bae12d2cf693e9e0ed3fa3593c694db671f994444a2d8bf6d0b942cee13

Describe the results you expected

After the server is rebooted, the container should be started automatically by the systemd user service.

podman info output

Red Hat Enterprise Linux release 9.3 (Plow)
podman version 4.6.1

/usr/libexec/podman/aardvark-dns version
{
  "version": "1.7.0",
  "commit": "42349ad219a890de8f447aeae6d580eb8dc3886c",
  "build_time": "2023-07-07T00:00:00+00:00",
  "target": "x86_64-unknown-linux-gnu"
}

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

Cannot use latest upstream release on RHEL production server.

Additional information

Issue occurs randomly; not every reboot ends up with the IP address already allocated. Workaround after the issue is encountered:

rm /tmp/containers-user-1041/containers/networks/ipam.db
ps aux | grep aardvark   # shows /usr/libexec/podman/aardvark-dns --config /tmp/containers-user-1041/containers/networks/aardvark-dns -p 53 run
kill -15 173122
systemctl --user restart container-windserver.service

khain0 commented 8 months ago

Seems this is a regression: https://github.com/containers/podman/issues/15708

Luap99 commented 8 months ago

Is /tmp on a tmpfs and thus empty after the reboot? Also, the path should really be under /run/user/... on a systemd distro; likely when you first used podman as this user you did not log in correctly, so no systemd session was created and podman fell back to /tmp. I recommend you run podman system reset, then run podman info and make sure runroot is set to /run/user/$UID..., then try again.
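
A rough sketch of that sequence (the --format field path and the findmnt check are assumptions, not guaranteed for every podman version):

podman system reset                            # wipes all podman state for this user, including ipam.db
podman info --format '{{.Store.RunRoot}}'      # expect /run/user/$UID/containers
findmnt -no FSTYPE /run/user/"$(id -u)"        # expect tmpfs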

khain0 commented 8 months ago

/tmp is mounted on an LVM partition:

df -h /tmp
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/rhel_rhel9--master-tmp  5.0G   70M  5.0G   2% /tmp

Path is set correctly to /run/user/...:

podman info | grep -i runroot
  runRoot: /run/user/1153650084/containers

Luap99 commented 8 months ago

Your workaround mentions rm /tmp/containers-user-1041/containers/networks/ipam.db, so when the problem happens the db is under /tmp. You have to make sure this is always deleted after boot.

khain0 commented 8 months ago

As I understand it, after performing podman system reset, ipam.db should be created under /run/user/... once the container is started, correct?

Luap99 commented 8 months ago

podman system reset should delete everything podman related, which includes the ipam.db. But system reset is not really important; the important bit is that this directory must be on a tmpfs or be cleared manually on each boot before you run podman. Other critical functionality depends on that fact as well.
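
If the /tmp fallback path cannot be avoided, one possible way to clear it at boot is a systemd-tmpfiles rule (a sketch only; the config file name is hypothetical and the path is the one from the workaround above):

cat > /etc/tmpfiles.d/podman-user-netstate.conf <<'EOF'
# 'R!' removes the path recursively, but only at boot time
R! /tmp/containers-user-1041/containers
EOF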

Luap99 commented 8 months ago

The ipam location should be $XDG_RUNTIME_DIR/containers/networks for rootless assuming defaults are used.
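
A quick way to confirm where the allocations actually live, assuming that default layout:

ls -l "$XDG_RUNTIME_DIR/containers/networks/"   # ipam.db and the aardvark-dns/ entries should be here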

github-actions[bot] commented 7 months ago

A friendly reminder that this issue had no activity for 30 days.

fccagou commented 7 months ago

Got the same problem: IPs are never removed from ipam.db nor from aardvark-dns when containers or networks are stopped/removed/pruned.

Versions

$ podman --version
podman version 4.9.3
$ podman info  --format "{{.Host.NetworkBackendInfo.Path}}"
/usr/lib/podman/netavark
$ podman info  --format "{{.Host.NetworkBackendInfo.Version}}"
netavark 1.10.3

Reproducer


# Create networks
podman network create --label io.podman.test=test_multinetworks --driver bridge --subnet '10.11.11.0/24' test_multi_net11
podman network create --label io.podman.test=test_multinetworks --driver bridge --subnet '10.22.22.0/24' test_multi_net22

# Run busybox in each network
podman run --name=test_run11 --rm -d --label io.podman.test=test_multinetworks --net test_multi_net11 busybox  sh -c 'while : ; do printf -- "test_run11: %s\n" "$(ip r s)"; sleep 1; done'
podman run --name=test_run22 --rm -d --label io.podman.test=test_multinetworks --net test_multi_net22 busybox  sh -c 'while : ; do printf -- "test_run22: %s\n" "$(ip r s)"; sleep 1; done'

# Get the IP addresses (they always increment)
podman inspect test_run11 --format '{{range .NetworkSettings.Networks}}{{.NetworkID}} {{.IPAddress }}{{end}}'
podman inspect test_run22 --format '{{range .NetworkSettings.Networks}}{{.NetworkID}} {{.IPAddress }}{{end}}'

# Run busybox in both networks using static IP addresses
podman run --name=test_run --rm -d --label io.podman.test=test_multinetworks --net test_multi_net11:ip=10.11.11.11 --net test_multi_net22:ip=10.22.22.22  --requires=test_run11,test_run22 busybox  sh -c 'while : ; do printf -- "test_run: %s\n" "$(ip r s)"; sleep 1; done'

It fails with the same error as #15708.

Checks

# Remove all containers and networks
podman stop test_run11 test_run22 
podman network rm test_multi_net11 test_multi_net22

# Dns configuration still there

ls  -1 /run/user/1000/containers/networks/aardvark-dns/test_*
/run/user/1000/containers/networks/aardvark-dns/test_multi_net11
/run/user/1000/containers/networks/aardvark-dns/test_multi_net22

# Show dns entries

cat /run/user/1000/containers/networks/aardvark-dns/test_multi_net11 
10.11.11.1
9e530d00dde7af14cd63da57c05103de1b8a065e11c8c4022e49f231332e28c0 10.11.11.2  test_run11,9e530d00dde7
0d18bdf86dfc3ff50f52d671edad50b3fb555b89294dd6c98d3f369e9c60c400 10.11.11.3  test_run11,0d18bdf86dfc
daed369acb93e9c5b224689d3b96718e766e6aee3b6b9b1ea4f64c1cfe46423c 10.11.11.4  test_run11,daed369acb93
940b91fd5912d46fdb629b6c0343df4aeefcc3472209dd6168c870fd1c82d3f3 10.11.11.5  test_run11,940b91fd5912
...

# all entries are in ipam

boltbrowser -ro /run/user/1000/containers/networks/ipam.db
...

It looks like deallocIPs is never called in ipam.go.

Luap99 commented 7 months ago

@fccagou I don't have much time right now to look into this. If you only run the commands listed there, then I don't see how it would fail; I tried it real quick and it seems to work fine for me. You could try adding --log-level debug to all commands and look for errors there; also check the journal for errors.
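
For example (the log file name and the grep pattern are only suggestions; recreate test_multi_net11 first if it was already removed):

podman run --log-level debug --name=test_run11 --rm -d --net test_multi_net11 busybox sleep 60 2>&1 | tee test_run11-debug.log
journalctl --since "15 min ago" | grep -iE 'podman|netavark|aardvark'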

fccagou commented 7 months ago

Thx @Luap99

You are right, everything is OK with this reproducer... The error occurs while I'm using podman-compose; I definitely made some mistakes during my testing :thinking:

I'm new to podman, and the interactions between all the components are not yet clear to me. It looks like netavark doesn't receive the teardown message when containers are removed through podman-compose.

I keep looking !

fccagou commented 7 months ago

Got this test reproducing the problem.

# Create the network
podman network --log-level debug create --driver bridge --subnet 10.11.11.0/24 mynet
# Create the container
podman create --log-level debug --name=mytest --network mynet:ip=10.11.11.10 docker.io/library/busybox sleep 10
# Start the container: the first time is OK
podman start --log-level debug mytest
# Restarting the container fails
podman start --log-level debug mytest
...
many debug logs
...
Error: unable to start container "ef9691b9fbbacf0caf385b5c4bbbe9ead8fa7c8bf945cf88e7a20efe71e7ff65": IPAM error: requested ip address 10.11.11.10 is already allocated to container ID ef9691b9fbbacf0caf385b5c4bbbe9ead8fa7c8bf945cf88e7a20efe71e7ff65
time="2024-02-25T17:54:39+01:00" level=debug msg="Shutting down engines"

Here is the script I wrote to produce the full logs:

#!/usr/bin/bash

set -o errexit
#set -o pipefail
set -o nounset

mybanner () {
        printf --  "\n\n================================================================================================================\n"
}

mybanner
podman network --log-level debug create --driver bridge --subnet 10.11.11.0/24 mynet

mybanner
podman create --log-level debug --name=mytest --network mynet:ip=10.11.11.10 docker.io/library/busybox sleep 10

mybanner
podman start --log-level debug mytest

sleep 15

mybanner
podman start --log-level debug mytest

# clean all
podman rm mytest
podman network rm mynet
kill $(pidof aardvark-dns)
rm -rf /run/user/"$(id -u)"/containers/networks/*

fccagou commented 7 months ago

Hmm... OK, I can't start a container with a static IP twice; I have to recreate it.

Is it my lack of knowledge (and am I off topic here), or is this a bug?

Luap99 commented 7 months ago

That should work; this looks like the podman container cleanup process is not working for you. Check the journal for any logs. Before the second start, can you do a podman inspect and provide the full output? Also, if you run podman stop before it, does it work?
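
Something along these lines (the file name is only a suggestion; the --format fields are the usual container inspect ones):

podman inspect mytest > mytest-inspect.json                              # full output to attach
podman inspect mytest --format '{{.State.Status}} {{.State.ExitCode}}'   # quick state summary
podman stop --log-level debug mytest                                     # does an explicit stop free the IP?
podman start --log-level debug mytest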

fccagou commented 7 months ago

It's OK with stop; netavark tears down: simple-reproducer-with-inspect-stop.log

#!/usr/bin/bash

set -o errexit
#set -o pipefail
set -o nounset

mybanner () {
        printf --  "\n\n================================================================================================================\n"
}

mybanner
podman network --log-level debug create --driver bridge --subnet 10.11.11.0/24 mynet

mybanner
podman create --log-level debug --name=mytest --network mynet:ip=10.11.11.10 docker.io/library/busybox sleep 10

mybanner
podman start --log-level debug mytest

mybanner
podman inspect mytest

mybanner
sleep 2

mybanner
podman stop --log-level debug mytest

mybanner
podman start --log-level debug mytest || /bin/true

mybanner
sleep 15

# mybanner
# podman rm mytest
#
# mybanner
# podman create --log-level debug --name=mytest --network mynet:ip=10.11.11.10 docker.io/library/busybox sleep 10

mybanner
podman rm mytest
podman network rm mynet
kill $(pidof aardvark-dns)
rm -rf /run/user/"$(id -u)"/containers/networks/*

Luap99 commented 7 months ago

Please check your journal logs; it seems podman container cleanup is not working at all for you. If that doesn't yield anything, attach strace to the podman run process (strace -f -o log podman run ...); this should show us exactly what it is doing and why it fails.
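
For example (the output file name and the grep pattern are only suggestions):

strace -f -o /tmp/podman-start.strace podman start --log-level debug mytest
grep -E 'rootless-netns|ipam\.db|ENOENT' /tmp/podman-start.strace | tail -n 40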

fccagou commented 7 months ago

Got errors in the journal that look interesting. Full logs

...
févr. 26 19:14:55 l1nux podman[1203637]: 2024-02-26 19:14:55.155650342 +0100 CET m=+0.052614522 container died 97e5b67b8ff3ea386d6cb2ced12b539069d1edd1d063197478c3768d21aa2c2f (image=docker.io/library/busybox:latest, name=mytest)
févr. 26 19:14:55 l1nux /usr/bin/podman[1203637]: time="2024-02-26T19:14:55+01:00" level=debug msg="Cleaning up container 97e5b67b8ff3ea386d6cb2ced12b539069d1edd1d063197478c3768d21aa2c2f"
févr. 26 19:14:55 l1nux /usr/bin/podman[1203637]: time="2024-02-26T19:14:55+01:00" level=debug msg="Tearing down network namespace at /run/user/1000/netns/netns-c6ed13b8-9615-bac9-d576-130ce0370e3e for container 97e5b67b8ff3ea386d6cb2ced12b539069d1edd1d063197478c3768d21aa2c2f"
févr. 26 19:14:55 l1nux /usr/bin/podman[1203637]: time="2024-02-26T19:14:55+01:00" level=error msg="Unable to clean up network for container 97e5b67b8ff3ea386d6cb2ced12b539069d1edd1d063197478c3768d21aa2c2f: \"getting rootless network namespace: failed to Statfs \\\"/run/user/1000/netns/rootless-netns-8efa473fa70e9f64eba2\\\": no such file or directory\""
...
fccagou commented 7 months ago

For every subsequent test, the same error occurs, referring to the nonexistent file /run/user/1000/netns/rootless-netns-8efa473fa70e9f64eba2

The existing one is:

$ ls -al /run/user/1000/netns/rootless-*
trw-------  1 fccagou fccagou   0 26 févr. 19:03 rootless-netns-7571ed7d42a2d3c62564

$ journalctl --since 19:14:44 | grep rootless-netns-8efa473fa70e9f64eba2
févr. 26 19:14:55 l1nux /usr/bin/podman[1203637]: time="2024-02-26T19:14:55+01:00" level=error msg="Unable to clean up network for container 97e5b67b8ff3ea386d6cb2ced12b539069d1edd1d063197478c3768d21aa2c2f: \"getting rootless network namespace: failed to Statfs \\\"/run/user/1000/netns/rootless-netns-8efa473fa70e9f64eba2\\\": no such file or directory\""
févr. 26 19:29:57 l1nux /usr/bin/podman[1206528]: time="2024-02-26T19:29:57+01:00" level=error msg="Unable to clean up network for container 026f57916f70f908dafed111e439e49003b227cb02db61be907f70cc1930582e: \"getting rootless network namespace: failed to Statfs \\\"/run/user/1000/netns/rootless-netns-8efa473fa70e9f64eba2\\\": no such file or directory\""
févr. 26 19:32:27 l1nux /usr/bin/podman[1207214]: time="2024-02-26T19:32:27+01:00" level=error msg="Unable to clean up network for container 35fa09c7f38809ef50d7e0461a18e85502a6e7693b1d1a01d073493021ad8673: \"getting rootless network namespace: failed to Statfs \\\"/run/user/1000/netns/rootless-netns-8efa473fa70e9f64eba2\\\": no such file or directory\""
févr. 26 19:37:12 l1nux /usr/bin/podman[1208364]: time="2024-02-26T19:37:12+01:00" level=error msg="Unable to clean up network for container 397b6bb4de4f0f8fb33e13a8a41c7906e707b0f8ca8ad5733181435bc2c19c16: \"getting rootless network namespace: failed to Statfs \\\"/run/user/1000/netns/rootless-netns-8efa473fa70e9f64eba2\\\": no such file or directory\""
fccagou commented 7 months ago

After a reboot, the error still occurs, looking for the file rootless-netns-8efa473fa70e9f64eba2. Where does this ID come from?

fccagou commented 7 months ago

Got it in libpod/networking_linux.go

358-    hash := sha256.Sum256([]byte(r.config.Engine.StaticDir))
359:    netnsName := fmt.Sprintf("%s-%x", rootlessNetNsName, hash[:10])

Ok, one step beyond to try to leave the Madness ...

After some code reading, I think the problem appears because my storage directory is a symlink (/home/fccagou/.local/share/containers/storage -> /srv/data/fccagou/containers/storage); the two hashes match the two paths:

$ (id="$(echo -n /home/fccagou/.local/share/containers/storage/libpod  | sha256sum )"; echo ${id:0:20})
8efa473fa70e9f64eba2
$ (id="$(echo -n /srv/data/fccagou/containers/storage/libpod  | sha256sum )"; echo ${id:0:20})
7571ed7d42a2d3c62564

I don't know the source of the error yet, but the value of config.Engine.StaticDir seems to be different when it is used in setUpNetwork and in teardownNetworkBackend.

Too much inception ...

Luap99 commented 7 months ago

Yes, this seems to be the problem for you. Anyhow, the whole rootless-netns code has been rewritten for podman 5.0; it no longer uses the sha of the static dir but just a file in the rundir, so it will not matter whether the path is a symlink or not. I consider this fixed.

I don't think however that is the same problem @khain0 is seeing.

fccagou commented 7 months ago

Ok.

I made a patch for my personal use but no pull request; it's available there.

Thanks for the time

Luap99 commented 5 months ago

I assume this is fixed in 5.0; however, the runroot/tmpdir for podman must still be on a tmpfs so that the ipam db is deleted on reboot.
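
A quick check for that (the --format field path is an assumption and may differ between versions):

runroot="$(podman info --format '{{.Store.RunRoot}}')"   # the ipam.db lives under $runroot/networks/
findmnt -no FSTYPE -T "$runroot"                         # expect: tmpfs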