Does podman ps --external show a container with that name? Could be a leftover from Buildah, or an incomplete removal.

podman rm --force 00bdd4123646ff7fe12c51ed4be7b508381b9966c593d928c1bde75912b2859e will get rid of it.
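For reference, the check-and-cleanup sequence being suggested looks roughly like this (a sketch only; the ID is the one reported in this issue, adjust as needed):

# list containers that exist in containers/storage but are unknown to Podman's own database
podman ps --all --external

# force-remove the leftover by the ID shown in the "name is already in use" error
podman rm --force 00bdd4123646ff7fe12c51ed4be7b508381b9966c593d928c1bde75912b2859e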
> Does podman ps --external show a container with that name?

Can't check right now (I will check after the next reproduction).

> podman rm --force 00bdd4123646ff7fe12c51ed4be7b508381b9966c593d928c1bde75912b2859e will get rid of it.

Unfortunately not. After running this command, the issue still exists.

> Could be a leftover from Buildah, or an incomplete removal.

I don't use Buildah in this case.
While I'm trying to reproduce this issue: is it possible to fix Podman in such a way that podman rm will always succeed, even after an incomplete removal?

> While I'm trying to reproduce this issue: is it possible to fix Podman in such a way that podman rm will always succeed, even after an incomplete removal?
That's the idea. Internally, there is an intermediate "removing" state indicating that removal has started but not finished yet.
A friendly reminder that this issue had no activity for 30 days.
I can't reproduce this issue with the same version of Podman again, but I'm still trying.
OK, I am going to close; reopen the issue if you get a reproducer or even get it to happen again.
I'm seeing the same issue across a number of systems pretty regularly on 3.4.2.
The issue persists. I hit it after failing to create an AdGuard Home container because some ports were already in use. Running the container the first time ended with an error and this "ghost" stayed behind; I wasn't able to create another container with that particular name and had to pick a different name, otherwise the error came up again. podman ps -a and podman ps --external showed nothing, and podman rm --force didn't help either.
@mrmac189, do you have a reproducer? Can you share the output of podman info?
@vrothberg, I believe that to reproduce it we must create a new container in detached mode that hits some kind of error during creation. Docker in this situation would still list the container in docker ps -a, but for some reason Podman doesn't, and it becomes a "ghost".

In my case I was creating an AdGuard Home instance in detached mode, but port 53 was unavailable and the run was not successful.
Error: cannot listen on the TCP port: listen tcp4 :53: bind: address already in use
After I solved issue, this error came up:
Error: creating container storage: the container name "adguardhome" is already in use by f82dc9a084dea01459b11305ee695d2867797f94212bbd65c44d412fe463f808. You have to remove that container to be able to reuse that name: that name is already in use
As I said, neither podman ps nor podman rm --force showed anything or helped.
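A minimal sketch of that reproduction idea, assuming something is already bound to port 53 and using placeholder names (this is not a verified reproducer):

# occupy TCP port 53 so that container start fails with "address already in use"
sudo python3 -m http.server 53 &

# try to create/start the container in detached mode; the port mapping should fail
sudo podman run -d --name adguard-test -p 53:53/tcp docker.io/adguard/adguardhome

# check whether a "ghost" entry was left behind
sudo podman ps -a
sudo podman ps -a --external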
host:
  arch: arm64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 98.62
    systemPercent: 0.66
    userPercent: 0.72
  cpus: 2
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    version: "38"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.3.8-200.fc38.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 2037714944
  memTotal: 4087611392
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.5-1.fc38.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.5
      commit: b6f80f766c9a89eb7b1440c0a70ab287434b17ed
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.aarch64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 4087345152
  swapTotal: 4087345152
  uptime: 0h 38m 15.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 0
    stopped: 3
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 26050822144
  graphRootUsed: 11057975296
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.5.1
  Built: 1685123899
  BuiltTime: Fri May 26 19:58:19 2023
  GitCommit: ""
  GoVersion: go1.20.4
  Os: linux
  OsArch: linux/arm64
  Version: 4.5.1
Thanks. I will reopen. @mrmac189, could you share the exact command line arguments you used to create the container?
@vrothberg Sorry, I have also seen that reinstalling Podman apparently helped too. These "ghosts" now appear in podman ps -a and thus can be successfully removed.
Anyway, the command was podman run --name abguardhome --restart unless-stopped -v adg_wdir:/opt/adguardhome/work -v adg_cdir:/opt/adguardhome/conf -p 53:53/tcp -p 53:53/udp -p 67:67/udp -p 80:80/tcp -p 443:443/tcp -p 443:443/udp -p 3000:3000/tcp -p 853:853/tcp -p 784:784/udp -p 853:853/udp -p 8853:8853/udp -p 5443:5443/tcp -p 5443:5443/udp -d adguard/adguardhome
@mrmac189 thanks for sharing. That is quite curious! Did you install a newer version of Podman? Or was it the very same version?
@vrothberg I believe it was the same version, because I installed Podman for the first time just yesterday.

Thanks! This is very hard to analyze without a reproducer; I just have a vague suspicion that there's a race condition when cleaning up after a failed container creation.
Cc: @Luap99 @giuseppe
I cannot get this to reproduce, so it could be something that has been fixed since Podman 4.2.

> I cannot get this to reproduce, so it could be something that has been fixed since Podman 4.2.
@mrmac189 reported it against 4.5.1. I have a "feeling" that it's related to an incomplete cleanup when container creation failed.
Any chance this is fixed in podman 4.6?
I managed to get myself into this problem. I'm not sure whether it's reproducible yet, but this is what I did:

I started VS Code in the morning (I run VS Code in a Podman container), worked in it until late at night, and then shut down my computer without closing the IDE. When I tried to start it again the next morning, I saw this error:
Error: creating container storage: the container name "code_arch" is already in use by ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4. You have to remove that container to be able to reuse that name: that name is already in use
When I try to remove the container with podman rm --force ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4,
I get this error:
WARN[0000] Unmounting container "ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4" while attempting to delete storage: unmounting "/home/grzegorz/.local/share/containers/storage/overlay/fe19dc85d8c95dd6043573511708b17a81766987397902ee5f722abb4f62750f/merged": invalid argument
Error: removing storage for container "ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4": unmounting "/home/grzegorz/.local/share/containers/storage/overlay/fe19dc85d8c95dd6043573511708b17a81766987397902ee5f722abb4f62750f/merged": invalid argument
However, running the command below fixed the problem:
podman rm --storage ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4
I'm on podman 4.6.0
The real question is how you got into this state. One way would be using a different container tool like Buildah, but it is unlikely that you would have named a Buildah container code_arch. My guess is that during the creation of the code_arch container, something blew up between creating the container in the containers/storage database and creating an entry in Podman's database, or something removed it from Podman's database and blew up before removing it from the containers/storage database.

@benoitf Does Podman Desktop have a mechanism to remove a container from storage, even if it is not in Podman's database?
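One way to see that kind of split-brain state is to compare the two databases directly. A rough sketch for a rootless setup; the storage path and the jq field name are assumptions about the default containers/storage layout:

# container IDs that Podman's own database knows about
podman ps -a --no-trunc --format '{{.ID}}' | sort > podman-ids.txt

# container IDs recorded in the containers/storage database
jq -r '.[].id' ~/.local/share/containers/storage/overlay-containers/containers.json | sort > storage-ids.txt

# anything only in the second list is a "ghost" that should show up under podman ps --external
comm -13 podman-ids.txt storage-ids.txt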
@rhatdan we allow removing containers that are listed when calling the /containers/json REST API.

We also allow running the 'prune' command on containers, but I don't think a container can be displayed if it's not in Podman's database.
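For completeness, that listing can also be queried by hand over the API socket. A sketch assuming a rootless socket path and that the installed version accepts the external query parameter on the libpod list endpoint:

# list all containers, including external/storage-only ones, via the libpod REST API
curl -s --unix-socket /run/user/1000/podman/podman.sock \
  'http://d/v4.0.0/libpod/containers/json?all=true&external=true' | jq .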
Hi @rhatdan, I managed to get into this state again by doing the same thing as before: I shut down my computer without closing the IDE. The only difference is that this time I was able to remove the container by ID (it showed up when listing external containers):
$ podman run -d --rm \
--shm-size 2g \
--network host \
--name "code_arch" \
--userns=keep-id \
--security-opt label=type:container_runtime_t \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /dev/dri:/dev/dri \
-v "/home/grzegorz"/.Xauthority:"/home/grzegorz/.Xauthority":Z \
--device /dev/video0 \
-e DISPLAY \
-e XAUTHORITY \
-v /tmp/xauth_UbPEjD:/tmp/xauth_UbPEjD \
-v /etc/machine-id:/etc/machine-id \
-v "/home/grzegorz"/.config/pulse/cookie:/home/grzegorz/.config/pulse/cookie \
-v /run/user/1000/pulse:/run/user/1000/pulse \
-v /var/lib/dbus:/var/lib/dbus \
--device /dev/snd \
-e PULSE_SERVER=unix:/run/user/1000/pulse/native \
-v /run/user/1000/pulse/native:/run/user/1000/pulse/native \
x11_arch
Error: creating container storage: the container name "code_arch" is already in use by 70edb2b748adfe3422271f35b4391da6d6f18414f34cb0cf34fcacabd31117c1. You have to remove that container to be able to reuse that name: that name is already in use
$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ podman ps -a --external
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
70edb2b748ad localhost/x11_arch_20230907:latest storage 12 hours ago Storage code_arch
$ podman rm 70edb2b748ad
70edb2b748ad
$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ podman ps -a --external
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
This time I could remove the container without any problem; it somehow made it into the external containers. Removal was not blocked, so my only confusion was that podman ps -a did not show the container.
Is there any info I can collect to help with this issue? It looks like I can reproduce some parts of it.
@mheon Do you think these are all more cases of the BoltDB issues, where we delete a container but the VM crashes before storage is removed?
The last one here from @grzegorzk looks like a container that was successfully removed from the Podman DB, but not the c/storage DB; hence, podman rm was able to evict it as an external container. From the original error, the container was still mounted in c/storage (and our attempts to unmount for removal evidently failed?), which explains why it failed to remove. So this isn't really a Bolt issue so much as an ordinary failure to remove, and much less worrying than the Bolt issues (where the database gets into a sufficiently bad place that we cannot fix it without a system reset).
For the original issue... I want to say that SQLite is probably a lot more resistant to random corruption, given what we saw around Bolt's (lack of) ACID guarantees earlier this year.
@giuseppe could your recent storage fixes have solved this issue?
Hey there,

I think I am running into this issue.

I upgraded my system to Fedora 38 yesterday. This morning a few users can't use my system any more, and I am seeing this kind of error:
# podman run --name python-mooc-x-43dde2c62132c57c9e3f6c42c4f40e28 -p 54937:8888 --user root --rm --detach --memory=12g <tons-of-further-options>
Error: creating container storage: the container name "python-mooc-x-43dde2c62132c57c9e3f6c42c4f40e28" is already in use by 95613d3ca8728bd745235b43d2fe823b3a18612013755d8becb5868c8e980001. You have to remove that container to be able to reuse that name: that name is already in use
but
# podman inspect python-mooc-x-43dde2c62132c57c9e3f6c42c4f40e28
[]
Error: no such object: "python-mooc-x-43dde2c62132c57c9e3f6c42c4f40e28"
# podman inspect 95613d3ca8728bd745235b43d2fe823b3a18612013755d8becb5868c8e980001
[]
Error: no such object: "95613d3ca8728bd745235b43d2fe823b3a18612013755d8becb5868c8e980001"
my installed version
# rpm -qa | grep podman
podman-4.7.0-1.fc38.x86_64
podman-plugins-4.7.0-1.fc38.x86_64
How can I clean the Podman database?

PS: one final note, probably not relevant: I did the upgrade during the day but scheduled the reboot only around midnight, to mitigate the impact of the downtime.
I just found out about podman system prune, so I gave it a go, but it did not clean my database deeply enough and the issue is still there.
podman ps --external will show it, and you should be able to just remove the external container with podman rm <NAME>.
Thanks! This is helpful.

I've been using Podman for quite some time and had never come across an 'external' container. What is that about, and what are the possible causes of my seeing this now?
Usually they are Buildah containers, or they can be caused by a crash. The best thing is just to remove the container.
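As an illustration of the Buildah case (a harmless experiment with arbitrary names, not related to this bug):

# a Buildah working container is not in Podman's database, so it only shows up as "external"
buildah from --name demo-build docker.io/library/alpine:latest
podman ps -a              # does not list demo-build
podman ps --external      # lists demo-build with status "Storage"
podman rm demo-build      # removes the external container (buildah rm demo-build also works)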
> @giuseppe could your recent storage fixes have solved this issue?
I don't think it could have any effect on this issue :/
I just ran into a similar issue.
I tried to recreate a container (using an ansible task so I don't have the exact command handy), but it failed with this error:
Error: unmounting container b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce storage: cleaning up container b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce storage: unmounting container b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce root filesystem: removing mount point "/home/jkik/.local/share/containers/storage/overlay/9acb7f85f2e6577fc19642874cee76e0ed05b2f6c3bdf41fed92608aa2afa4f4/merged": directory not empty
I re-ran that same command and it failed differently:
Error: cleaning up storage: removing container b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce root filesystem: unmounting "/home/jkik/.local/share/containers/storage/overlay/9acb7f85f2e6577fc19642874cee76e0ed05b2f6c3bdf41fed92608aa2afa4f4/merged": invalid argument
Until this point you could still see the containers with podman ps -a, so I tried to remove them. The rm command failed with the same kind of error message, but the next time I ran podman ps -a it was gone, and the creation failed with:
Error: creating container storage: the container name "firefly-iii-mysql" is already in use by b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce. You have to remove that container to be able to reuse that name: that name is already in use
Eventually I found I could use this command to find those phantom containers:
$ podman container list --all --external
...
b868f837d169 docker.io/fireflyiii/core:latest storage 3 months ago Storage firefly-iii
6d1dc1f924a6 docker.io/library/alpine:latest storage 3 months ago Storage firefly-iii-cron
a333b5af2b2a docker.io/fireflyiii/data-importer:latest storage 3 months ago Storage firefly-iii-data-importer
Trying kill -f or rm -f was telling me:
Error: no container with name or ID "b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce" found: no such container
Only podman rm --storage b6d05a6c1021a4d236b9a2d83daea5c040b3f493917989942e84b66826fca0ce worked to unlock this situation (as mentioned in https://github.com/containers/podman/issues/16476#issuecomment-1697405035).
$ podman version
Client: Podman Engine
Version: 4.6.1
API Version: 4.6.1
Go Version: go1.20.10
Built: Sat Nov 18 01:48:31 2023
OS/Arch: linux/amd64
The container was likely created with an older version of podman 3 months ago.
Stumbled upon this too when running some pods with quadlet.
I'm not sure how I arrived in this state, but I'm not using buildah at all if it helps to narrow down the problem.
Finding the container that was in the storage state with podman ps --external, removing it with podman rm --storage <containerid>, and restarting the relevant Quadlet-generated systemd service helped me circumvent this; a rough sketch of those steps is below.

Running podman version 4.6.1 on:
# cat /etc/os-release
NAME="AlmaLinux"
VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
I'm trying to use Podman as a rootless replacement for Docker in an LXC container, and I get this annoying error with just about every compose file. Podman seems really 0.1, not 4.3.1.

To reproduce, just try https://avikdas.com/2023/08/23/containerized-services-on-a-home-server.html in a Proxmox LXC container. Really easy; it happens with every compose file.
> Podman seems really 0.1, not 4.3.1.

Try with a newer version; 4.3.1 was released 1.5 years ago.
It's what Debian Bookworm ships, and I installed it the way https://podman.io/docs/installation says ... but maybe I will try a different container template.
I can confirm that with podman version 4.9.4 on the latest RHEL derivatives the issue seems to be resolved.

Thanks for confirming it, I am closing the issue.
> I managed to get myself into this problem. I'm not sure whether it's reproducible yet, but this is what I did:
>
> I started VS Code in the morning (I run VS Code in a Podman container), worked in it until late at night, and then shut down my computer without closing the IDE. When I tried to start it again the next morning, I saw this error:
>
> Error: creating container storage: the container name "code_arch" is already in use by ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4. You have to remove that container to be able to reuse that name: that name is already in use
>
> When I try to remove the container with podman rm --force ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4, I get this error:
>
> WARN[0000] Unmounting container "ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4" while attempting to delete storage: unmounting "/home/grzegorz/.local/share/containers/storage/overlay/fe19dc85d8c95dd6043573511708b17a81766987397902ee5f722abb4f62750f/merged": invalid argument Error: removing storage for container "ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4": unmounting "/home/grzegorz/.local/share/containers/storage/overlay/fe19dc85d8c95dd6043573511708b17a81766987397902ee5f722abb4f62750f/merged": invalid argument
>
> However, running the command below fixed the problem:
>
> podman rm --storage ae9c17f0a13c76eafa2a9b61255e0813b15023e16f56b40ed1b08fd5a40c8ed4
>
> I'm on podman 4.6.0
I'm on podman 4.9.4, so that step can be done, BUT it always happens.

Reproduction procedure:

Environment: Windows 10, WSL Fedora 39, Podman Desktop v1.10.3, podman v5.1.0

Start ghcr.io/immich-app/immich-machine-learning:release-cuda via podman-compose up -d, and a pod with a group of containers will be started (the config is complex), or just run any CUDA container via podman run -it --gpus (the result is the same, I guess). Then run podman-compose down, and the CUDA container will not be deleted.

Some info:
[user@PC-xxx~]$ sudo podman ps --external
ERRO[0000] Unable to write system event: "write unixgram @00003->/run/systemd/journal/socket: sendmsg: no such file or directory"
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7522cd6e2c5f ghcr.io/immich-app/immich-machine-learning:release-cuda storage About an hour ago Storage immich_machine_learning
[user@PC-xxx~]$ podman rm --force 7522cd6e2c5f
Error: default OCI runtime "crun" not found: invalid argument
[user@PC-xxx~]$ sudo podman rm --force 7522cd6e2c5f
WARN[0000] Unmounting container "7522cd6e2c5f" while attempting to delete storage: removing mount point "/var/lib/containers/storage/overlay/f60fdf295173dc57642170afbbe8acc4330951b5c5d091c6dff5b6874d13be03/merged": directory not empty
Error: removing storage for container "7522cd6e2c5f": removing mount point "/var/lib/containers/storage/overlay/f60fdf295173dc57642170afbbe8acc4330951b5c5d091c6dff5b6874d13be03/merged": directory not empty
[user@PC-xxx~]$ podman rm --storage f60fdf295173dc57642170afbbe8acc4330951b5c5d091c6dff5b6874d13be03
Error: default OCI runtime "crun" not found: invalid argument
[user@PC-xxx~]$ sudo podman rm --storage f60fdf295173dc57642170afbbe8acc4330951b5c5d091c6dff5b6874d13be03
f60fdf295173dc57642170afbbe8acc4330951b5c5d091c6dff5b6874d13be03
[user@PC-xxx~]$ sudo podman rm --force 7522cd6e2c5f
[user@PC-xxx~]$ podman --version
podman version 4.9.4
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
I don't know how, but my device reports that the container doesn't exist, yet at the same time I can't create a container with the same name:

And even after podman rm -f I still have this error. Only podman system reset -f helps.

Steps to reproduce the issue:
I have the following workflow but don't know the exact steps:
Remove Pod
Play kube
Reboot
Also, a power-off could happen at any time; maybe that's the reason.
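A hedged sketch of that workflow, with placeholder names, since the exact steps are unknown:

# approximate workflow described above; pod and file names are placeholders
podman pod rm -f mypod
podman play kube my-workload.yaml
systemctl reboot          # an unexpected power-off may also happen at any point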
Describe the results you received: I have an unremovable container and can't create a new one with the same name.
Describe the results you expected: If podman rm -f does not return an error, it means that a new container with this name can be created.

Additional information you deem important (e.g. issue happens only occasionally):

The issue happens only occasionally. Also, there is a record for my container in storage/overlay-containers/containers.json.

Output of podman version:

Output of podman info:

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)
No (have no idea how to reproduce)
Additional environment details (AWS, VirtualBox, physical, etc.):