dustymabe closed this issue 5 years ago
I can't reproduce this with podman 1.0.0 on Fedora Atomic Host, podman 1.1.2 on Fedora Silverblue 29, or podman 1.1.2 on Arch Linux.
Any chance you are running as root? I'm running rootless.
$ podman run -it --rm registry.fedoraproject.org/fedora:29
Trying to pull registry.fedoraproject.org/fedora:29...
Getting image source signatures
Copying blob 8dba660c242f [======================================] 92.5MiB / 92.5MiB
Copying config 81174df11a [======================================] 1.3KiB / 1.3KiB
Writing manifest to image destination
Storing signatures
[root@62f6d33070e0 /]#
[root@8f825a2f1462 /]# [root@62f6d33070e0 /]#
bash: [root@62f6d33070e0: command not found
[root@8f825a2f1462 /]# exit
exit
$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ rpm -q podman
podman-1.1.2-1.git0ad9b6b.fc29.x86_64
Does not look like an issue in podman 1.2
My tests were also rootless. Can you actually remove that container manually (right after it was not autoremoved)?
anything you all would like me to do to give you debug information?
Can you actually remove that container manually (right after it was not autoremoved)?
yes. I don't have any problems removing it. It just doesn't get cleaned up like it should
@dustymabe There was a potential race condition we fixed recently where containers were not removed by the time the podman run command completed, but shortly afterwards. Is there any significant delay between run and ps when you test this? If you wait a few seconds before running ps, has the container been removed?
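A minimal sketch of how the "wait a few seconds, then check again" test could be scripted. The helper names container_exists and wait_for_removal and the PODMAN override variable are hypothetical, introduced only for illustration. The sketch assumes the container was started with an explicit --name so it can be looked up afterwards.

```shell
#!/bin/sh
# Sketch: poll `podman ps -a` until a named container disappears.
# PODMAN is a hypothetical override hook (e.g. for a stub); defaults to podman.
PODMAN="${PODMAN:-podman}"

# container_exists NAME: succeeds if a container with exactly that name is listed.
container_exists() {
    "$PODMAN" ps -a --format '{{.Names}}' | grep -qxF -- "$1"
}

# wait_for_removal NAME [TRIES]: check up to TRIES times (default 10),
# sleeping one second after each failed check.
# Returns 0 once the container is gone, 1 if it is still listed at the end.
wait_for_removal() {
    name=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        container_exists "$name" || return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

For example, after `podman run --rm --name rmtest registry.fedoraproject.org/fedora:29 true`, running `wait_for_removal rmtest 10` returns 0 if the container vanished within roughly ten seconds (delayed cleanup) and 1 if it is still listed (the behaviour reported here).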
Nope. They stay around for a long time it seems. All of these were created with --rm:
[dustymabe@media sync2jira (dusty *%)]$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
dd0a59d67e88 localhost/sync2jira:latest bash 2 hours ago Exited (0) 11 seconds ago xenodochial_ptolemy
ab290c106a67 localhost/sync2jira:latest bash 2 hours ago Exited (1) 2 hours ago tender_stonebraker
edbca2f8c36e localhost/sync2jira:latest bash 2 hours ago Exited (1) 2 hours ago inspiring_bardeen
62f6d33070e0 registry.fedoraproject.org/fedora:29 /bin/bash 3 hours ago Exited (0) 3 hours ago festive_davinci
Alright. Can you run a container with --syslog and --log-level=debug both set, and paste anything that pops up in journalctl after the container exits?
Here is the output from podman run -it --rm --syslog --log-level=debug registry.fedoraproject.org/fedora:29 echo hello &> /tmp/output.txt:
Here is my journal from the time period of the run:
Absolutely nothing after Checking container $ID status...?
It appears that cleanup processes are not running, but I'm really not sure as to what would cause that. It's added as an atexit() and Conmon doesn't seem to be exiting due to an error.
Any SELinux AVCs?
It never logs after "Checking container $ID status", not even when it removes the container.
Wild guess, and I could get the whole "run" thing wrong: Cleanup starts here https://github.com/containers/libpod/blob/v1.1.2/cmd/podman/run.go#L144 and uses removeContainer to actually remove the container, but removeContainer has a log-less non-error exit here: https://github.com/containers/libpod/blob/v1.1.2/libpod/runtime_ctr.go#L257. Since the logs show that boltdb is used, this https://github.com/containers/libpod/blob/v1.1.2/libpod/boltdb_state.go#L445 is called, and for some reason it might return false. Could ~/.local/share/containers/storage/libpod/bolt_state.db be corrupted? Anything else would cause a log message at https://github.com/containers/libpod/blob/v1.1.2/cmd/podman/run.go#L145
BoltDB corruption would take down most all of Podman, so I doubt it's that.
Personal suspicion is that the exit command is either not running, or is running and not logging despite --syslog being specified.
Absolutely nothing after Checking container $ID status...?
not really:
Mar 14 15:41:30 media podman[22527]: time="2019-03-14T15:41:30-04:00" level=debug msg="Checking container f7d2bdb1b35e0871ebd694c47992974587e238af905830a7b5afe2e85f86d2ca status..."
Mar 14 15:41:35 media sudo[22428]: pam_unix(sudo:session): session closed for user root
Mar 14 15:41:35 media audit[22428]: USER_END pid=22428 uid=0 auid=1001 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/13 res=success'
Mar 14 15:41:35 media audit[22428]: CRED_DISP pid=22428 uid=0 auid=1001 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/13 res=success'
Mar 14 15:46:30 media gnome-terminal-[3671]: gnome-terminal-server has no capability of surrounding-text feature
The pam_unix(sudo:session): session closed for user root message was me pressing CTRL-C to get out of my sudo journalctl -f, and then there is a 5 minute delay until the next log message.
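A small sketch for checking whether podman logs anything after the "Checking container ... status" message in a journal excerpt. The function name lines_after_check is hypothetical. It assumes journal lines in the default short format shown above, with podman entries tagged like podman[PID]:.

```shell
#!/bin/sh
# lines_after_check: read journal text on stdin and print only the podman
# entries that appear after the last "Checking container <id> status" message.
# Prints nothing when podman logged nothing further (the case reported here).
lines_after_check() {
    awk '
        / podman\[[0-9]+\]: / && /Checking container [0-9a-f]+ status/ {
            seen = 1; n = 0; next      # restart collection at each status check
        }
        seen && / podman\[[0-9]+\]: / { buf[++n] = $0 }
        END { for (i = 1; i <= n; i++) print buf[i] }
    '
}
```

It could be fed with something like `sudo journalctl --since "10 minutes ago" | lines_after_check`. Empty output would be consistent with the cleanup process either not running or not logging.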
It appears that cleanup processes are not running, but I'm really not sure as to what would cause that. It's added as an atexit() and Conmon doesn't seem to be exiting due to an error.
Any SELinux AVCs?
Nope
Is this still an issue?
Doesn't seem like it:
[dustymabe@media ansible (master %=)]$ rpm -q podman
podman-1.2.0-2.git3bd528e.fc29.x86_64
[dustymabe@media ansible (master %=)]$
[dustymabe@media ansible (master %=)]$ podman run -it --rm registry.fedoraproject.org/fedora:29
[root@867a35d00bc0 /]# exit
exit
[dustymabe@media ansible (master %=)]$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[dustymabe@media ansible (master %=)]$
/kind bug
Description
--rm doesn't seem to remove containers after execution:
Output of podman version:
Output of podman info --debug: