Closed: jskov-jyskebank-dk closed this issue 3 years ago
This is probably the rootless pause process. It keeps the rootless user namespace alive for future Podman commands to use.
On Thu, Dec 17, 2020 at 03:16 Jesper Skov notifications@github.com wrote:
/kind bug
Description
When I run a simple podman command inside a container (i.e. no systemd), a podman sub-process is left running (sleeping).
(this is probably not a bug as such, but just surprising behavior)
Steps to reproduce the issue:
- I run a container on OpenShift 4.4 which launches tini and a python server (just to keep the Pod running).
- Run ps auxww to show live processes.
- Run podman images (process 191 in the below output)
- Run ps auxww (shows podman process 204 in the below output)
- If a new podman images command is run, process 204 remains (does not spawn a new lingering process)
Describe the results you received:
It looks like this in the console:
sh-5.0$ ps auxww
USER     PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
podman     1  0.0  0.0  2344   736 ?     Ss   08:25 0:00 /tini -- python3 -m http.server
podman     7  0.0  0.0 21136 15752 ?     S    08:25 0:00 python3 -m http.server
podman     8  0.0  0.0  4788  4172 pts/0 Ss   08:25 0:00 sh
podman   190  0.0  0.0  6736  3200 pts/0 R+   08:46 0:00 ps auxww
sh-5.0$ podman --events-backend none images &
[1] 191
sh-5.0$ REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
[1]+  Done    podman --events-backend none images
sh-5.0$ ps auxww
USER     PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
podman     1  0.0  0.0  2344   736 ?     Ss   08:25 0:00 /tini -- python3 -m http.server
podman     7  0.0  0.0 21136 15752 ?     S    08:25 0:00 python3 -m http.server
podman     8  0.0  0.0  4788  4172 pts/0 Ss   08:25 0:00 sh
podman   204  0.0  0.0 48868 23316 ?     S    08:46 0:00 podman
podman   217  0.0  0.0  6736  3100 pts/0 R+   08:47 0:00 ps auxww
sh-5.0$ podman --events-backend none images &
[1] 218
sh-5.0$ REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
[1]+  Done    podman --events-backend none images
sh-5.0$ ps auxww
USER     PID %CPU %MEM   VSZ   RSS TTY   STAT START TIME COMMAND
podman     1  0.0  0.0  2344   736 ?     Ss   08:25 0:00 /tini -- python3 -m http.server
podman     7  0.0  0.0 21136 15752 ?     S    08:25 0:00 python3 -m http.server
podman     8  0.0  0.0  4788  4172 pts/0 Ss   08:25 0:00 sh
podman   204  0.0  0.0 48868 23316 ?     S    08:46 0:00 podman
podman   239  0.0  0.0  6736  3176 pts/0 R+   08:47 0:00 ps auxww
Describe the results you expected:
I would expect no lingering sub-processes from podman, after it completes the list.
I wonder if it is related to the "systemd lingering" support, or maybe the server mode? It does not appear related to the API service, as this needs to be started explicitly.
Since the second invocation does not create a new subprocess, I assume this is intentional. But then it should maybe be mentioned in the docs/troubleshooting guide, and/or be controllable with an option.
Output of podman version:
Version:      2.2.1
API Version:  2.1.0
Go Version:   go1.15.5
Built:        Tue Dec  8 15:37:50 2020
OS/Arch:      linux/amd64
Output of podman info --debug:
host:
  arch: amd64
  buildahVersion: 1.18.0
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.21-3.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 0f53fb68333bdead5fe4dc5175703e22cf9882ab'
  cpus: 4
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: podman-test-749877c57f-p9cfw
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
  kernel: 4.18.0-193.28.1.el8_2.x86_64
  linkmode: dynamic
  memFree: 2864472064
  memTotal: 33724624896
  ociRuntime:
    name: runc
    package: runc-1.0.0-279.dev.gitdedadbf.fc33.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc92+dev
      commit: c9a9ce0286785bef3f3c3c87cd1232e535a03e15
      spec: 1.0.2-dev
  os: linux
  remoteSocket:
    path: /tmp/podman-run-1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.8-1.fc33.x86_64
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 0
  swapTotal: 0
  uptime: 730h 26m 11.3s (Approximately 30.42 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/podman/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/podman/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 0
  runRoot: /tmp/containers-user-1000/containers
  volumePath: /home/podman/.local/share/containers/storage/volumes
version:
  APIVersion: 2.1.0
  Built: 1607438270
  BuiltTime: Tue Dec  8 15:37:50 2020
  GitCommit: ""
  GoVersion: go1.15.5
  OsArch: linux/amd64
  Version: 2.2.1
Package info (e.g. output of rpm -q podman or apt list podman):
podman-2.2.1-1.fc33.x86_64
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
This is expected, we leave a process running that just holds the namespaces open. This helps us avoid race conditions where you are running lots of containers at the same time.
Thanks!
is there a way to control this behaviour? just encountered a bug in a systemd service which refused to restart because it thought this podman pause daemon thing was still alive
podman system migrate will kill it, I believe.
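For reference, a hedged sketch of inspecting and stopping the pause process yourself. The PID-file path below is an assumption based on rootless Podman 2.x's runtime directory layout (check your own runtime dir); `podman system migrate` is the supported way to stop the process, and only safe once all containers are stopped.

```shell
# Locate the rootless pause process PID file. Path is an assumption
# (rootless Podman 2.x layout); verify it on your system.
pause_pid_file="${XDG_RUNTIME_DIR:-/tmp/podman-run-$(id -u)}/libpod/tmp/pause.pid"

if [ -f "$pause_pid_file" ]; then
    echo "pause process PID: $(cat "$pause_pid_file")"
fi

# Supported way to stop it (only once all containers are stopped):
# podman system migrate
```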
@giuseppe thoughts on allowing podman to automatically kill it when it exits?
is there a way to control this behaviour? just encountered a bug in a systemd service which refused to restart because it thought this podman pause daemon thing was still alive
@asottile the podman pause process should be living in its own scope. Do you have a reproducer? What version are you using?
I can try and get you a more isolated / reproducible example in the morning -- here's an example from a nonprod host:
$ podman version
Version: 2.2.1
API Version: 2.1.0
Go Version: go1.15.2
Built: Thu Jan 1 00:00:00 1970
OS/Arch: linux/amd64
$ systemctl status pre-commit-ci-worker.service
● pre-commit-ci-worker.service - pre-commit-ci queue worker
Loaded: loaded (/etc/systemd/system/pre-commit-ci-worker.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2021-01-06 21:34:50 UTC; 4 days ago
Main PID: 8332 (python3)
Tasks: 2 (limit: 1160)
Memory: 69.8M
CGroup: /system.slice/pre-commit-ci-worker.service
├─8332 /opt/pre-commit-ci/venv/bin/python3 -u -m runner.worker
└─8447 podman
<< snipped output>>
$ sudo kill -INT 8332 # simulate a termination
$ systemctl status pre-commit-ci-worker.service
● pre-commit-ci-worker.service - pre-commit-ci queue worker
Loaded: loaded (/etc/systemd/system/pre-commit-ci-worker.service; enabled; vendor preset: enabled)
Active: deactivating (stop-sigterm) (Result: exit-code) since Mon 2021-01-11 09:45:06 UTC; 4s ago
Process: 8332 ExecStart=/opt/pre-commit-ci/venv/bin/python3 -u -m runner.worker (code=exited, status=1/FAILURE)
Main PID: 8332 (code=exited, status=1/FAILURE)
Tasks: 1 (limit: 1160)
Memory: 30.2M
CGroup: /system.slice/pre-commit-ci-worker.service
└─8447 podman
I guess the part I'm most confused on is I thought podman was supposed to be daemonless?
Well, it is, but it needs a process to monitor the container. The daemonless part is that no two containers are related; i.e., you can start a container from scratch without talking to a daemon. While a container is running, there are several processes running. For example, in rootless mode there is the conmon process monitoring the pid 1 of the container, slirp4netns setting up its network, and fuse-overlayfs setting up its mounts. Then there are the processes within the container. Of course the podman executable is also running, unless you run the container in --detach mode.
When the container exits, all of these support processes go away.
When the container exits, all of these support processes go away.
that doesn't appear to be the case; even a podman ps seems to spawn a lingering daemon:
$ ps -ef | grep '[p]odman'
$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ ps -ef | grep '[p]odman'
asottile 16208 2007 1 09:06 ? 00:00:00 podman
That is the pause process. It's not really a daemon, because it does no processing: once launched, it immediately goes to sleep and takes no further action. It keeps the rootless user namespace alive for other rootless Podman processes; subsequent invocations of rootless Podman by that user will join that process's namespaces.
The pause process can be safely killed once all containers are stopped (if it is killed while containers are still running, pods and other shared-namespace containers may stop working properly), but Podman can't easily do this itself: there is a serious race condition if a fresh Podman process launches while the pause process is being stopped, and we have not figured out how to address it.
is there a consistent way to kill that process? can I configure podman such that it is killed by podman upon exit if I can guarantee that I don't spawn containers in parallel?
I don't believe we expose the PID anywhere that's easy to access. @giuseppe @rhatdan Should we consider including it in podman info?
We should also discuss methods for automatically shutting the process down at the cabal meeting - there has to be a locking solution we can use to guarantee we don't race.
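One shape such a locking solution could take (purely illustrative, not Podman's actual mechanism): both the process stopping the pause process and any freshly launched podman take the same advisory file lock, so a starter either joins the namespace before the teardown or re-creates it afterwards. A sketch using flock(1) with a throwaway, hypothetical lock path:

```shell
# Illustrative only: advisory lock serializing pause-process teardown
# against new podman startups. The lock file path is hypothetical.
lockfile=$(mktemp)

exec 9>"$lockfile"   # open the lock file on fd 9
flock -x 9           # blocks until no other holder (e.g. a starting podman)
echo "lock held: safe to stop the pause process here"
flock -u 9           # release the lock
exec 9>&-            # close the fd
rm -f "$lockfile"
```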
Here's the hacky workaround I'm using so that the lingering process is not part of my main service:
[Unit]
Description = hack for https://github.com/containers/podman/issues/8759
[Service]
User = whateveruser
Group = whateveruser
Type = notify
NotifyAccess = all
ExecStart = bash -c 'podman ps && systemd-notify --ready && exec sleep infinity'
[Install]
WantedBy=multi-user.target
and then my other service uses
[Unit]
Description = ...
Wants = dummy-podman.service
After = dummy-podman.service
# ...
should this issue be reopened until there's a better solution (or is there somewhere I can track that's better than this issue?)
I think we always move the pause process into its own cgroup.
Can you run with debug messages and check if you have any messages like Failed to add pause process to systemd sandbox cgroup?
I'm not super experienced with the tech involved here -- is that an option to podman? the systemd service?
it is an option to podman:
podman --log-level debug ...
here's the last few lines:
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=info msg="[graphdriver] using prior storage driver: overlay"
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=debug msg="Initializing event backend journald"
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=debug msg="using runtime \"/usr/lib/cri-o-runc/sbin/runc\""
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=debug msg="using runtime \"/usr/bin/crun\""
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=warning msg="Error initializing configured OCI runtime kata: no valid executable found for OCI runtime kata: invalid argument"
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=info msg="Setting parallel job count to 4"
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=warning msg="Failed to add podman to systemd sandbox cgroup: exec: \"dbus-launch\": executable file not found in $PATH"
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=debug msg="Called ps.PersistentPostRunE(podman --log-level debug ps)"
Feb 03 07:47:16 ip-172-31-67-194 bash[1607]: time="2021-02-03T07:47:16Z" level=debug msg="Failed to add pause process to systemd sandbox cgroup: <nil>"
Feb 03 07:47:16 ip-172-31-67-194 systemd[1]: Started hack for https://github.com/containers/podman/issues/8759.
Feb 03 07:47:16 ip-172-31-67-194 bash[1623]: time="2021-02-03T07:47:16Z" level=warning msg="Failed to add podman to systemd sandbox cgroup: exec: \"dbus-launch\": executable file not found in $PATH"
the dbus-launch program is missing on your system. I think that is the reason why it is not moved to its own scope.
dbus-launch is provided by dbus-x11 on ubuntu -- I don't think I should need X11 to run podman
even with that installed it still does not work:
Feb 03 08:06:01 ip-172-31-67-194 bash[2227]: time="2021-02-03T08:06:01Z" level=warning msg="Error initializing configured OCI runtime kata: no valid executable found for OCI runtime kata: invalid argument"
Feb 03 08:06:01 ip-172-31-67-194 bash[2227]: time="2021-02-03T08:06:01Z" level=info msg="Setting parallel job count to 4"
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2243]: [session uid=0 pid=2241] AppArmor D-Bus mediation is enabled
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2243]: Cannot setup inotify for '/root/.local/share/dbus-1/services'; error 'Permission denied'
Feb 03 08:06:01 ip-172-31-67-194 bash[2227]: time="2021-02-03T08:06:01Z" level=warning msg="Failed to add podman to systemd sandbox cgroup: dbus: authentication failed"
Feb 03 08:06:01 ip-172-31-67-194 bash[2227]: CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Feb 03 08:06:01 ip-172-31-67-194 bash[2227]: time="2021-02-03T08:06:01Z" level=debug msg="Called ps.PersistentPostRunE(podman --log-level debug ps)"
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2248]: [session uid=2000 pid=2246] AppArmor D-Bus mediation is enabled
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2248]: [session uid=2000 pid=2246] Activating service name='org.freedesktop.systemd1' requested by ':1.0' (uid=2000 pid=2221 comm="podman --log-level debug ps " label="unconfined")
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2248]: [session uid=2000 pid=2246] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2248]: [session uid=2000 pid=2246] Activating service name='org.freedesktop.systemd1' requested by ':1.0' (uid=2000 pid=2221 comm="podman --log-level debug ps " label="unconfined")
Feb 03 08:06:01 ip-172-31-67-194 dbus-daemon[2248]: [session uid=2000 pid=2246] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1
Feb 03 08:06:01 ip-172-31-67-194 bash[2221]: time="2021-02-03T08:06:01Z" level=debug msg="Failed to add pause process to systemd sandbox cgroup: <nil>"
Feb 03 08:06:01 ip-172-31-67-194 systemd[1]: Started hack for https://github.com/containers/podman/issues/8759.