containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

kube play, in parallel, leaves stray untagged pause images #23292

Open edsantiago opened 2 months ago

edsantiago commented 2 months ago

Initial setup:

$ bin/podman images
REPOSITORY                    TAG                   IMAGE ID      CREATED         SIZE
quay.io/libpod/testimage      20240123              1f6acd4c4a1d  5 months ago    11.8 MB
quay.io/libpod/systemd-image  20240124              9479ce2eaa2d  5 months ago    149 MB

Create three kube files, and run three kubes in parallel:

$ for i in 1 2 3;do printf "apiVersion: v1\nkind: Pod\nmetadata:\n  name: foo$i\nspec:\n  containers:\n  - command:\n    - top\n    image: quay.io/libpod/testimage:20240123\n    name: fooctr$i\nstatus: {}\n" > foo$i.yaml;done

$ bin/podman kube play foo1.yaml & bin/podman kube play foo2.yaml & bin/podman kube play foo3.yaml
[2] 3587033
[3] 3587034
Pod:
516b243e9ee83dc8eee7e2273f6a4a1c8fb1c97a6bed46dbb00d063d7b649782
Container:
a7d3e67ef42012356dd22aaadd4e4044c77ca199974ca5522bf19af472f28f64
Pod:
773b0cdedc4ea0a7b6a7c5a6602f3b055380e2d4f7678e7a225c7f1d0213512a
Container:
5877a323b87695e85f65b2220d8009f853cb46c977c9ae9a164211e87a65902a
[2]  - done       bin/podman kube play foo1.yaml
Pod:
7aa9abb628f9c337936fb2074253b41ac8b5da61abe1ce0d6ac215128e29d2cf
Container:
47d4a47787470c87d6a73899068e9f85c660aa56c501470a394ea92a8143d4e5
[3]  + done       bin/podman kube play foo2.yaml

(Apparently) all three tried to build a pause image; one got tagged, the other two did not:

$ bin/podman images
REPOSITORY                    TAG                   IMAGE ID      CREATED        SIZE
<none>                        <none>                0d5444c11909  5 seconds ago  742 kB       <<< BAD
<none>                        <none>                f95539868d11  5 seconds ago  742 kB       <<< BAD
localhost/podman-pause        5.2.0-dev-1721130071  451008d17bce  5 seconds ago  742 kB
quay.io/libpod/testimage      20240123              1f6acd4c4a1d  5 months ago   11.8 MB
quay.io/libpod/systemd-image  20240124              9479ce2eaa2d  5 months ago   149 MB

One of those images is in use by one of the containers, and can't be deleted until the pod stops. The other is not in use and can be rmi'ed. This confuses me because buildPauseImage() returns an image name, not ID. But I'm not going to lose sleep over that.
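For anyone trying to clean up after hitting this, here is a minimal sketch of finding the stray images and removing only the ones nothing is using. It assumes the untagged pause images are reported as dangling, which matches the listing above but is not verified against the image store code. The function names and the `PODMAN` variable are made up for illustration.

```shell
# Path to the podman binary; the report above uses a local build at bin/podman.
PODMAN=${PODMAN:-bin/podman}

# Print the IDs of untagged (dangling) images, one per line.
list_dangling() {
    "$PODMAN" images --filter dangling=true --format '{{.ID}}'
}

# Try a plain (non-forced) rmi on each dangling image. An image that is
# still used by a container fails to delete and is reported as kept.
remove_unused_dangling() {
    for id in $(list_dangling); do
        if "$PODMAN" rmi "$id" >/dev/null 2>&1; then
            echo "removed $id"
        else
            echo "in use, kept $id"
        fi
    done
}
```

Calling `remove_unused_dangling` deliberately does not pass `--force`, so the image backing a running pod is left alone until that pod stops.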

Relevant code seems to be https://github.com/containers/podman/blob/main/pkg/specgen/generate/pause_image.go . I can't think of any way to fix this that doesn't involve locks, yuk. Hope y'all have better solutions.
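To illustrate the lock idea from the caller's side only, here is a rough sketch using `flock(1)` from util-linux. It is not a proposed fix for podman itself, just a wrapper that makes concurrent invocations take turns. The helper name `with_lock` and the lock file path in the usage comment are hypothetical.

```shell
# Run a command while holding an exclusive lock on the given lock file.
# Usage: with_lock LOCKFILE COMMAND [ARGS...]
with_lock() {
    lockfile=$1
    shift
    (
        # Block until file descriptor 9 (opened on the lock file below)
        # can be locked exclusively; the lock is released when the
        # subshell exits and the descriptor is closed.
        flock -x 9 || exit 1
        "$@"
    ) 9>"$lockfile"
}

# Example (hypothetical lock path):
#   with_lock /tmp/podman-pause.lock bin/podman kube play foo1.yaml &
#   with_lock /tmp/podman-pause.lock bin/podman kube play foo2.yaml &
#   with_lock /tmp/podman-pause.lock bin/podman kube play foo3.yaml
```

This serializes the whole `kube play`, not just the pause image build, so it gives up the parallelism. It only shows that a check-then-build step done under a lock happens once.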

Luap99 commented 2 months ago

Yeah, the builds happen in parallel because the image does not exist yet. But a tag can only point to one image, so the last image built gets the name, and the later code then uses that one, since it passes around the final name and not the image ID.
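One way to see this on a live system is to print, for each pod, the image ID its infra container was actually created from, and compare that with the IDs in `podman images`. This is a sketch only: the `InfraContainerID` and `Image` inspect fields are assumed from current podman output and should be checked against the installed version, and the function name is made up.

```shell
# Path to the podman binary; the report above uses a local build at bin/podman.
PODMAN=${PODMAN:-bin/podman}

# For every pod, print: pod name, infra container ID, and the ID of the
# image that infra container was created from.
infra_images() {
    for pod in $("$PODMAN" pod ps --format '{{.Name}}'); do
        infra=$("$PODMAN" pod inspect --format '{{.InfraContainerID}}' "$pod")
        image=$("$PODMAN" container inspect --format '{{.Image}}' "$infra")
        echo "$pod $infra $image"
    done
}
```

If the explanation above holds, at most one pod should show an untagged image ID here: the one whose infra container was created before the tag moved to a later build.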

Luap99 commented 2 months ago

As for how to fix it: I really hate the pause image. I really wanted to just use an overlay mount on the rootfs for catatonit instead of this "useless" extra image that no user really cares about and that gets rebuilt for each podman version.

i.e., see the last comments on the PR https://github.com/containers/podman/pull/11956#issuecomment-952770024

There were issues with that back then, but maybe it is worth trying that approach again?

Luap99 commented 2 months ago

I did a quick experiment: https://github.com/containers/podman/commit/16f67febbcf1bad710acb5f579ed5852f4866227

This does work (of course it needs proper code cleanup). However, I am unsure if this is the right way to go about it. There are several things to consider: if the host binary moves or is removed, the infra container is permanently broken. The current approach of copying catatonit into the pause image seems to work better in that regard, as it is a full copy.

github-actions[bot] commented 1 month ago

A friendly reminder that this issue had no activity for 30 days.