containers / podman

Podman: A tool for managing OCI containers and pods.

kube play, in parallel, leaves stray untagged pause images #23292

Open edsantiago opened 2 months ago

edsantiago commented 2 months ago

Initial setup:

$ bin/podman images
REPOSITORY                    TAG                   IMAGE ID      CREATED         SIZE
                              20240123              1f6acd4c4a1d  5 months ago    11.8 MB
                              20240124              9479ce2eaa2d  5 months ago    149 MB

Create three kube files, and run three kubes in parallel:

$ for i in 1 2 3;do printf "apiVersion: v1\nkind: Pod\nmetadata:\n  name: foo$i\nspec:\n  containers:\n  - command:\n    - top\n    image:\n    name: fooctr$i\nstatus: {}\n" > foo$i.yaml;done

$ bin/podman kube play foo1.yaml & bin/podman kube play foo2.yaml & bin/podman kube play foo3.yaml
[2] 3587033
[3] 3587034
[2]  - done       bin/podman kube play foo1.yaml
[3]  + done       bin/podman kube play foo2.yaml

(Apparently) all three tried to build a pause image, one got tagged, the other two did not:

$ bin/podman images
REPOSITORY                    TAG                   IMAGE ID      CREATED        SIZE
<none>                        <none>                0d5444c11909  5 seconds ago  742 kB       <<< BAD
<none>                        <none>                f95539868d11  5 seconds ago  742 kB       <<< BAD
localhost/podman-pause        5.2.0-dev-1721130071  451008d17bce  5 seconds ago  742 kB
                              20240123              1f6acd4c4a1d  5 months ago   11.8 MB
                              20240124              9479ce2eaa2d  5 months ago   149 MB

One of those images is in use by one of the containers, and can't be deleted until the pod stops. The other is not in use and can be rmi'ed. This confuses me because buildPauseImage() returns an image name, not ID. But I'm not going to lose sleep over that.
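For anyone who wants to spot the strays in a saved listing, here is a minimal sketch that filters the untagged rows with awk. The listing is pasted from the output above (without the "<<< BAD" annotations), and the variable names are only for illustration. On a live system, `podman images --filter dangling=true` should list the same images directly, and `podman image prune` removes the ones that are not in use.

```shell
# Listing copied from the `podman images` output in this issue.
listing='REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
<none>  <none>  0d5444c11909  5 seconds ago  742 kB
<none>  <none>  f95539868d11  5 seconds ago  742 kB
localhost/podman-pause  5.2.0-dev-1721130071  451008d17bce  5 seconds ago  742 kB'

# Print the image ID (third field) of every row whose repository
# and tag columns are both "<none>".
stray_ids=$(printf '%s\n' "$listing" |
    awk '$1 == "<none>" && $2 == "<none>" { print $3 }')

printf '%s\n' "$stray_ids"
```

This only identifies the strays. As noted above, the one still in use by a container cannot be removed until its pod stops.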

Relevant code seems to be . I can't think of any way to fix this that doesn't involve locks, yuk. Hope y'all have better solutions.
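To make the lock idea concrete, here is a rough shell model of "check for the image, build it if missing" done under an exclusive lock. It is not podman code: the "image store" is a temp directory, the "build" just creates a file, and `ensure_pause_image`, the lock file, and the log file are all hypothetical names. It assumes `flock` from util-linux is installed.

```shell
set -eu

store=$(mktemp -d)           # hypothetical stand-in for the image store
lock="$store/pause.lock"     # hypothetical lock file
builds="$store/builds.log"   # gets one line per build that actually runs
: > "$builds"

ensure_pause_image() {
    # Hold the lock across BOTH the existence check and the build, so a
    # second caller waits, then finds the image and skips its own build.
    (
        flock -x 9
        if [ ! -e "$store/pause-image" ]; then
            sleep 1                          # pretend the build takes a while
            echo "built by caller $1" >> "$builds"
            touch "$store/pause-image"
        fi
    ) 9>"$lock"
}

# Three callers in parallel, mirroring the three `kube play` runs.
for i in 1 2 3; do
    ensure_pause_image "$i" &
done
wait

build_count=$(wc -l < "$builds" | tr -d ' ')
echo "builds: $build_count"
```

Without the `flock` line, all three callers can pass the existence check before any of them finishes, which is the situation described in this issue.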

Luap99 commented 2 months ago

Yeah, the builds happen in parallel as the image does not yet exist. But a tag can only point to one image, so the last built image gets the name, and later the code uses that image because it passes around the final name, not the image ID.
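A toy model of that "last build wins" behavior, in plain shell. The image IDs are the ones from the listing in this issue; the build order and the `build_and_tag` helper are assumptions for illustration, not how podman stores tags.

```shell
tagged=""     # the single image ID the tag currently points at
untagged=""   # image IDs that were built but lost the tag

build_and_tag() {
    # A tag holds exactly one ID, so tagging a new build
    # leaves the previous holder (if any) untagged.
    if [ -n "$tagged" ]; then
        untagged="${untagged:+$untagged }$tagged"
    fi
    tagged=$1
}

# Three parallel builds finish in some order (order assumed here).
build_and_tag 0d5444c11909
build_and_tag f95539868d11
build_and_tag 451008d17bce

echo "tagged:   $tagged"
echo "untagged: $untagged"
```

Since callers only receive the name, every later lookup by name resolves to whichever build finished last.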

Luap99 commented 2 months ago

As for how to fix it: I really hate the pause image. I really wanted to just use an overlay mount on the rootfs for catatonit instead of this "useless" extra image that no user really cares about and that gets rebuilt for each podman version.

I.e., see the last comments on the PR.

There were issues with that back then, but maybe it is worth trying that approach again?

Luap99 commented 2 months ago

I did a quick experiment:

This does work (of course it needs proper code cleanup). However, I am unsure whether this is the right way to go about it. There are several things to consider: if the host binary moves or is removed, the infra is permanently broken. The current approach of copying catatonit into the pause image seems to work better in that regard, as it is a full copy.
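The trade-off between referencing the host binary and copying it can be shown with a small analogy. This sketch uses a symlink as a stand-in for the mount, which is an assumption made purely for illustration (a real overlay or bind mount behaves differently in detail), and all file names are hypothetical.

```shell
set -eu

work=$(mktemp -d)
printf 'echo pause\n' > "$work/host-binary"       # stand-in for the host catatonit

ln -s "$work/host-binary" "$work/by-reference"    # depends on the host path staying put
cp "$work/host-binary" "$work/by-copy"            # independent full copy

mv "$work/host-binary" "$work/moved-elsewhere"    # the host binary moves

# Try to run each variant after the move.
if sh "$work/by-reference" >/dev/null 2>&1; then ref_ok=yes; else ref_ok=no; fi
if sh "$work/by-copy" >/dev/null 2>&1; then copy_ok=yes; else copy_ok=no; fi

echo "by reference: $ref_ok, by copy: $copy_ok"
```

The copy keeps working after the original moves, which is the property the pause image currently provides.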

github-actions[bot] commented 1 month ago

A friendly reminder that this issue had no activity for 30 days.