containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Feature Request - Pin Podman Volumes to be Ignored when Pruning #23217

Open Nitrousoxide opened 4 months ago

Nitrousoxide commented 4 months ago

Feature request description

Podman volume prune currently removes all volumes that aren't in use by an existing container. However, a way to change this default prune behavior for known-important volumes, which might be temporarily orphaned because their parent container is gone for maintenance, would be a nice addition. Something like ostree's "pin" feature for deployments, which would mark a volume as exempt from any prune command, would be useful.

I know I personally rarely use podman volumes for fear of accidentally blowing them up with a prune, opting instead to define paths to mount in my containers. I would be much more likely to use podman volumes if I could change the default prune behavior for ones I care about.

Pinning would also let you safely use podman volumes as persistent storage for containers that don't need to be spun up and running all the time. For instance, containers you run with the --rm flag because they are one-off runs that don't need to persist as containers, but which might still benefit from some persistent storage for one reason or another.

Suggest potential solution

Being able to perform an action like podman volume pin $VOLUMEID on existing volumes, and being able to pin volumes at creation through a pin flag (podman volume create --pin), would make this a simple task for the end user. Since podman volume pin $VOLUMEID would toggle the pin flag on and off, pinned volumes could also be destroyed later if they are no longer needed.

I could also see the addition of an affirmative flag like podman volume prune --include-pins to ignore pinning when pruning.

Whether a volume is pinned would likely also need to be shown in the podman volume list output.
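To make the proposed semantics concrete, here is a minimal Python model of how pin-aware pruning could behave. None of this is real Podman code or API; the pin flag, the toggle, and the --include-pins option are all hypothetical, taken from the suggestion above:

```python
# Illustrative model of the proposed pin-aware prune semantics.
# This only mirrors the feature request; it does not call Podman.

def toggle_pin(volumes, name):
    """Model of a hypothetical `podman volume pin $VOLUMEID` toggle."""
    volumes[name]["pinned"] = not volumes[name]["pinned"]

def prune(volumes, include_pins=False):
    """Remove unused volumes, skipping pinned ones unless include_pins
    is set (the hypothetical `podman volume prune --include-pins`)."""
    removed = [
        name for name, v in volumes.items()
        if not v["in_use"] and (include_pins or v["pinned"] is False)
    ]
    for name in removed:
        del volumes[name]
    return removed

volumes = {
    "scratch":   {"in_use": False, "pinned": False},
    "important": {"in_use": False, "pinned": True},   # orphaned but pinned
}
print(prune(volumes))                     # → ['scratch']
print(prune(volumes, include_pins=True))  # → ['important']
```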

Have you considered any alternatives?

You can get partway there with volume labels: prune can exclude volumes by label, so careful application of labels to the volumes you care about can ensure the prune command ignores them. However, those volumes are still one inadvertent prune command without the label exclusion (or one misspelled label) away from destruction. Being able to change the default prune behavior per volume would avoid this.
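As a rough illustration of that fragility, the label workaround only protects volumes whose label exactly matches the exclusion filter. A small Python model of a negated label filter (in the spirit of `--filter 'label!=keep'`; this is a simulation, not a Podman call), showing how one typo destroys the volume:

```python
# Simulates pruning that excludes volumes carrying a "keep" label.
# Volumes whose labels include the exclusion key survive; all others go.

def prune_excluding_label(volumes, exclude_key="keep"):
    """volumes maps volume name -> dict of labels."""
    removed = [name for name, labels in volumes.items()
               if exclude_key not in labels]
    for name in removed:
        del volumes[name]
    return removed

volumes = {
    "good": {"keep": "true"},   # correctly labeled, survives the prune
    "oops": {"kep": "true"},    # one misspelled label, gets destroyed
}
print(prune_excluding_label(volumes))  # → ['oops']
```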

Another possible alternative would be letting the user list, in the /etc/containers/storage.conf file, volumes (or filters) to ignore when pruning. This would let them change the default prune behavior without needing the whole workflow suggested above. However, implementing it through a config file change would be unintuitive for most users and would likely see little use.
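For illustration, such a config-based approach might look something like the fragment below. These keys are purely hypothetical; storage.conf has no such option today:

```toml
# Hypothetical additions to /etc/containers/storage.conf -- not a real option.
[storage.options]
# Volumes that `podman volume prune` would never touch:
prune_exclude_volumes = ["important-data", "postgres-backup"]
```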

Additional context

No response

Luap99 commented 4 months ago

We were talking about "pinning" in the context of container image updates in bootc recently, so this is not the first time the concept of pinning has come up, even though you are talking about volumes here. Thinking further, one could make a similar case for containers and maybe other resources.

So maybe we should aim for something more generic here.

cc @baude @mheon @rhatdan @vrothberg @cgwalters

vrothberg commented 4 months ago

Thanks for the ping, @Luap99!

Having a more generic way of "pinning" objects seems like a good idea: Containers, Pods, Images, Volumes, Networks, etc. Looking at the different workflows, it may be worth looking into both a static approach via config and a dynamic approach via the CLI.

mheon commented 4 months ago

I really like this idea. One of the remaining questions in my mind would be around deliberate removal. Pinning should obviously prevent removal during pruning, but what about podman rm or similar? Personally, I'd make it so that rm --force would still remove the container/image/volume/pod, while a regular removal would not.
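A minimal sketch of that proposed removal behavior (the names and error are hypothetical; this only models the semantics described above, where a plain removal refuses pinned objects and --force overrides the pin):

```python
# Models the suggestion: plain removal refuses pinned objects,
# while a forced removal (`rm --force`) overrides the pin.

class PinnedError(Exception):
    pass

def rm(objects, name, force=False):
    """Remove an object unless it is pinned and force is not set."""
    if objects[name]["pinned"] and not force:
        raise PinnedError(f"{name} is pinned; use --force to remove it")
    del objects[name]

objects = {"myvol": {"pinned": True}}
try:
    rm(objects, "myvol")              # refused: the volume is pinned
except PinnedError as e:
    print(e)
rm(objects, "myvol", force=True)      # forced removal succeeds
print(objects)                        # → {}
```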

cgwalters commented 4 months ago

> temporarily orphaned due to their parent container being gone for maintenance would be a nice addition.

I would elaborate and expand on the intersection of this and https://github.com/containers/bootc/issues/128 (and all the related threads like https://github.com/containers/podman/pull/23065 and https://github.com/containers/podman/issues/22785 etc.)

A default expectation with logically bound bootc images is that the .container files are the source of truth for the container configuration.

We were focusing on the underlying images, but it's equally important to ensure that any referenced volumes are pinned iff¹ there is a container referencing them.

The most recent thinking on the bootc side has changed: we are planning to have bootc use a "symlink directory" of .container files (or .image files). With this, it would become natural for the bootc design to expand to also handle volumes, or in the fully general case, pods and all API objects they may reference.

> Personally, I'd make it so that rm --force would still remove the container/image/volume/pod, while a regular removal would not.

For a container referencing a volume, would such a thing forcibly kill/remove the container or just the volume (and if the latter, how would that work?)

¹ https://simple.wikipedia.org/wiki/If_and_only_if

mheon commented 4 months ago

Libpod does not allow dangling references; any operation that removes an object with dependencies (e.g. an image with associated containers, a pod with containers in it, a volume with containers using it) must remove associated dependents as well. This is the current behavior of our --force flag (which means that podman rm --force, an operation that normally only affects containers, can remove an entire pod, and all containers in that pod, if you call it on the pod's infra container).
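That no-dangling-references rule can be sketched as a toy model (not Libpod code): removing an object with dependents fails unless forced, and a forced removal cascades to the dependents as well.

```python
# Toy model of Libpod's rule: no dangling references allowed. An object
# with dependents can only be removed with force, which removes the
# dependents along with it.

class InUseError(Exception):
    pass

def remove(graph, name, force=False):
    """graph maps object name -> list of dependent object names.
    Returns the list of removed objects, dependents first."""
    dependents = graph.get(name, [])
    if dependents and not force:
        raise InUseError(f"{name} has dependents: {dependents}")
    removed = []
    for dep in dependents:
        removed += remove(graph, dep, force=True)
    graph.pop(name, None)
    return removed + [name]

# A volume used by two containers: forced removal takes all three.
graph = {"myvol": ["ctr-a", "ctr-b"], "ctr-a": [], "ctr-b": []}
print(remove(graph, "myvol", force=True))  # → ['ctr-a', 'ctr-b', 'myvol']
```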

mheon commented 4 months ago

On the .container bit: we've kept Quadlet rather separate from Libpod in the current implementation, so giving Libpod enough knowledge of the parsing to know that a .container file that has not yet created a container is going to use a specific volume would be a real technical problem. I suppose we could handle this as a Quadlet problem: pin the volume when the .container is read, and unpin it once the unit is unloaded from systemd?
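That Quadlet-side lifecycle could be sketched as reference counting (a hypothetical model, not Quadlet code): a volume stays pinned as long as at least one loaded unit references it.

```python
# Hypothetical model of pinning at the Quadlet layer: a volume is pinned
# while any loaded .container unit references it, and released once the
# last such unit is unloaded from systemd.

def load_unit(pins, unit, volume):
    """Record that a loaded unit references a volume (pins it)."""
    pins.setdefault(volume, set()).add(unit)

def unload_unit(pins, unit, volume):
    """Drop a unit's reference; the volume unpins when none remain."""
    pins.get(volume, set()).discard(unit)

def is_pinned(pins, volume):
    return bool(pins.get(volume))

pins = {}
load_unit(pins, "db.container", "dbdata")
print(is_pinned(pins, "dbdata"))   # → True
unload_unit(pins, "db.container", "dbdata")
print(is_pinned(pins, "dbdata"))   # → False
```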

github-actions[bot] commented 3 months ago

A friendly reminder that this issue had no activity for 30 days.