jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.54k stars 388 forks source link

Remove imageCleaner.host? #1584

Open consideRatio opened 1 year ago

consideRatio commented 1 year ago

The binderhub helm chart's imageCleaner configuration relates to a DaemonSet that starts one pod per k8s node. Currently, one or two containers can be started. One for host and one for dind. This issue is meant to discuss the removal of the option to start one container for host.

Motivation

Related

minrk commented 1 year ago

What happens if we use host docker and not dind (or podman) for build? Does that mean we can't clean build images at all? I think k8s only cleans its own images (not sure), which would mean this would guarantee that a host's disk will fill up and unused builds never get deleted.

I think it doesn't make sense to run the host docker cleaner when dind is in use, but I think we need to clean our images on the host if we are running builds with the host docker (unless k8s also deletes images it's not responsible for).

I think ~everyone uses dind, so if we want to drop support for using host docker entirely, that's probably okay with me, but they should probably go together. I think it might be a problem for local testing, since you'll need a registry somewhere to get images from the build to the host, which you don't need with host docker, but image cleaning also isn't a big deal for local testing, I think.

sgaist commented 1 year ago

From what I understand from the k8s garbage collection documentation, reaching a certain (configurable) thresholds, kubelet will start removing unused images/containers/etc. from oldest to newest. These thresholds might be different from the ones set for the image-cleaner.

Since k8s 1.24 there is a new use case: docker is not the default anymore to handle cluster related stuff. Note that it's possible to have a mix of Docker and non-Docker enabled nodes in the same cluster.

This means that for a BinderHub deployment that would use the host Docker, we have now two options: 1) the administrator continues to use docker for k8s and thus the image-cleaner does exactly what the k8s documentation recommends not to 2) docker is used as image builder while the node uses something else like simple containerd to handle k8s workload.

In this latter case, the image-cleaner makes sense to keep and deploy as there won't be any garbage collection done through k8s.

Based on your last paragraph, I think its unlikely that will have that kind of scenario (host docker installed just for building images for BinderHub).

One thing we could do is to require explicit enabling of the host image-cleaner rather than the current explicit disabling. That would allow to keep the host cleaning support while doing nothing by default.

One thing that I may have missed from the current implementation but I don't think that the image-cleaner does BinderHub specific cleanups, or does it ?