Open cavedon opened 1 week ago
Hi @cavedon, thanks for trying Sysbox and reporting the issue.
Your observation is correct: when a Sysbox container stops, Sysbox transfers data from /var/lib/sysbox/docker/<uuid> to the container's nominal rootfs on the host (usually /var/lib/docker/overlay2/uuid/...). It uses rsync to do the transfer efficiently. The data it transfers is basically the contents of the container's /var/lib/docker. It's trying to transfer that data to the container's rootfs, which is ideally where it should have been in the first place (but it can't be there during container runtime, as that results in overlayfs-on-overlayfs, which does not work). This way, if one ever does a docker commit to stop and capture the container state, the resulting container image will have the contents of /var/lib/docker in it, which in turn makes it possible to preload container images into Sysbox containers (more about this in the Sysbox docs).
The problem you are seeing typically occurs when the container's /var/lib/docker has a lot (GBs) of data. Is that the case?
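One rough way to answer that question is to total the file sizes under the directory. The following is only a minimal sketch: the helper name `dir_size_bytes` is made up for illustration, it reports apparent file sizes rather than disk blocks, and it counts hard-linked files once per link, so treat the result as an estimate.

```python
import os
import stat


def dir_size_bytes(root):
    """Sum the apparent sizes of regular files under root (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total
```

Run inside the container (as a user that can read the directory), `dir_size_bytes("/var/lib/docker") / 2**30` gives the size in GiB.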
The best way to work around it is to set this flag in sysbox-mgr to true: disable-inner-image-preload=true.
See here for how to do this.
Hope that helps!
Thank you! I will try disable-inner-image-preload=true as I do not need image preload.
The problem you are seeing occurs typically when the container's /var/lib/docker has a lot (GBs) of data. Is that the case?
I now have 25 GB in /var/lib/docker, but I do not think there was much when the issue occurred the first time; I had just cleaned up the system to triage that...
I have sysbox-ce running on Ubuntu jammy.
Sometimes I see a situation where the content of /var/lib/sysbox/docker is not cleaned up. For those containers I see 3 rsync processes stuck, and some errors in the docker logs about not being able to remove the container.
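For anyone who wants to check for the same symptom, here is a minimal sketch that lists rsync processes with their kernel state and the per-container directories still present under /var/lib/sysbox/docker. The function names are made up for illustration, it only reads the standard Linux /proc files (`comm`, `stat`, `cmdline`), and it does not by itself tell a stuck rsync from a slow one or a stale directory from one that belongs to a running container.

```python
import os


def find_processes(name, proc_root="/proc"):
    """Return (pid, state, cmdline) for each process whose command name equals `name`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "comm")) as f:
                comm = f.read().strip()
            if comm != name:
                continue
            with open(os.path.join(base, "stat")) as f:
                stat_line = f.read()
            with open(os.path.join(base, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited while we were reading it
        # The state letter is the first field after the ")" closing the command name.
        state = stat_line.rsplit(")", 1)[1].split()[0]
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        found.append((int(entry), state, cmdline))
    return sorted(found)


def list_container_dirs(sysbox_docker_dir="/var/lib/sysbox/docker"):
    """Return the names of the subdirectories currently present in sysbox_docker_dir."""
    try:
        with os.scandir(sysbox_docker_dir) as it:
            return sorted(e.name for e in it if e.is_dir(follow_symlinks=False))
    except OSError:
        return []  # directory missing or not readable
```

Calling `find_processes("rsync")` as root shows each rsync's state letter (for example `S` for sleeping or `D` for uninterruptible sleep) and its source and destination arguments; comparing `list_container_dirs()` with the IDs of running containers shows which directories were left behind.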
What I suspect is happening is that a request is made to stop the container. At that point Sysbox starts an rsync from /var/lib/sysbox/docker/XXX to /var/lib/docker/overlay2/YYY (is that true? What is the purpose of that rsync?). But because of the size of /var/lib/sysbox/docker/XXX, the rsync takes too long, Docker gives up waiting and deletes /var/lib/docker/overlay2/YYY, which causes rsync to get stuck (instead of crashing, why?) and the content of /var/lib/sysbox/docker/XXX not to be deleted.
Here are the relevant logs: