Open gregewing opened 10 months ago
If I build the image using 'docker build' then the process seems to leave behind some changes to how docker exposes cgroups to containers.
What changes are these? What is the difference between the created images, and were the changes made by the builder or by a process that you ran during the build?
There are no apparent changes to the image; the changes appear to be in the docker instance on the host, something to do with the way access to cgroups is passed through to the running container and any future containers. I don't have specifics, only the apparent difference that I describe in my initial post.
I'm beginning to wonder if perhaps it's something to do with AppArmor, but I have tested this: removing all AppArmor policies (aa-teardown) did nothing to allow the vagrant box image to start, while running a 'docker build' immediately afterwards allowed the vagrant box image to start up correctly.
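Roughly, the elimination test looked like this (commands reconstructed from memory; the 'scratch-test' tag is just a placeholder):

```
# unload every AppArmor profile on the host
sudo aa-teardown

# inside the already-running container: still fails to start the box
vagrant up

# on the host, any build with the legacy builder
docker build -t scratch-test .

# inside the container again: now the box starts
vagrant up
```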
There are no apparent changes to the image, the changes appear to be in the docker instance on the host,
Are you saying
Yes and No.
I run the image on a couple of different hosts, but for simplicity let's focus on a single host.
I build the image, and I run it on the same host. I reboot the host (because it's a workstation) and when I do, it 'resets' to what I assume is the correct configuration.
After the host is rebooted and the configuration state is reset, if I run the vagrant image, the container itself starts up correctly, but the process of starting up vagrant box images inside the docker container fails. It will continue to fail until I run a 'docker build' in a separate terminal window (with or without the image running). It can be any 'docker build' activity for any Dockerfile, but not a 'docker buildx'.
After that, I can continue to create and delete containers based on local or remotely pulled images, and start vagrant box images within them successfully as many times as I want, until I reboot the host.
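The cycle, step by step, is roughly this (the container name 'winlin', the build context path, and the run options are illustrative only; my real run line has more options):

```
# 1. reboot the host, then start the container
docker run -d --name winlin gregewing/windows_on_linux:latest

# 2. try to bring a vagrant box up inside it -- this fails
docker exec -it winlin vagrant up

# 3. in another terminal, run any 'docker build' against any Dockerfile
docker build -t unrelated-test /path/to/any/build/context

# 4. retry inside the container -- this now succeeds, and keeps
#    succeeding until the host is rebooted again
docker exec -it winlin vagrant up
```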
If I use 'docker buildx' instead of 'docker build' then things are much more stable: I am unable to start any of the vagrant box images I try inside the docker containers I create. This is a good thing, because it's consistent. As a result, I have realised that I need to include "--cgroupns host" in the docker run command in order to have the container work properly when I want to use it to start vagrant box images inside it. So I have a workaround, but I wanted to bring the odd behaviour to your attention.
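For reference, the workaround is just adding that flag to the run line, along these lines (container name is a placeholder and other options are omitted):

```
# share the host's cgroup namespace with the container
docker run -d --name winlin --cgroupns host gregewing/windows_on_linux:latest
```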
It does not make a difference whether I use the locally cached image (before or after it is pushed to the Docker Hub registry) or a version pulled from the registry. Incidentally, the image is on hub.docker.com as gregewing/windows_on_linux:latest. As a quick test I run 'vagrant init generic/alpine' followed by 'vagrant up', because that box is small, but the plan is to use this to run Windows Server instances for dev/test purposes.
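The quick test, run inside the container, is just:

```
# generic/alpine is a small box, so it boots quickly
vagrant init generic/alpine
vagrant up
```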
In the spirit of full disclosure, when I ran this image on a different host I had issues with cgroups again, but it presented differently and I was able to resolve it by clearing the AppArmor profiles on that host. I tried the same on my workstation (the one host above) and it had no impact whatsoever. I think that is a red herring.
Just wondering if there is more information required for this report?
Contributing guidelines
I've found a bug and checked that ...
Description
I'm building a container image that relies on cgroups being managed from within the container. The container has libvirt, qemu-kvm and vagrant installed, and vagrant uses cgroups to apply resource constraints to the vagrant images nested within the docker container.
If I build the image using 'docker build' then the process seems to leave behind some changes to how docker exposes cgroups to containers. I am able to successfully run the container and start vagrant box images within the container with no problem. I can reliably and consistently work around the problem scenario by including "--cgroupns host" in the docker run command, or by running the docker build command again, which I would not expect consumers of the container to be required to do.
If I build it using 'docker buildx' then I don't see the same issue (is 'docker build' formally deprecated in favour of buildx?). I get what I think is the correct behaviour: starting vagrant box images inside the container always fails. I can reliably and consistently work around this by including "--cgroupns host" in the docker run command.
I noticed this when I was having problems running the container image (which behaves identically regardless of build method) on a host that had not been used to build any images. Similarly, if I reboot the host that was used to build the image and then run the image without building it first, I have problems.
Further, I noticed that it does not matter which image I build with 'docker build'. If I build a completely unrelated image, the same changes to how cgroups are exposed through docker take effect, and I am able to start a vagrant box image within my vagrant container.
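Even a throwaway build like the following is enough to trigger it (the Dockerfile content and tag here are arbitrary examples):

```
# any trivial, unrelated 'docker build' flips the behaviour
printf 'FROM alpine:latest\nRUN true\n' > Dockerfile.trivial
docker build -t trivial-test -f Dockerfile.trivial .
```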
Expected behaviour
'docker build' should not leave cgroups exposed to containers run after a build has completed. Perhaps a tidy-up step is missed somewhere? I expect the behaviour to be the same as when building container images with 'docker buildx'.
Actual behaviour
When building (any) image with 'docker build' there seems to be some alteration to how cgroups are exposed through the docker daemon to containers, be they currently running or started at any point in the future. This is true in my environment for build activities with 'docker build' but not 'docker buildx'.
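I have not captured the exact difference yet, but something like this should show what changes from the container's point of view before and after a 'docker build' (the container name is a placeholder):

```
# what cgroup hierarchy PID 1 in the container sees
docker exec winlin cat /proc/1/cgroup
docker exec winlin ls /sys/fs/cgroup

# which cgroup namespace mode the container was started with
docker inspect --format '{{.HostConfig.CgroupnsMode}}' winlin
```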
Buildx version
github.com/docker/buildx 0.11.2 0.11.2-0ubuntu1~22.04.1
Docker info
Builders list
Configuration
Commands to build are as follows:
docker build:
/usr/bin/docker image build -t gregewing/windows_on_linux:latest -f Dockerfile .
/usr/bin/docker push gregewing/windows_on_linux:latest

docker buildx:
/usr/bin/docker buildx build -t gregewing/windows_on_linux:latest /mnt/RAID/myDockers/vagrant/win_any/ --push
Build logs
Additional info
As I mentioned above, it does not seem to matter which image I build using 'docker build'. If I have a running vagrant container with a vagrant box downloaded and configured and I try to bring up that vagrant box, it will fail every time. If I then separately run a 'docker build' against a small, unrelated Dockerfile and try bringing up the vagrant box again, it will work.
I'm prepared for this to be something to do with cgroup drivers, but I have not tried changing the cgroup driver, as I am using the default setting.
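For reference, the cgroup driver and cgroup version the daemon reports can be read with:

```
# show the cgroup driver and cgroup version in use by the daemon
docker info --format 'driver={{.CgroupDriver}} version={{.CgroupVersion}}'
```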
Furthermore, as I mentioned earlier, if I add "--cgroupns host" to the docker run line, I am able to start the nested vagrant image every time, regardless of whether or not a 'docker build' activity has been performed on the host.