docker / compose

Define and run multi-container applications with Docker
https://docs.docker.com/compose/
Apache License 2.0

Docker Compose V2 doesn't seem to allow you to stop running builds in parallel, stopping solution from building. #9091

Closed Jade-Codes closed 1 year ago

Jade-Codes commented 2 years ago

Description: I have a docker-compose file that runs 10 .NET microservices. After upgrading to Docker Desktop 4.3.2 (Docker Engine v20.10.11), running 'docker-compose up --build' with Docker Compose v2 enabled produces the following error: [image]

When I run 'docker-compose disable-v2', the issue is resolved and the builds run sequentially.

Steps to reproduce the issue:

  1. Have Docker Compose v2 enabled
  2. Have a docker-compose file with 10 .NET microservices that reference Dockerfiles.
  3. Run 'docker-compose up --build'

Describe the results you received: Builds run in parallel and fail with an error: [image]

Describe the results you expected: Builds run either in parallel or sequentially, and pass.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker compose version:

Docker Compose version v2.2.1

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.7.1)
  compose: Docker Compose (Docker Inc., v2.2.1)
  scan: Docker Scan (Docker Inc., v0.14.0)

Server:
 Containers: 26
  Running: 24
  Paused: 0
  Stopped: 2
 Images: 92
 Server Version: 20.10.11
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.16.3-microsoft-standard-WSL2
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 7.69GiB
 Name: docker-desktop
 ID: MLBF:5YQ2:UIR6:HHAK:XJED:UDK7:6SHN:3FW6:ENYR:PQRH:UZMU:RQ7E
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional environment details:

ndeloof commented 2 years ago

Docker Compose v2 uses BuildKit by default. You can force use of the legacy "classic" builder by setting DOCKER_BUILDKIT=0.

Can you please check whether you can build this Dockerfile with DOCKER_BUILDKIT=1 docker build ...? If you get the same issue, please report it to https://github.com/docker/buildx
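The check suggested above can be scripted so that the same Dockerfile is built once with BuildKit and once with the classic builder. This is only a sketch: the compare_builders function name and its build-context argument are made up for illustration, and it assumes a Docker version that still honors DOCKER_BUILDKIT=0.

```shell
# Build one Dockerfile directly with `docker build`, outside Compose,
# once with BuildKit (1) and once with the classic builder (0),
# to see which builder the failure belongs to.
compare_builders() {
  context="$1"   # path to the build context (hypothetical argument)
  for mode in 1 0; do
    echo "DOCKER_BUILDKIT=$mode"
    # The subshell keeps the exported variable out of the calling shell.
    ( export DOCKER_BUILDKIT="$mode"; docker build "$context" ) \
      || echo "build failed with DOCKER_BUILDKIT=$mode"
  done
}
```

Calling compare_builders with the path to one service's build context runs both builds in turn. If only the BuildKit run fails, that points at BuildKit or buildx rather than at Compose.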

WolfspiritM commented 2 years ago

We are not able to switch from docker-compose v1 to docker-compose v2. I really wonder why the parallel build issues are not addressed at all. Using the same command to build our solution with docker-compose v2 causes our build server to crash completely, because it runs out of memory and CPU and starts killing random processes because of OOM. Setting BuildKit to 0 makes the build take 30 minutes, instead of 15 with docker-compose v1.

I made a buildx issue 2 years ago: https://github.com/docker/buildx/issues/359

There is no flag to specify how many parallel jobs are run...there is no flag to specify memory and cpu...

Are we the only ones using docker-compose with more than just a simple TODO app?

SchroterQuentin commented 2 years ago

I have a different issue, but one really similar to yours. When I run docker compose build, all the dependencies of my services are downloaded at the same moment; some services get their dependencies, others just time out. I don't understand why there is not a simple flag that says "build images one after another".

rowanmoul commented 1 year ago

We are also having this issue at my org. We have 9 dotnet containers that rebuild at the same time when running docker compose up --build -d and we suspect the issue is that something is getting rate-limited due to the burst of network requests to nuget.org and also to our internal registry, which requires authentication, meaning 9 auth requests are happening in a very short period of time. It could also be a network driver or some other aspect of the networking chain that is choking on the number of requests generated. When we add significantly longer timeouts (30 seconds), the problem resolves in most cases (not all). We would rather just turn off parallel builds, or be able to limit it to 2-3 at a time.

evoyy commented 1 year ago

Same issue here. Getting the same failed to solve: executor failed running ... error when building 5 containers at a time. I have to work around this by building each container in turn.
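The one-at-a-time workaround described above can be scripted instead of typed by hand. A minimal sketch, assuming the Compose file is in the current directory; the build_one_by_one function name is made up for illustration.

```shell
# Build every service defined in the Compose file, one service at a time.
build_one_by_one() {
  # `docker compose config --services` prints one service name per line.
  for svc in $(docker compose config --services); do
    echo "Building $svc"
    # Stop at the first failure so the failing service is easy to spot.
    docker compose build "$svc" || return 1
  done
}
```

After build_one_by_one succeeds, docker compose up -d can start the stack without the --build flag, so no parallel rebuild is triggered.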

Docker Compose (Docker Inc., v2.12.2)

gmussi commented 1 year ago

Another one with the same issue. It feels dumb to run multiple build commands, one for each image, due to the missing flag. Can anybody from the team address this, please?

fuzzybair commented 1 year ago

I have the same issue with our system, which has a little over 30 .NET containers (and growing). To answer the question from @ndeloof above: at least in our case, building a container with BuildKit works. In fact, we build all of our containers with BuildKit and compose v1.

How does compose v2 decide how many builds to start at once? We did see issues with pulls from our registries as @rowanmoul described but we solved that with buildkit RUN --mount=type=cache. However as far as I can tell compose just loops over the list of services and spawns a build for each, and building 30+ applications at once is bound to fail on any workstation.

If you are running on a Mac/Linux computer you may be able to set something in /etc/buildkit/buildkitd.toml to control the max-parallelism option, but for those of us on Windows that option is not exposed. See https://github.com/moby/buildkit/issues/2906#issuecomment-1154565987.

ndeloof commented 1 year ago

Compose v2 indeed fully delegates service image build to buildkit. You can limit parallelism on the builder using https://github.com/docker/buildx/blob/master/docs/guides/resource-limiting.md#max-parallelism.

fuzzybair commented 1 year ago

@ndeloof thanks for the quick response, but as I indicated above, for those of us on Windows (i.e. Docker Desktop) that option is not exposed. See https://github.com/moby/buildkit/issues/2906#issuecomment-1154565987. In short, there is no configurable buildkitd.toml file. The Docker team seems to be ignoring or missing that fact, consistently pointing to buildkitd.toml as the solution to this issue. I checked out the code to see if I could implement a solution, but there are many barriers to overcome. For example, when following the developer setup guide:

  1. You have to reinstall docker desktop using Hyper-V or you get \\.\pipe\docker_engine_windows: the system cannot find the file specified.
  2. The base image microsoft/windowsservercore used in moby/moby/Dockerfile.windows is not a valid tag.
  3. The IDE used in the setup requires a license from JetBrains. I think adding an option to use a well-supported open source tool like Microsoft's Visual Studio Code would be a good addition to the toolchain.

Anyway, after spending a little over a day trying to get a simple build working, I gave up and left my needs in the hands of those working on the project, in hopes they recognize the gap between what is provided and the suggested solution.

ndeloof commented 1 year ago

Yes, we need to offer a better way to manage this concurrent build issue. I just shared the link for documentation.

fuzzybair commented 1 year ago

@ndeloof I do appreciate your help, but maybe I am missing something. The documentation you linked (https://github.com/docker/buildx/blob/master/docs/guides/resource-limiting.md#max-parallelism) does not explain how to do this on Windows. It seems very Linux-oriented; in particular, it talks about configuring the Linux daemon using /etc/buildkitd.toml. So I ask you the same question that was asked in https://github.com/moby/buildkit/issues/2906: "Where is buildkitd.toml when running on WSL2?" Or are you suggesting that I need to run docker buildx create? If so, how do I tell docker compose v2 to use that builder and not the builder created and managed by Docker Desktop? I see no way to turn it off in C:\ProgramData\docker\config\daemon.json (https://docs.docker.com/config/daemon/).

ndeloof commented 1 year ago

I can't speak to configuration with WSL2; buildkit issue #2906 is the right place to get an answer. In the meantime, using a new builder (docker buildx create, then docker buildx use) is a simple way to run a builder where you can define the BuildKit configuration.

fuzzybair commented 1 year ago

I created another builder using docker buildx create --use --name my-builder --driver docker-container --config C:\dev\config\buildkitd.toml. The config I used still resulted in a stalled build (still running after an hour, where v1 completes in under 20 min). I started with the example max-parallelism = 4, but when the build seemed to be stalled I rebooted and dropped max-parallelism to 1, assuming that would get me as close as possible to the sequential build that compose v1 does. I was happy to see that the builder was still available after a reboot, but as I needed to change the configuration I ran docker buildx rm my-builder and re-created it with the above command.

I am OK going with a sequential build like compose v1 if needed, because it does succeed. However, I would like to take advantage of some of the newer features provided by compose v2, like mounting build secrets. From what I can tell in the build output (I may be reading something wrong), the max-parallelism setting has no effect on the number of containers being built concurrently by compose. My guess is that max-parallelism only affects the number of concurrent build steps within a single image. For example, in our multi-stage Dockerfiles we have a UI build and a REST API build that happen in parallel and are combined in the final image with COPY --from.

For reference here is my config and I have attached the first 2,000 lines (few seconds) of the output from docker compose -f docker-compose.ci.yml build --progress=plain > ../buildkit.log

# /etc/buildkitd.toml
[worker.oci]
  max-parallelism = 1

buildkit.log

fuzzybair commented 1 year ago

@ndeloof thanks for your work on this. However, I did have a question on the usage: I am assuming I would run something like docker compose build --parallel 5? Is there a way to set a system-wide default in an environment variable like COMPOSE_PARALLEL_LIMIT? (This may be related to https://github.com/docker/compose/issues/8226.)

ndeloof commented 1 year ago

@fuzzybair good point. Added support for COMPOSE_PARALLEL_LIMIT in aa5cdf2 (#10133)
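For readers landing here, a minimal sketch of using the environment variable mentioned above. It assumes a Compose release that includes the change from #10133; the build_with_limit function name and the default of 2 are made up for illustration, and how strictly the limit serializes image builds may depend on the Compose version.

```shell
# Run `docker compose build` with a cap on Compose's parallelism.
build_with_limit() {
  limit="${1:-2}"   # 2 is an arbitrary example default, not a recommendation
  # The subshell keeps the exported variable out of the calling shell.
  ( export COMPOSE_PARALLEL_LIMIT="$limit"; docker compose build )
}
```

Calling build_with_limit 1 asks for the closest thing to a sequential build. Exporting COMPOSE_PARALLEL_LIMIT in a shell profile or CI environment instead would act as a machine-wide default.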

dm17 commented 4 months ago

you can force use of the legacy "classic" builder setting DOCKER_BUILDKIT=0

Does this still work? I'm not seeing any mention in the docs of it being deprecated, but my newly upgraded compose just started ignoring DOCKER_BUILDKIT=0 in .env and I'm not sure why.