ojab opened this issue 3 years ago
Happens with 0.7.1 as well: github.com/docker/buildx v0.7.1-docker 05846896d149da05f3d6fd1e7770da187b52a247 Docker version 20.10.12, build e91ed57
I have a similar issue when doing a buildx bake: it hangs on random commands. I have already spent three days investigating, but it is very hard to track down.
At first I thought it was something in the HCL files, because when I changed something there and a few things in the Dockerfile it sometimes worked. But then I changed something again and the previous version stopped working, although it had worked a couple of minutes earlier.
So whether it works or not depends on the changes you make to the files, but it doesn't seem to matter what you change; it is more related to which files you touch and the order in which you change them. We have split the configuration across multiple files, but I could also reproduce it with a single HCL file and multiple Dockerfiles.
So it is probably related to some internal scheduling based on which files have changed (config, Dockerfile)? Could that be?
I also found out that it is very likely related, to some extent, to the contexts option in the .hcl files. We use it to link to other targets and thereby implement a kind of build order. The problematic (but in our case necessary) Dockerfile command is COPY --from:
FROM bar AS foo
...
COPY --from=foo ...
The overall demo project looks like this:
# docker-bake.hcl
target "inh" {
  context = "."
  cache-from = [
    "type=local,src=/Users/ceelian/tmp/buildx_cache"
  ]
  cache-to = ["type=local,dest=/Users/ceelian/tmp/buildx_cache"]
}

target "base" {
  inherits = ["inh"]
  dockerfile = "baseapp.Dockerfile"
  tags = ["example.com/base:latest"]
}

target "third" {
  inherits = ["inh"]
  dockerfile = "thirdapp.Dockerfile"
  tags = ["example.com/third:latest"]
}

target "another" {
  inherits = ["inh"]
  contexts = {
    "thirdapp" = "target:third",
  }
  dockerfile = "anotherapp.Dockerfile"
  tags = ["example.com/another:latest"]
}

target "app" {
  inherits = ["inh"]
  dockerfile = "Dockerfile"
  contexts = {
    "baseapp" = "target:base",
    "anotherapp" = "target:another",
  }
  tags = ["example.com/testapp:latest"]
}
# anotherapp.Dockerfile
FROM alpine:3.15.3
RUN ["touch", "another.txt"]
# baseapp.Dockerfile
FROM python:3.10.4-alpine3.15
RUN ["touch", "hello.txt"]
# thirdapp.Dockerfile
FROM alpine:3.15.3
RUN ["touch", "third.txt"]
# Dockerfile
FROM baseapp
FROM anotherapp
FROM python:3.8.6-alpine
COPY --from=0 /hello.txt /hello.txt
COPY --from=1 /another.txt /another.txt
RUN echo "Hello world"
# The commands to start the build
$ docker buildx create --name mybuilder --node mybuilder0 \
--platform linux/arm64,linux/riscv64,linux/ppc64le,linux/s390x,linux/mips64le,linux/mips64,linux/arm/v7,linux/arm/v6,linux/amd64 \
--driver-opt env.BUILDKIT_STEP_LOG_MAX_SIZE=10000000 --driver-opt env.BUILDKIT_STEP_LOG_MAX_SPEED=10000000
$ docker buildx bake --load -f docker-bake.hcl --builder mybuilder app
I also tried the --no-cache flag and even introduced an ARG CACHEBUST based on Sebastian's idea in https://www.freecodecamp.org/news/docker-cache-tutorial/, but in the end I couldn't reliably reproduce the error.
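For reference, the cache-bust trick amounts to roughly the following. This is only a sketch of the idea; the CACHEBUST arg name and the --set override are how I tried it, not part of the demo files above.

# anotherapp.Dockerfile (cache-bust variant)
FROM alpine:3.15.3
# a build arg whose value changes on every run invalidates the cache from this point on
ARG CACHEBUST=1
RUN ["touch", "another.txt"]

# pass a fresh value on each bake run
$ docker buildx bake --load -f docker-bake.hcl --builder mybuilder \
    --set '*.args.CACHEBUST='"$(date +%s)" app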
The next step is to try to work around the issue by combining the individual dependent images into a single multi-stage image, hoping that without the contexts the COPY --from will no longer make the build process hang indefinitely.
If you need any logs, please just tell me how to get them and I can add them here.
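In the meantime, my best guess for capturing more output the next time it hangs is plain progress output plus the buildkitd container log; this is a sketch using standard options, and the exact container name has to be looked up first.

# run the bake with line-by-line (non-interactive) progress output
$ docker buildx bake --progress=plain --load -f docker-bake.hcl --builder mybuilder app
# in a second terminal, find the builder's buildkitd container and follow its log
$ docker ps --filter "name=buildx_buildkit_" --format '{{.Names}}'
$ docker logs -f <name-from-previous-command>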
My docker info:
$ docker info
Client:
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.1)
  compose: Docker Compose (Docker Inc., v2.3.3)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 46
  Running: 41
  Paused: 0
  Stopped: 5
 Images: 99
 Server Version: 20.10.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.104-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 5
 Total Memory: 31.31GiB
 Name: docker-desktop
 ID: 7UQR:JSMW:UI4R:HPKG:7VPM:T5TT:UKJO:UNX3:DBCT:W5XF:CL5L:VOIO
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5000
  127.0.0.0/8
 Live Restore Enabled: false
If anyone faces the same issue, I "solved" it by removing all "contexts" from the bake-config.hcl files.
I recombined all of the previously separate Dockerfiles into a few independent multi-stage Dockerfiles. That way I could drop the "dependencies" between targets, so I no longer needed the "contexts" sections and could remove all of them. Now the build seems quite reliable locally on amd64 and arm64 as well as on the CI system.
This doesn't solve the underlying issue, but it is a workaround.
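Applied to the demo project above, the combined file looks roughly like this (a sketch; the stage names are simply chosen to mirror the former bake targets):

# Dockerfile (single multi-stage file, no contexts needed in the bake config)
FROM python:3.10.4-alpine3.15 AS baseapp
RUN ["touch", "hello.txt"]

FROM alpine:3.15.3 AS anotherapp
RUN ["touch", "another.txt"]

FROM python:3.8.6-alpine
COPY --from=baseapp /hello.txt /hello.txt
COPY --from=anotherapp /another.txt /another.txt
RUN echo "Hello world"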
I am facing the same issue, and I want to find a way to reliably identify these leftover buildkit containers and prune them regularly. Any ideas?
# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2ee4bc562996 moby/buildkit:buildx-stable-1 "buildkitd --allow-i…" 2 days ago Up 2 days buildx_buildkit_builder-885de0ba-d701-4918-a9a8-ce82331cebc30
6505805d2d78 moby/buildkit:buildx-stable-1 "buildkitd --allow-i…" 5 days ago Up 5 days buildx_buildkit_builder-b0bcf721-0454-4a21-836f-6f03a4f4efb80
e6dd9f70086b 32aa1a493317 "buildkitd --allow-i…" 9 days ago Up 8 days buildx_buildkit_builder-dded1ecb-4d51-4beb-944d-f8bf2e39653e0
475bfabb6ee3 32aa1a493317 "buildkitd --allow-i…" 9 days ago Up 8 days buildx_buildkit_builder-c219ed3c-04a9-4417-b6ed-5474e43da7bc0
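The best I have come up with so far is to match on the container name prefix and cross-check against the builders buildx itself still tracks; this is a sketch, not an official cleanup mechanism, so I would be happy about something better.

# list containers that follow the buildx builder naming scheme
$ docker ps -a --filter "name=buildx_buildkit_" --format '{{.Names}}\t{{.Status}}'
# compare against the builders buildx still knows about and remove stale ones through buildx
$ docker buildx ls
$ docker buildx rm <builder-name>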
buildx 0.5.1, moby/buildkit:buildx-stable-1 (be8e8392f56c), Docker version 20.10.5, build 55c4c88, on linux x86_64.

docker buildx build --platform=local -o . git://github.com/docker/buildx

hangs indefinitely.

docker exec -ti buildx_buildkit_builder-builder0 kill -s QUIT 1

where buildx_buildkit_builder-builder0 is the name of the buildkit container.
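Since buildkitd is a Go binary, SIGQUIT makes its runtime print a dump of all goroutine stacks before exiting; the dump ends up in the builder container's log and can be read with something like the following (container name as above):

# read the stack dump emitted after the kill -s QUIT above
$ docker logs buildx_buildkit_builder-builder0 2>&1 | tail -n 200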