docker / buildx

Docker CLI plugin for extended build capabilities with BuildKit

network mode "custom_network" not supported by buildkit #175

Open MarcosMorelli opened 4 years ago

MarcosMorelli commented 4 years ago

Background: Running a simple integration test fails with network option:

docker network create custom_network
docker run -d --network custom_network --name mongo mongo:3.6
docker buildx build --network custom_network --target=test .

Output: network mode "custom_network" not supported by buildkit

Still not supported? Code related: https://github.com/docker/buildx/blob/master/build/build.go#L462-L463

jorismak commented 8 months ago

I found a nice workaround, it's also relevant to any other frontend framework too, short workaround strategy is this:

Any workaround which involves 'finding out the IP of' something is not acceptable, since that absolutely does not make it portable :). That might work for a local setup where you are the only one using the docker files and you know your host. In any kind of networked or team setting, that is out the window.

Also, can't be 'the way it is supposed to be'

Read #591 (comment) to see that BuildKit supports none, host, and default (sandboxed) as network modes, and that supporting custom network names is currently not planned as it conflicts with the portability and security guarantees. So don't use any custom networks with BuildKit; that's the new reality of the docker build command.
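
For illustration, a quick sketch of the only --network values a buildx build will accept (the image tag here is just a placeholder; host mode additionally needs the network.host entitlement on the builder):

docker buildx build --network=default -t myimage .   # sandboxed networking (the default)
docker buildx build --network=none -t myimage .      # no network during RUN steps
docker buildx build --network=host --allow network.host -t myimage .   # host networking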

The thing is that they are still recommending (and it's kind of needed with how it works) to disable the built-in bridge network as step 1 after installing Docker Engine. So you need to create custom networks and supply them to any container or build command that needs internet access or some sort of container-to-container networking. Breaking the --network flag of docker build (since it's passed blindly to buildx, which uses a different network parameter altogether) means breaking the build command.

I know a lot of people here are commenting about how it works with docker-compose, but it just basically breaks the default docker build command. How....

The workaround is to create your own builder, and apply a network to that. But I would much rather just see the 'buildx' enforcement being removed, because it seems completely useless to me if you are just building things locally to then push. But that's also not a good workaround if you often switch networks for your build commands (as you do in a docker-compose file with custom networks).
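
As a concrete sketch of that workaround (the builder name net-builder is just a placeholder; the network option is the one documented for the docker-container driver):

docker network create custom_network

# the BuildKit container behind this builder gets attached to custom_network
docker buildx create --name net-builder --driver docker-container --driver-opt network=custom_network

# builds now run inside a builder that sits on custom_network
docker buildx build --builder net-builder --target=test .

As discussed further down the thread, DNS names of other containers on that network are not resolvable out of the box; reaching them by IP does work.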

Any other solution I found is to use docker buildx ... but that way you lose the docker-compose features or merge strapi and next.js in one big monorepo with big single docker-compose file

In theory, this means you use docker-compose to build and then launch? Simply building in a separate step would work I guess. I don't use docker-compose here. Development is done locally and I'm the one making the images for deployment for the team, so I only work local and push to our registry.

If someone can tell me how to go back to the normal old 'docker build' command and just disable and ignore the whole buildx thing, yes please.
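
For what it's worth, the classic builder can still be selected per command by disabling BuildKit through the documented environment variable (newer Engine releases print a deprecation warning when you do):

# force the classic (pre-BuildKit) builder, which accepts custom networks
DOCKER_BUILDKIT=0 docker build --network custom_network --target=test .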

dm17 commented 8 months ago

If someone has a Hacker News account with street/interweb cred, then perhaps they can post this thread on there... It is often how such situations get attention from companies with loads of money whose sole purpose is supposedly to remedy such situations - rather than cause them.

jedevc commented 8 months ago

If someone has a Hacker News account with street/interweb cred, then perhaps they can post this thread on

Heya, so, I'm no longer at docker and so only really speak for myself here, but I do feel the need as a maintainer on this project to call this out.

This is an open source project. Contributions and contributors are welcome - any person in this thread is welcome to engage on how to fix this issue, and upstream any contributions. It's known to be a limitation, but demanding that the maintainers drop everything to work on this is just not great form, sorry. Just like you, the maintainers of this project are busy, having to deal with internal/external constraints on their time, and often other things take priority.

That said, if you're a paid user of docker and this is affecting you, and you need to see this resolved, then you should raise this through support/your account representative. They can escalate internally and help set engineering priority - this is generally true of any corporate-backed open source project.

Finally, essentially calling for a brigade of a github thread during the break between Christmas and the new year, a time which very often people take off from work and would not be able to respond is... not great. I sincerely hope no good open source citizen would do this, but if they had, it would have the potential to disrupt people's time with family and friends. Please don't do that.

Please let's move this thread back on track - sharing workarounds and discussing ways in which this could be worked on productively.

dm17 commented 8 months ago

I'm not surprised that someone with pronouns in their GitHub profile just created an entire victim narrative out of this. However, I am thankful to jedevc for pointing out that I should try pushing on a docker account rep for this issue.

Calling it a "brigade" is your narrative - not a fact. Especially when loads of stuff has received positive attention via this method, there is no rule against posting in a more general developer publication (which HN is), and after attention was gained via that method, no one retrospectively said "I know billion-dollar corporation X helped us only after thread Y was posted, but we now realize this was actually a brigade." I believe it is clear in my comment that I am not for pushing anyone who is working for free to fix it, but rather for the utilization of Docker Inc's power to fix (restore) the functionality.

I will also add that "we should not post anything that causes workers to scramble unnecessarily during the holidays" - despite the fact that I never said otherwise. I do realize it could be implied by the fact that I made the suggestion on Dec 29th, but I view this as async rather than sync communication.

dm17 commented 8 months ago

Please let's move this thread back on track - sharing workarounds and discussing ways in which this could be worked on productively.

It was blocked immediately by this:

Sorry, I'm not sure if we will ever start supporting this as it makes the build dependant on the configuration of a specific node and limits the build to a single node.

Where is the evidence for this claim? Also, it seems to me that this person is on Docker's payroll - and yet I am still not asking them to stop their holiday and work @jedevc - I'm merely replying with a reasonable question in a session of asynchronous communication. They can choose to use the technology they own so they only see work messages during work hours, for example.

The comment also implies that a fix for this issue will not be accepted by Docker, so it definitely dissuades anyone (who isn't on the Docker payroll) from attempting a fix during their free time. That's why the argument that "this is an open source project" works for us rather than for Docker Inc in this scenario.

thaJeztah commented 8 months ago

I'm not surprised that someone with pronouns in their GitHub profile just created an entire victim narrative out of this.

Please keep such comments at the door. No need for this.

I'll skip over some of your other wording, because I don't think there's anything constructive in it.

Where is the evidence for this claim?

There are many things associated with this, and this feature is not trivial to implement (if possible at all). Let me try to give a short summary from the top of my head. Note that I'm not a BuildKit maintainer; I'm familiar with the overall aspects, but not deeply familiar with all parts of BuildKit.

First of all, buildx (the repository this ticket is posted in) is a client for BuildKit; the feature requested here would have to be implemented by the builder (BuildKit). Ideally this ticket would be moved to a more appropriate issue tracker, but unfortunately GitHub does not allow transferring tickets between orgs; in addition, multiple projects may need to be involved. There's no harm in having a ticket in this repository (nor in having discussions elsewhere), but ultimately this feature would have to be both accepted by the upstream BuildKit and/or Moby maintainers, and implemented in the respective projects.

Now the more technical aspect of this feature request (or: this is where the fun starts).

BuildKit can be run in different settings / ways: as a standalone builder (for example through the docker-container driver), or embedded as part of the Moby daemon ("Docker Engine"), as described in the two sections below.

Depending on the above, semantics and available features will differ.

Standalone BuildKit builders

Standalone builders are designed to be stateless; builders may have cache from prior builds, but don't persist images (only build-cache), nor do they have a concept of "networks" (other than "host" networking or "no" networking). The general concept here is that standalone builders can be ephemeral (auto-scaling cluster), and that builds don't depend on prior state. They may advertise certain capabilities (BuildKit API version, native platform/architecture), but otherwise builders are interchangeable (perform a build, and export the result, e.g. push to a registry or export as an OCI bundle).
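
For illustration (registry name and tags are placeholders), the "export the result" model looks roughly like this from the client side:

# build on whichever builder is selected and push the result straight to a registry
docker buildx build --push -t registry.example.com/app:latest .

# or export the build as an OCI layout tarball instead of keeping state on the builder
docker buildx build -o type=oci,dest=app.tar .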

Given that standalone builders don't have access to "docker custom networks" (there's no docker daemon involved), it won't be possible to provide a per-build option to use a specific network. The workaround mentioned in this thread is to attach the builder's own container to the custom network (the docker-container driver's network option), so that the builder itself runs inside that network.

In this configuration, the "host" (container) can resolve containers running in that custom network, but this only works in very specific scenarios;

All of the above combined make this a workaround that would work in only very specific scenarios. Implementing this as a native feature for standalone builders would quickly go down the rabbit-hole; "custom network attached to this builder" as well as "state of dependencies" would have to be exposed as capabilities, so that buildx can query which builder to select for a specific build, or an external orchestrator would need to be involved to construct builders (and dependencies) on-demand. Both would be moving away significantly from the current design of standalone builders (stateless, ephemeral).

As part of the Moby daemon ("Docker Engine")

When using the "default" builder on Docker Engine, BuildKit is not running as a standalone daemon, but is compiled into the Docker Engine. In this scenario, BuildKit has some (but limited) access to features provided by the Docker Engine. For example, it's possible to use images that are available in the local image cache as part of the build (e.g. an image you built earlier, but that's not pushed to a registry). There are also limitations; when using "graphdrivers", the Docker Engine does not provide a multi-arch image store, so it's not possible to build multiple architectures in a single build. (This will become possible in the near future and is being worked on as part of the containerd image store integration.)

Containers created during build are optimized for performance; containers used for build-steps tend to be very short-lived. "regular" containers as created by the Docker Engine have a non-insignificant overhead to provide all features that docker run can provide. To reduce this overhead, BuildKit creates optimized containers with a subset of those features; to further improve performance, it also skips some intermediate processes: BuildKit acts as its own runtime, and it can directly use the OCI runtime (runc) without requiring the intermediate "docker engine", and "containerd" processes.

Long story short; build-time-containers are (by design) isolated from "regular" containers; they're not managed by the docker daemon itself, won't show up in docker ps, and networking is not managed by the regular networking stack (no DNS-entries are registered in the internal DNS).

So, while the "embedded" BuildKit may have more potential options to integrate with custom networks, doing so will require a significant amount of work; both in BuildKit (integration with the network stack) and in Moby / "docker engine" to (somehow) allow BuildKit to create different "flavors" of containers (isolated / non-isolated). This will come with a performance penalty, in addition to the complexity involved (a secondary set of short-lived containers that can be attached to a network, but are not directly managed by Moby / the Docker Engine itself).

dm17 commented 8 months ago

Thank you for the analysis. Do you think the classic builder will be deprecated in the near future within docker, docker compose or buildx?

TafkaMax commented 8 months ago

TLDR; Which scenario would be better?

EDIT: For me personally - build time is not the primary thing I am after. I wish the solution to work. Currently using the classic builder still works, but my primary concern is that this feature disappears.

TBBle commented 8 months ago

Thank you @thaJeztah for that clarifying summary.

I wonder if it'd be worth a feature request against Docker Compose, which seems like it can do most of what the "containerised builder workaround" needs, since people in this thread have tried to do it that way already.

If Docker Compose was able to create/use/remove temporary private builder instances on its own (or the config-specified) network, along with resolving the apparent issue that a standalone BuildKit instance in a Docker container attached to a custom network doesn't get the right DNS setup for services on that custom network (see https://github.com/docker/buildx/issues/175#issuecomment-1732566254 from earlier attempts to use the "workaround" manually), then I think it can deliver the "workaround" flow fairly naturally for some of the use-cases described here.

I see a similar idea mentioned at https://github.com/docker/compose/pull/10745#issuecomment-1609671858 but I think that was about selecting an existing builder, rather than instantiating a new one. https://github.com/compose-spec/compose-spec/issues/386 is along these lines but it wants to be super-generic; it is probably too generic for this use-case, particularly if we want to tear-down the builder instance after usage or at compose-down time. In this case, the builder is more like another service in Docker Compose that compose knows how to use when building service images. (That also might be a better way to visualise and implement it, similar to existing depends_on semantics, and semantics for defining service deployment etc. in the Compose file.)

That is separate from the existing --builder flag for docker compose build introduced in Docker Compose 2.20.0 in July 2023.

That said, I'm not a Docker Compose user, so I may be overestimating what it can cover, or misunderstanding the relevant workflow. Either way, unless this is an obviously-faulty idea to Docker Compose users, it'd be better to discuss in a ticket there, to focus on the relevant use-cases that involve Docker Compose, and also focus this ticket on buildx-involved use-cases.

thaJeztah commented 8 months ago

Do you think the classic builder will be deprecated in the near future within docker, docker compose or buildx?

There are no immediate plans to actively remove the classic builder, but no active development is happening on it; consider it in "maintenance mode", kept mostly to support building native Windows containers (BuildKit does not yet support Windows containers, although work on that is in progress). The classic builder's architecture does not give a lot of room for further expansion, so it will diverge from / fall behind BuildKit more over time. The classic builder may also not make the transition to the containerd image-store integration; there's currently a very rudimentary implementation to help the transition, but we're aware of various limitations in that implementation that may not be addressable with the classic builder.

EDIT: For me personally - build time is not the primary thing I am after. I wish the solution to work. Currently using the classic builder still works, but my primary concern is that this feature disappears.

Perhaps it'd be useful to start a GitHub discussion to collect slightly more in-depth information about use-cases; GitHub tickets aren't "great" for longer conversations and don't provide threads (not sure which repository would be best; perhaps BuildKit: https://github.com/moby/buildkit/discussions). I know there was a lot of contention around the original implementation, which also brought up concerns about portability and the feature being out of scope for building (see the lengthy discussion on https://github.com/moby/moby/issues/10324 and https://github.com/moby/moby/pull/20987, carried in https://github.com/moby/moby/pull/27702).

Use-cases that I'm aware of;

But there may be other use-cases. Some of those may make more sense in a "controlled" environment (local / special purpose machine; single user), but get complicated fast in other environments. Having more data about use-cases could potentially help design around those (which may be through a different approach).

sudo-bmitch commented 8 months ago

To repeat a nearly 2 year old comment here, buildkit does support running with a custom network, it's even documented: https://docs.docker.com/build/drivers/docker-container/#custom-network

The issue isn't that it won't run on a custom network, instead, as so often happens on the internet, it was DNS. When you run a container runtime (which buildkit does) inside of a container on a custom network, it sees the DNS settings of that parent container:

$ docker network create build
9b7c83ceda7e2552e99d27c29d275936e882fd9cc9488361209bbf4421c2f180

$ docker run -it --rm --net build busybox cat /etc/resolv.conf
search lan
nameserver 127.0.0.11
options ndots:0

And as docker and other runtimes do, they refuse to use 127.0.0.11 as a DNS server, so the nested container falls back to 8.8.8.8. Here's a demo for proof:

#!/bin/sh

set -ex

docker network create custom-network

echo "hello from custom network" >test.txt
docker run --name "test-server" --net custom-network -d --rm \
  -v "$(pwd)/test.txt:/usr/share/nginx/html/test.txt:ro" nginx
server_ip="$(docker container inspect test-server --format "{{ (index .NetworkSettings.Networks \"custom-network\").IPAddress }}")"
echo "Server IP is ${server_ip}"

cat >Dockerfile <<EOF
FROM curlimages/curl as build
USER root
ARG server_ip
RUN mkdir -p /output \
 && cp /etc/resolv.conf /output/resolv.conf \
 && echo \${server_ip} >/output/server_ip.txt \
 && (curl -sSL http://test-server/test.txt >/output/by-dns.txt 2>&1 || :) \
 && (curl -sSL http://\${server_ip}/test.txt >/output/by-ip.txt 2>&1 || :)

FROM scratch
COPY --from=build /output /
EOF

docker buildx create \
  --name custom-net-build \
  --driver docker-container \
  --driver-opt "network=custom-network"
docker buildx build --builder custom-net-build --build-arg "server_ip=${server_ip}" \
  -o "type=local,dest=output" .

docker buildx rm custom-net-build
docker stop test-server
docker network rm custom-network

Running that shows that custom networks are supported, just not DNS:

$ ./demo-custom-network.sh
+ docker network create custom-network
bd5ce7361f5fc94b0da0fe32a3f5482176a6fcaca68997556d3449269c451cea
+ echo hello from custom network
+ pwd
+ docker run --name test-server --net custom-network -d --rm -v /home/bmitch/data/docker/buildkit-network/test.txt:/usr/share/nginx/html/test.txt:ro nginx
31bf9516fdc6dba7d97796be6b6c55f2a134a3050a7b1286a2ac96658e444c62
+ docker container inspect test-server --format {{ (index .NetworkSettings.Networks "custom-network").IPAddress }}
+ server_ip=192.168.74.2
+ echo Server IP is 192.168.74.2
Server IP is 192.168.74.2
+ cat
+ docker buildx create --name custom-net-build --driver docker-container --driver-opt network=custom-network
custom-net-build
+ docker buildx build --builder custom-net-build --build-arg server_ip=192.168.74.2 -o type=local,dest=output .
[+] Building 4.2s (9/9) FINISHED
 => [internal] booting buildkit                                                                    1.8s
 => => pulling image moby/buildkit:buildx-stable-1                                                 0.4s
 => => creating container buildx_buildkit_custom-net-build0                                        1.5s
 => [internal] load build definition from Dockerfile                                               0.0s
 => => transferring dockerfile: 393B                                                               0.0s
 => [internal] load metadata for docker.io/curlimages/curl:latest                                  0.6s
 => [auth] curlimages/curl:pull token for registry-1.docker.io                                     0.0s
 => [internal] load .dockerignore                                                                  0.0s
 => => transferring context: 2B                                                                    0.0s
 => [build 1/2] FROM docker.io/curlimages/curl:latest@sha256:4bfa3e2c0164fb103fb9bfd4dc956facce32  1.2s
 => => resolve docker.io/curlimages/curl:latest@sha256:4bfa3e2c0164fb103fb9bfd4dc956facce32b6c5d4  0.0s
 => => sha256:4ca545ee6d5db5c1170386eeb39b2ffe3bd46e5d4a73a9acbebc805f19607eb3 42B / 42B           0.1s
 => => sha256:fcad2432d35a50de75d71a26d674352950ae2f9de77cb34155bdb570f49b5fc3 4.04MB / 4.04MB     0.8s
 => => sha256:c926b61bad3b94ae7351bafd0c184c159ebf0643b085f7ef1d47ecdc7316833c 3.40MB / 3.40MB     0.8s
 => => extracting sha256:c926b61bad3b94ae7351bafd0c184c159ebf0643b085f7ef1d47ecdc7316833c          0.1s
 => => extracting sha256:fcad2432d35a50de75d71a26d674352950ae2f9de77cb34155bdb570f49b5fc3          0.1s
 => => extracting sha256:4ca545ee6d5db5c1170386eeb39b2ffe3bd46e5d4a73a9acbebc805f19607eb3          0.0s
 => [build 2/2] RUN mkdir -p /output  && cp /etc/resolv.conf /output/resolv.conf  && echo 192.168  0.2s
 => [stage-1 1/1] COPY --from=build /output /                                                      0.0s
 => exporting to client directory                                                                  0.0s
 => => copying files 357B                                                                          0.0s
+ docker buildx rm custom-net-build
custom-net-build removed
+ docker stop test-server
test-server
+ docker network rm custom-network
custom-network

$ cat output/resolv.conf
search lan
options ndots:0

nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844

$ cat output/server_ip.txt
192.168.74.2

$ cat output/by-dns.txt
curl: (6) Could not resolve host: test-server

$ cat output/by-ip.txt
hello from custom network

thaJeztah commented 8 months ago

And as docker and other runtimes do, they refuse to use 127.0.0.11 as a DNS server, so the nested container falls back to 8.8.8.8. Here's a demo for proof:

Hmm.. right, but that should not be the case when using --network=host. In that case, the container should inherit the /etc/resolv.conf from the host; here's docker-in-docker running with a custom network (127.0.0.11 is the embedded DNS resolver); a container with --network=host inherits the host's settings;

cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0

docker run --rm --network=host alpine cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0

Trying to do the same with a BuildKit container builder that has host networking allowed, and a build started with --network=host shows that BuildKit does not do the same; it uses the default DNS servers;

Creating a custom network and a "test-server" container attached to it;

docker network create custom-network
docker run -d --name test-server --network custom-network nginx:alpine
docker container inspect test-server --format '{{ (index .NetworkSettings.Networks "custom-network").IPAddress }}'
172.24.0.2

Create a custom builder attached to the network, and allow "host-mode" networking;

docker buildx create --name custom-net-build --driver docker-container --driver-opt network=custom-network --buildkitd-flags '--allow-insecure-entitlement network.host'
custom-net-build

Running a build with --network=host, which should run in the host's networking namespace and inherit the host's DNS configuration (127.0.0.11 - the embedded DNS);

docker buildx build --no-cache --builder custom-net-build --network=host --progress=plain --load -<<'EOF'
FROM alpine
RUN cat /etc/resolv.conf
RUN wget http://test-server
EOF

However, it looks like BuildKit sets the default DNS resolvers, not inheriting from the host;

#4 [1/3] FROM docker.io/library/alpine:latest@sha256:51b67269f354137895d43f3b3d810bfacd3945438e94dc5ac55fdac340352f48
#4 resolve docker.io/library/alpine:latest@sha256:51b67269f354137895d43f3b3d810bfacd3945438e94dc5ac55fdac340352f48 done
#4 CACHED

#5 [2/3] RUN cat /etc/resolv.conf
#5 0.038 options ndots:0
#5 0.038
#5 0.038 nameserver 8.8.8.8
#5 0.038 nameserver 8.8.4.4
#5 0.038 nameserver 2001:4860:4860::8888
#5 0.038 nameserver 2001:4860:4860::8844
#5 DONE 0.0s

#6 [3/3] RUN wget http://test-server
#6 0.057 wget: bad address 'test-server'
#6 ERROR: process "/bin/sh -c wget http://test-server" did not complete successfully: exit code: 1

So I think there's something funky going on there, and BuildKit's executor / runtime does not take host networking into account for DNS resolvers 🤔

Had a quick peek at code that I think is related to this; it looks like it logs a message about host networking; https://github.com/moby/buildkit/blob/8849789cf8abdc7d63ace61f8dc548582d22f3b5/executor/runcexecutor/executor.go#L184-L188

But after that unconditionally uses the standard /etc/resolv.conf that was generated (and used for all containers used during build); https://github.com/moby/buildkit/blob/8849789cf8abdc7d63ace61f8dc548582d22f3b5/executor/oci/resolvconf.go#L27-L118

TBBle commented 8 months ago

BuildKit incorrectly replacing localhost DNS resolvers when using host networking is https://github.com/moby/buildkit/issues/3210. There was a PR in progress just over a year ago, but it wasn't completed. https://github.com/moby/buildkit/issues/2404 seems to have had more recent activity, but looks much wider in scope than https://github.com/moby/buildkit/issues/3210.

crazy-max commented 7 months ago

BuildKit incorrectly replacing localhost DNS resolvers when using host networking is moby/buildkit#3210. There was a PR in progress just over a year ago, but it wasn't completed. moby/buildkit#2404 seems to have had more recent activity, but looks much wider in scope than moby/buildkit#3210.

Should be solved with https://github.com/moby/buildkit/pull/4524

TBBle commented 7 months ago

If you want to test the fixed BuildKit right now, create a builder with --driver-opt=image=moby/buildkit:master,network=custom-network <other params>.

In fact, I made that change in the shell script from https://github.com/docker/buildx/issues/175#issuecomment-1875486526 on line 30

--driver-opt "image=moby/buildkit:master,network=custom-network" \

and it failed the same way; and I confirmed that it was running BuildKit from commit 2873353.

I can see that the BuildKit worker is supposed to be using host-mode networking:

Labels:
 org.mobyproject.buildkit.worker.executor:         oci
 org.mobyproject.buildkit.worker.hostname:         d1295746541b
 org.mobyproject.buildkit.worker.network:          host
 org.mobyproject.buildkit.worker.oci.process-mode: sandbox
 org.mobyproject.buildkit.worker.selinux.enabled:  false
 org.mobyproject.buildkit.worker.snapshotter:      overlayfs

So I tried with the extra changes from https://github.com/docker/buildx/issues/175#issuecomment-1875540193, resulting in

docker buildx create \
  --name custom-net-build \
  --driver docker-container \
  --driver-opt "image=moby/buildkit:master,network=custom-network" \
  --buildkitd-flags "--allow-insecure-entitlement network.host"
docker buildx build --builder custom-net-build --build-arg "server_ip=${server_ip}" \
  --network host \
  -o "type=local,dest=output" .

and that worked correctly:

$ cat output/resolv.conf
nameserver 127.0.0.11
options ndots:0
$ cat output/by-dns.txt
hello from custom network

So it'd be nice if the buildx docker-container driver could automatically set up the hosted buildkit with --network host-equivalent, either when using a custom network, or when the network mode is not CNI (defaulted or explicitly "host").

Technically, I guess if BuildKit used the worker's network config rather than the specific RUN command's network config when making resolv.conf decisions, then it'd work too, since clearly the custom network is reachable even without --network host.

I'm not 100% clear on the network layering here. Maybe the RUN's container (without --network=host) is actually being attached to the custom network too, rather than inheriting from its parent, and so BuildKit is still doing the wrong thing by not producing a resolv.conf that matches this behaviour?

Anyway, the relevant buildkit change should be part of the 0.13 release and any pre-releases after 0.13b3, and automatically picked up by docker-container drivers through the default moby/buildkit:buildx-stable-1 tag, updated when the BuildKit maintainers bless a build as stable-enough.
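
If you don't want to wait for the stable tag to be updated, the driver's image option can also pin a specific BuildKit release explicitly (a sketch; the tag shown is illustrative):

docker buildx create --name pinned-builder --driver docker-container \
  --driver-opt "image=moby/buildkit:v0.13.0,network=custom-network"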

TBBle commented 7 months ago

Okay, results of discussion with BuildKit maintainer on that PR is that making this smoother is a BuildX thing, as BuildKit is changing its default network config such that host mode will no longer be the implicit default.

For reference, here is my current working mod of https://github.com/docker/buildx/issues/175#issuecomment-1875486526

```sh
#!/bin/sh

set -ex

docker network create custom-network

echo "hello from custom network" >test.txt
docker run --name "test-server" --net custom-network -d --rm \
  -v "$(pwd)/test.txt:/usr/share/nginx/html/test.txt:ro" nginx
server_ip="$(docker container inspect test-server --format "{{ (index .NetworkSettings.Networks \"custom-network\").IPAddress }}")"
echo "Server IP is ${server_ip}"

cat >Dockerfile <<EOF
FROM curlimages/curl as build
USER root
ARG server_ip
RUN mkdir -p /output \
 && cp /etc/resolv.conf /output/resolv.conf \
 && echo \${server_ip} >/output/server_ip.txt \
 && (curl -sSL http://test-server/test.txt >/output/by-dns.txt 2>&1 || :) \
 && (curl -sSL http://\${server_ip}/test.txt >/output/by-ip.txt 2>&1 || :)

FROM scratch
COPY --from=build /output /
EOF

docker buildx create \
  --name custom-net-build \
  --driver docker-container \
  --driver-opt "image=moby/buildkit:master,network=custom-network" \
  --buildkitd-flags "--allow-insecure-entitlement=network.host --oci-worker-net=host"
docker buildx build --builder custom-net-build --build-arg "server_ip=${server_ip}" \
  --network host \
  -o "type=local,dest=output" .

docker buildx inspect custom-net-build

docker buildx rm custom-net-build
docker stop test-server
docker network rm custom-network

set +x
for d in output/by-*; do echo -n "$d:"; cat $d; done
```

This should remain working even when BuildKit changes default networking to bridge, as I've explicitly passed "--oci-worker-net=host" to the buildkitd in the container.

I also added a pair of ADD calls to the Dockerfile, to demonstrate that even if you remove --network host from the docker buildx build call, they can still resolve the custom network's DNS, as they operate in buildkitd's network namespace, i.e. implicitly host. Once bridge becomes the default, that will break. (I tried to have a quick test of what happens when bridge-mode becomes the default, but hit a minor issue with the buildkit image. Edit: Working around that, it seems to give the same results as host-mode, so I guess it's irrelevant here: ADD and COPY operate in host-mode no matter what you pass as the worker mode, so the change in the worker net-mode default won't break these.)

For buildx's docker-container driver, a simple nicety would be for the "custom network" driver option to automatically enable those two buildkitd flags --allow-insecure-entitlement=network.host --oci-worker-net=host since without the former, you can't pass --network host to docker buildx build, and without the latter, the default BuildKit network mode change will break builds that are working today by only using the custom network for ADD and COPY.

(Thinking about this, the kubernetes driver must be doing something similar, unless the same problem shows up there...)

Then the buildx docker-container custom network docs also need to mention that you need to use docker buildx build --network host to access your custom network from RUN commands. That's a little semantically weird ("which host?") but it is documentable. Nicety options there are welcome, but I note that the BuildKit maintainer feedback was that simply automatically adding --network host was not a good choice.

Longer term/more-fully, buildx could perhaps use the new bridge mode to expose the custom network to both the buildkitd (ADD/COPY) and the workers (RUN), removing the need to do anything special in the docker buildx build command except choose the already-configured-for-custom-network builder instance, and presumably mildly improving the network isolation in the process.

tonistiigi commented 7 months ago

https://github.com/docker/buildx/issues/2255 https://github.com/docker/buildx/issues/2256

dm17 commented 7 months ago

@TBBle Does that mean that one would be able to use bake with a compose file?

TBBle commented 7 months ago

I believe so, yes. AFAIK Bake is just driving these same components underneath, and I think all the relevant configuration options as used in the test-case I extended are exposed in compose.yaml; but note that I am not a Compose user so I haven't tried this, and some of the compose.yaml fields will not be usable as they will be passing parameters to docker buildx build that instead need to go to docker buildx create.

Edit: Actually no. I tried to get this working, and realised that I don't see how you specify a compose file that actually starts services, and then runs builds that rely on those services; the docker compose up --build command wants to build things before starting any services. (Apparently in 2018 depends_on affected building, but that wasn't intended behaviour)

So I'd need to see a working (and ideally simple) example (i.e. from non-buildx) to understand what I'm trying to recreate.

Also, as noted in https://github.com/docker/buildx/issues/175#issuecomment-1875172770, compose doesn't currently support creating a buildx builder instance, so the builder instance would need to be created first, which means the network must be created first and then referenced in the compose.yaml as external: true.

I also just remembered you specified docker buildx bake with a compose.yaml, not docker compose up which I was testing with. I didn't think docker buildx bake ran services from the compose.yaml, I understood it just parsed out build targets.

So yeah, without an example of what you think ought to work, I can't really advance any compose/bake discussion further.


If you're just using docker compose up to bring up the relevant services on the custom network, then docker buildx bake to do the build against those, then it should work, but you'd still need to either pre-create the custom network and builder before running the tools, or, between compose-up and buildx-bake, create the custom builder attached to the compose-created network.

I recreated the same test-script in this style:

networks:
  custom-network:
    # Force the name so that we can reference it when creating the builder
    name: custom-network

services:
  test-server:
    # Force the name so that we can reference from the build-only Dockerfile
    container_name: test-server
    image: nginx
    networks:
      - custom-network
    volumes:
      - type: bind
        source: ./test.txt
        target: /usr/share/nginx/html/test.txt
        read_only: true
  build-only:
    build:
      network: host
      dockerfile_inline: |
        FROM curlimages/curl as build
        USER root
        ADD http://test-server/test.txt /output/by-dns-add.txt
        RUN mkdir -p /output \
         && cp /etc/resolv.conf /output/resolv.conf \
         && (curl -sSL http://test-server/test.txt >/output/by-dns.txt 2>&1 || :)
        RUN for d in /output/by-*; do echo -n "$$d:"; cat $$d; done

        FROM scratch
        COPY --from=build /output /

and then created test.txt with the desired contents ("hello from custom network"), and:

$ docker compose up test-server --detach
$ docker buildx create --name custom-net-build --driver docker-container --driver-opt "image=moby/buildkit:master,network=custom-network" --buildkitd-flags "--allow-insecure-entitlement=network.host"
$ docker buildx bake --builder custom-net-build --progress plain --no-cache --set=build-only.output=type=local,dest=output
...
#11 [build 4/4] RUN for d in /output/by-*; do echo -n "$d:"; cat $d; done
#11 0.071 /output/by-dns-add.txt:hello from custom network
#11 0.072 /output/by-dns.txt:hello from custom network
#11 DONE 0.1s
...
$ docker buildx rm custom-net-build
$ docker compose down
$ for d in output/by-*; do echo -n "$d:"; cat $d; done
output/by-dns-add.txt:hello from custom network
output/by-dns.txt:hello from custom network

Note that in the docker buildx bake call, --progress plain --no-cache --set=build-only.output.type=output is only there so you can see the output in the log, so I can rerun the command and see the output each time, and to simulate the shellscript's -o "type=local,dest=output" to dump the final image into a directory so you can examine the results, respectively. (Weirdly, --set=build-only.output.type=<any string> also had this effect, which I'm pretty sure is a bug/unintended feature.)

So that seems to work, yeah. If compose and/or bake were able to define builders to create (and ideally agree on the real name of either the builder or the network in order to avoid hard-coding the network name as I did here) then it would just be a relatively simple compose up -d && bake && compose down.

If that's what you want, then I'd suggest opening a new feature request for it: I think it makes more sense to be part of compose so it can be used for compose's own builds too but SWTMS (See What The Maintainers Say). (Also, docker buildx bake doesn't know about compose name-mangling, so if this was part of bake then you'd still need to give the network an explicit name; either way you end up hard-coding the container_name for the services accessed from the Dockerfile, so local parallelisability is of the bake call only, not the whole stack)

(If you're feeling really clever, you could include a builder as a normal service in the compose spec, and then use the remote buildx driver to connect to it, making the docker buildx create call more generic; however, I haven't used the remote buildx driver so can't promise that this ends up simpler... it probably ends up more complex as you have to manage TLS and however you reach the custom-network-based builder from your buildx-hosting instance. The docker-container and kubernetes drivers avoid all this by using docker exec and kubectl exec equivalents.)

dm17 commented 5 months ago

@TBBle Thanks; that's also what I found. I appreciate the suggestions but don't yet have the mental energy earmarked in order to complete any. I'll wait a little longer in hopes that someone else streamlines a solution before attempting again :)