moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0
8.22k stars 1.16k forks source link

Build with --network should support ADD as well as RUN #4254

Open bchallenor opened 6 years ago

bchallenor commented 6 years ago

When building, the --network flag affects RUN statements but not ADD statements. I would like it to support ADD too. This would not make builds any less reproducible, because of course you can already use RUN curl, but ADD would give better control of cache invalidation.

Test case:

$ sudo docker network create test
587abde5a585f2dc27f8ab92e8f75d2ea58b45bac52f85694940bf038e23a6e7

$ sudo docker run --network test --name web --rm --detach python:3 python -m http.server 80
dcf6d0e8149667c8c2227a5da20449d1b0057a38c1fa1f4790fbdf6c03052bcc

$ cat Dockerfile
FROM alpine:latest

RUN apk add --no-cache curl

RUN curl --remote-name http://web/etc/passwd

ADD http://web/etc/passwd /

$ sudo docker build --network test --no-cache .
Sending build context to Docker daemon  3.072kB
Step 1/4 : FROM alpine:latest
 ---> 3fd9065eaf02
Step 2/4 : RUN apk add --no-cache curl
 ---> Running in 24a6911d6fed
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/community/x86_64/APKINDEX.tar.gz
(1/3) Installing ca-certificates (20171114-r0)
(2/3) Installing libcurl (7.58.0-r0)
(3/3) Installing curl (7.58.0-r0)
Executing busybox-1.27.2-r7.trigger
Executing ca-certificates-20171114-r0.trigger
OK: 5 MiB in 14 packages
Removing intermediate container 24a6911d6fed
 ---> 985465ec64bb
Step 3/4 : RUN curl --remote-name http://web/etc/passwd
 ---> Running in c555cfaead80
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1197  100  1197    0     0   194k      0 --:--:-- --:--:-- --:--:--  194k
Removing intermediate container c555cfaead80
 ---> 8074dbd6bb96
Step 4/4 : ADD http://web/etc/passwd /
ADD failed: Get http://web/etc/passwd: dial tcp: lookup web: no such host
pslacerda commented 6 years ago

ADD like COPY is to copy external resources, not for in-Docker file transfer. You may want to expose your web container.

bchallenor commented 6 years ago

These are external resources - it's a package manager proxy that is running in the other container. It behaves the same as a public URL, but with better performance.

As I said, you can achieve the same (apart from the desirable cache invalidation behaviour) with RUN curl. It seems inconsistent that ADD <URL> doesn't also use the supplied network.

pslacerda commented 6 years ago

I see your point. It is that makes more sense to external resources be exposed.

But I agree with you that RUN and ADD are in dissonance.

vdemeester commented 6 years ago

cc @dnephin @tonistiigi @thaJeztah

thaJeztah commented 6 years ago

ADD is not executed inside a container; it's the daemon that fetches the resources (either local files, or from remote). The daemon's networking is not part of the container-container network, thus cannot contact the container if that container doesn't publish its ports.

Configurations like these was part of the reason for the long delay before --network was added to docker build (docker build is dependent on other containers, and networks, to be created before the build can be performed).

It behaves the same as a public URL, but with better performance.

Does it work for you if you publish that container on localhost?

bchallenor commented 6 years ago

Yes, it does work if the proxy container and the build are both run on the host network, with the port exposed but bound only to localhost. Sometimes it's nice to be able to run the package manager command inside the built image though (to test if the image needs rebuilding, if any deps have changed), but that means that the image also must be run on the host network, which is a bit restricting.

A different feature request that occurred to me but might be more generally useful to other people would be adding a flag to RUN to give it similar cache-busting functionality to ADD - something that meant "always run this command, but if it produces an identical layer to the last build, then continue to use the previous build's cached layers". Example:

RUN expensive-command

RUN --try-cache curl --remote-name http://web-container/info

RUN another-expensive-command

Then RUN --try-cache curl (flag name up for debate!) behaves like ADD, i.e. another-expensive-command is run if and only if the contents of http://web-container/info changed.

RUN --try-cache apt-get update would, I think, be widely useful to other people, and would align with the general deprecation of ADD.

tonistiigi commented 6 years ago

Per-command cache control is added in https://github.com/moby/buildkit/ . There is no dockerfile syntax to set it yet though but discussed in https://github.com/moby/buildkit/issues/242 .

dnephin commented 6 years ago

I think ADD should be deprecated and removed.

tonistiigi commented 6 years ago

@dnephin Not removed, but replaced with FROM https://foo.tgz, COPY --from=https://foo.tgz, FROM git://github.com/foo/bar.git#master