moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

Proposal: hooks for `RUN` instructions (use cases: reproducible builds, cross-compilation, malware detection, ...) #4576

Closed: AkihiroSuda closed this issue 2 months ago

AkihiroSuda commented 10 months ago

I'd like to propose a hooking mechanism for RUN instructions of Dockerfile.

e.g.,

buildctl build \
  --frontend dockerfile.v0 \
  --opt hook="$(cat hook.json)"

with hook.json as follows:

{
  "RUN": {
    "entrypoint": ["/dev/.dfhook/entrypoint"],
    "mounts": [
       {"from": "example.com/hook", "target": "/dev/.dfhook"},
       {"type": "secret", "source": "something", "target": "/etc/something"}
    ]
  }
}

This will let the frontend treat RUN foo as:

RUN \
  --mount=from=example.com/hook,target=/dev/.dfhook \
  --mount=type=secret,source=something,target=/etc/something \
  /dev/.dfhook/entrypoint foo

docker history will still show this as RUN foo.

Note: The proposed JSON schema may change. See the PR for the latest status.

Use cases

Reproducible builds

A hook can wrap the apt-get command so that it uses snapshot.debian.org, reproducing package versions without modifying the Dockerfile.

The /dev/.dfhook/entrypoint script could look like this:

#!/bin/bash
set -eu -o pipefail

: "${SOURCE_DATE_EPOCH:=$(stat --format=%Y /etc/apt/sources.list.d/debian.sources)}"
snapshot="$(printf "%(%Y%m%dT%H%M%SZ)T\n" "${SOURCE_DATE_EPOCH}")"
. /etc/os-release

# Rewrite /etc/apt to use snapshot.debian.org
cp -a /etc/apt /etc/apt.bak
rm -f /etc/apt/sources.list.d/debian.sources
cat <<EOF >>/etc/apt/sources.list
deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/${snapshot} ${VERSION_CODENAME} main
deb [check-valid-until=no] http://snapshot.debian.org/archive/debian-security/${snapshot} ${VERSION_CODENAME}-security main
deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/${snapshot} ${VERSION_CODENAME}-updates main
EOF

# Run the command
set +e
"$@"
status=$?
set -e

# Restore /etc/apt
rm -rf /etc/apt
mv /etc/apt.bak /etc/apt

exit $status

A hook may also push/pull dpkg blobs to an OCI registry (or whatever) for efficient caching.

Cross-compilation

xx-apt, etc. (https://github.com/tonistiigi/xx) can be reimplemented as a hook.

Malware detection

A hook may use seccomp or similar mechanisms to intercept syscalls and detect malicious actions.

Enterprise networking

Enterprise networks often require installing a MITM proxy cert. This can be easily automated with a hook.
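A minimal sketch of what such a hook entrypoint might look like, in the same style as the apt example above. The certificate path `/dev/.dfhook/ca.crt` and the helper name `run_with_ca` are assumptions made for illustration; they are not part of the proposal. The bundle path is the one Debian-based images normally use. The extra CA is trusted only while the wrapped command runs, and the original bundle is restored afterwards.

```shell
#!/bin/sh
# Hypothetical hook entrypoint: trust an extra CA only while the wrapped command runs.
# Both default paths are assumptions for illustration.
CA_CERT="${CA_CERT:-/dev/.dfhook/ca.crt}"
CA_BUNDLE="${CA_BUNDLE:-/etc/ssl/certs/ca-certificates.crt}"

run_with_ca() {
  # Without an existing bundle there is nothing to extend; run the command as-is.
  if [ ! -f "$CA_BUNDLE" ]; then
    "$@"
    return $?
  fi

  # Back up the bundle, then append the extra CA certificate.
  cp -p "$CA_BUNDLE" "$CA_BUNDLE.dfhook-bak"
  cat "$CA_CERT" >>"$CA_BUNDLE"

  # Run the wrapped command and remember its exit status.
  status=0
  "$@" || status=$?

  # Restore the original bundle so the resulting layer does not keep the extra CA.
  mv "$CA_BUNDLE.dfhook-bak" "$CA_BUNDLE"
  return "$status"
}

# When installed as the hook entrypoint, wrap the RUN command passed as arguments.
if [ "$#" -gt 0 ]; then
  run_with_ca "$@"
  exit $?
fi
```

One caveat of this sketch: if the wrapped command itself rewrites the bundle (for example by installing ca-certificates), the restore step discards that change, so a real hook would need to handle that case.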

FAQs

AkihiroSuda commented 9 months ago

@tonistiigi @thaJeztah SGTY?

thaJeztah commented 9 months ago

Would these hooks run inside the container that's run as part of the RUN, or is this all running on the host? If it's inside the container, would this be similar to a custom SHELL (to set a custom entry point for RUN steps)?

AkihiroSuda commented 9 months ago

PR:

AkihiroSuda commented 7 months ago

Would these hooks run inside the container that's run as part of the RUN, or is this all running on the host?

Inside.

If it's inside the container, would this be similar to a custom SHELL (to set a custom entry point for RUN steps)?

Yes, somewhat similar.

(I noticed I accidentally edited @thaJeztah's original comment :sweat_smile: . Now reverted. )

tonistiigi commented 5 months ago

My initial impression is that this is not where we want to take Dockerfiles. Dockerfiles should be self-contained and guaranteed to work. This makes the outcome of a build undefined and likely broken if the author repository and hook come from a different codebase. If the hook is useful, then why doesn't the Dockerfile author integrate it into their build process, or make it so that it can be turned on by some config variable and tested?

Even if used for some hand-written script, it will make inefficient builds because the "hooks" leak to everywhere. The foundation of a good Dockerfile is to define minimal dependencies for every point of your build. That gives speed, efficient caching and small images. From that point, if the problem is that you wish to build a Dockerfile from some external source that you can't modify but add some modifications, I would even consider some option to load a patch file over it before it is processed. At least in that case the patch can choose exactly what modifications are needed. Atm it is limited to what can be modified but once you add a modification it applies everywhere.

I'd like to understand more what is the actual problem with reproducible builds and how is this solving it. We want to make it easy for image authors to make their builds reproducible, but I'm not sure how this is helping (unless we ask image author to now call make instead of docker build that would attach some hook to it, but that doesn't improve docker build and increases fragmentation and untested branches).

If the fundamental problem is to improve code reuse and reduce need to do copy-paste to write good Dockerfiles then I'm very interested in solving that problem.

xx-apt, etc. (https://github.com/tonistiigi/xx) can be reimplemented as a hook.

I think it is the opposite. xx shows the limitation of this approach. You can't just mask over apt and have magic cross-compilation. User needs to choose what packages make sense for what architecture, and for what stage. Main stage needs to be $BUILDPLATFORM, other args need to be exposed, dependencies need to be split up, compilation result needs to be verified etc. Additionally user is left with a broken behavior as some articles will write that they can "just add a hook", but the repo author has never tested it and it will almost never work. If it works, it likely produces something inefficient or broken in runtime. I do agree though that including xx in your project may be considered a bit hacky atm. and maybe ideally you should be able to include xx directly from source (eg. like it is possible for the helper targets in that repo via bake) or maybe patch it locally. I'd be interested in finding better solutions for these cases if you agree.

Because it affects the history object in OCI Image Config and decreases reproducibility

I don't get the reproducibility part, but history should give a representation of how the image was built. If it does not, then it is incorrect. Latest versions also create provenance attestations, which supersede history; provenance is always precise and can't be modified by the frontend.

AkihiroSuda commented 5 months ago

If the hook is useful, then why doesn't the Dockerfile author integrate it into their build process, or make it so that it can be turned on by some config variable and tested?

Because the upstream Dockerfile authors do not want to accept PRs that complicate /etc/apt/sources.list in their Dockerfiles:

The purpose of the hook is to leave those upstream Dockerfiles unmodified and to eliminate the need to negotiate the merging of such PRs. The upstream Dockerfile authors are not expected to use the hooks; only downstream reproducers have to enable the hooks in order to reproduce dpkg versions.

From that point, if the problem is that you wish to build a Dockerfile from some external source that you can't modify but add some modifications, I would even consider some option to load a patch file over it before it is processed. At least in that case the patch can choose exactly what modifications are needed. Atm it is limited to what can be modified but once you add a modification it applies everywhere.

I'd like to understand more what is the actual problem with reproducible builds and how is this solving it. We want to make it easy for image authors to make their builds reproducible, but I'm not sure how this is helping (unless we ask image author to now call make instead of docker build that would attach some hook to it, but that doesn't improve docker build and increases fragmentation and untested branches).

The upstream image authors do not need to use hooks. Only downstream reproducers need to use hooks.

No action is needed on the upstream image authors side, except for accepting small PRs such as ARG SOURCE_DATE_EPOCH:

I don't get the reproducibility part, but history should give a representation of how the image was built. If it does not, then it is incorrect. Latest versions also create provenance attestations, which supersede history; provenance is always precise and can't be modified by the frontend.

The problem is that modifying the history object breaks the OCI image config digest, and hence the manifest digest. diffoci can ignore these "boring" differences, but ideally it should be possible to just reproduce the entire manifest digest

AkihiroSuda commented 5 months ago

If my hook proposal isn't going to be accepted, I guess I'll try to reimplement this as a custom OCI runtime and maybe a snapshotter plugin, but that would ruin the user experience for buildx, as it would require specifying a custom moby/buildkit image.

tonistiigi commented 5 months ago

What do you mean by "you can't modify but add some modifications" ?

Sorry, confusing wording. I meant you "can't modify the repo where the Dockerfile is located but still want to make some changes to the Dockerfile".

What is "patch"?

Just regular patch(1) diff files.

The upstream image authors do not need to use hooks. Only downstream reproducers need to use hooks.

I don't get the difference between a downstream fork and a downstream hook. Who is the downstream here? E.g. for your official-images patches, is it some reproducible-builds-org publishing their own versions of official images, or regular users? If it is an org, then why not just fork, for a much more precise version without the limitations of hooks? It also seems like we are making a feature for a specific repo case. If it is regular users, then this does not look like something that can be recommended, as it is hard to use, untested, breaks easily, and produces inefficient builds.

The problem is that modifying the history object breaks the OCI image config digest, and hence the manifest digest. diffoci can ignore these "boring" differences, but ideally it should be possible to just reproduce the entire manifest digest

If one build produces reproducible timestamps and another does not, then they will be different, and it is correct that they are different if they were built differently. If some build-time artifacts should not end up in the final image, then multi-stage build patterns should be used for that, not mounting a secret.

If my hook proposal isn't going to be accepted, I guess I'll try to reimplement this as a custom OCI runtime and maybe a snapshotter plugin, but that would ruin the user experience for buildx, as it would require specifying a custom moby/buildkit image.

I don't get the "custom OCI runtime and maybe a snapshotter plugin" part. If you want to create own variants of official images then options are fork, apply patches or create external buildkit frontend.

AkihiroSuda commented 5 months ago

regular patch(1) diff files.

It is quite hard to maintain patch(1) diff files in a robust way.

I don't get the difference between downstream fork and downstream hook. Who is the downstream here, eg. for your official-images patches, is it some reproducible-builds-org publishing their own versions of official images or regular users.

I just expect the Docker Official Image upstream (such as docker.io/library/httpd) to be reproducible by any regular downstream users. There does not need to be any reproducible-builds-org publishing their own downstream variants.

If it is regular users, then this does not look like something that can be recommended, as it is hard to use, untested, breaks easily, and produces inefficient builds.

The buildx CLI may have a new flag like --repro apt-snapshot=true,apt-snapshot-source=snapshot.debian.org to inject a well-maintained hook without a mess.

It is true that snapshot.debian.org is quite inefficient due to lack of server resources, but snapshot.ubuntu.com is quite fast. There are also third-party fee-charging snapshot providers such as https://stablebuild.com.

If one build produces reproducible timestamps and another does not then they will be different and it is correct that they are different if they were built differently.

I expect the Docker Official Image infra to eventually adopt rewrite-timestamp=true so as to keep OCI manifest digests reproducible

If some build-time artifacts should not end up in the final image, then multi-stage build patterns should be used for that, not mounting a secret.

I don't expect that mounting secrets is necessary for repro builds. It might be still useful for pushing/pulling package caches to a remote server (OCI registry or whatever), though.

I don't get the "custom OCI runtime and maybe a snapshotter plugin" part.

buildkitd --oci-worker-binary=RUNC could be pointed at a runc-compatible runtime that wraps runc exec in a hook script. This design probably does not need a new snapshotter plugin, but I mentioned snapshotter plugins as I might be overlooking something.

If you want to create own variants of official images then options are fork, apply patches or create external buildkit frontend.

No, I don't want to create my own variants of DOI. I'm trying to explore how we can reach consensus on making the plain vanilla upstream DOI reproducible with the least effort. 🙂

tonistiigi commented 5 months ago

There does not need to be any reproducible-builds-org publishing their own downstream variants.

I agree. But that would happen with this PR as hooks would be used to work around the upstream and not called by official images pipelines.

The buildx CLI may have a new flag like --repro apt-snapshot=true,apt-snapshot-source=snapshot.debian.org to inject a well-maintained hook without a mess.

That looks like far too opinionated logic for a flag. Builds should be configured by build-args/contexts. A generic repro strategy that could work is to use the provenance attestation of a previously built image and provide reproduction guarantees from that data.

No, I don't want to create my own variants of DOI. I'm trying to explore how we can reach consensus on making the plain vanilla upstream DOI reproducible with the least effort. 🙂

Looking at some of the feedback on your PRs, one of the issues seems to be that maintainers suggest that, for many DOI images, the reproduction timestamp that makes sense for the image is one defined by the source artifact (e.g. the git commit of the upstream source, or a file timestamp in the tar.gz, for example https://github.com/docker-library/golang/pull/505/files#diff-12a996ea1ea6ff196d20e1af5aaa3cc1deed6c9f547979cb19dba4bf7325a15cR76 ) rather than one provided manually with an epoch ARG. Atm one of the issues with that approach is that you can only use such a timestamp for a single RUN and cannot set it with ARG for the rest of the Dockerfile. For builds with a Git context, https://github.com/moby/buildkit/issues/3565 could fix it, but that probably does not cover all DOI cases (atm). I'd also like these source artifacts to be tracked directly by BuildKit provenance attestation (with checksum verification, snapshotting, etc.).

Just to throw out some ideas, one hacky approach for that could be:

FROM scratch AS src
ADD https://ftp.gnu.org/gnu/bash/bash-$_BASH_BASELINE.tar.gz /

FROM debian
EPOCH --from=src /
RUN env

Can we think of a better approach that covers this? Another approach could be to have some known stage name that can be used to collect metadata about the build (could be used for dynamic labels/env etc. in addition to setting epoch).

AkihiroSuda commented 5 months ago

There does not need to be any reproducible-builds-org publishing their own downstream variants.

I agree. But that would happen with this PR as hooks would be used to work around the upstream and not called by official images pipelines.

Right. The hook does not need to be called on the upstream DOI side.

To sum up, what has to be done on the upstream DOI side is:

The buildx CLI may have a new flag like --repro apt-snapshot=true,apt-snapshot-source=snapshot.debian.org to inject a well-maintained hook without a mess.

That looks like far too opinionated logic for a flag. Builds should be configured by build-args/contexts. A generic repro strategy that could work is to use the provenance attestation of a previously built image and provide reproduction guarantees from that data.

The snapshot server data cannot be retrieved from the provenance, as the upstream DOI will continue to use the upstream non-snapshot debian.org due to the performance issue and the flakiness of snapshot.debian.org.

So, downstream reproducers will have to explicitly opt in to snapshot.debian.org or a paid equivalent such as stablebuild.com for repro builds.

No, I don't want to create my own variants of DOI. I'm trying to explore how we can reach consensus on making the plain vanilla upstream DOI reproducible with the least effort. 🙂

Looking at some of the feedback on your PRs, one of the issues seems to be that maintainers suggest that, for many DOI images, the reproduction timestamp that makes sense for the image is one defined by the source artifact (e.g. the git commit of the upstream source, or a file timestamp in the tar.gz, for example https://github.com/docker-library/golang/pull/505/files#diff-12a996ea1ea6ff196d20e1af5aaa3cc1deed6c9f547979cb19dba4bf7325a15cR76 ) rather than one provided manually with an epoch ARG.

Will try to update the PRs next week to add fine-grained SOURCE_DATE_EPOCH, but most Dockerfiles still need to have the "global" ARG SOURCE_DATE_EPOCH for instructions (e.g., RUN useradd) that are not associated with any source material.

EPOCH --from=src /

I assume we are in general reluctant to add new instructions

tonistiigi commented 5 months ago

but most Dockerfiles still need to have the "global" ARG SOURCE_DATE_EPOCH for instructions (e.g., RUN useradd) that are not associated with any source material.

I think the intention would be that SOURCE_DATE_EPOCH for other commands is also defined by the upstream artifact repo. I think almost all DOI images are effectively repackaging of some upstream and it probably makes sense for the release of that upstream to define epoch.

I assume we are in general reluctant to add new instructions

If there is a good use case then it is not out of the question. We of course need to make sure that the solution is flexible and future proof before we commit to backwards compatibility.

AkihiroSuda commented 5 months ago

but most Dockerfiles still need to have the "global" ARG SOURCE_DATE_EPOCH for instructions (e.g., RUN useradd) that are not associated with any source material.

I think the intention would be that SOURCE_DATE_EPOCH for other commands is also defined by the upstream artifact repo. I think almost all DOI images are effectively repackaging of some upstream and it probably makes sense for the release of that upstream to define epoch.

DOI has already been setting --build-arg SOURCE_DATE_EPOCH to the git timestamp of the Dockerfile repo, but the arg has not been consumed due to the lack of ARG SOURCE_DATE_EPOCH in Dockerfiles.

https://github.com/docker-library/meta-scripts/blob/19816aabb173f61ed763013f409948200a3ad55c/sources.sh#L60

I assume we are in general reluctant to add new instructions

If there is a good use case then it is not out of the question. We of course need to make sure that the solution is flexible and future proof before we commit to backwards compatibility.

Right, but I guess this new instruction can be proposed separately.

Is there any other remaining blocker toward merging the hook PR?

tianon commented 5 months ago

Defining an ARG at all screws up the image metadata (specifically, docker history becomes even more inscrutable/impossible to parse than normal), so we strongly avoid ARG within DOI. Defining an ARG that passes through some magic number just for reproducible timestamps on the filesystem/embedded in other binaries, especially when an alternative source of that information exists which is more meaningful, feels backwards.

We do not explicitly set --build-arg SOURCE_DATE_EPOCH but rather set it as an appropriate environment variable for docker buildx build itself (SOURCE_DATE_EPOCH=xxx docker buildx build), which functionally may not make much difference, but the semantic difference is that we are providing an appropriate pin for the timestamps which docker buildx build itself is responsible for creating, not necessarily intending to pass that forward into the image build environment (which again, feels like breaking the abstraction that docker builds are / are intended to be).

Regarding enabling rewrite-timestamp=true, are there any side effects? In other words, why is the behavior opt-in instead of opt-out or even just enabled by default and/or automatically enabled when an appropriate SOURCE_DATE_EPOCH is set? What are the downsides, and how do we communicate them to our users when they ask us about the metadata of the images we publish? (Which is a thing that's already surprised quite a few people in our images since we've started setting SOURCE_DATE_EPOCH and the timestamps on the metadata of layers of an image were no longer necessarily always linear, which is technically correct, but also surprising behavior, especially after ~10 years of that not being the way this works.)

AkihiroSuda commented 5 months ago

@tianon Thank you for replying.

Defining an ARG at all screws up the image metadata (specifically, docker history becomes even more inscrutable/impossible to parse than normal), so we strongly avoid ARG within DOI. Defining an ARG that passes through some magic number just for reproducible timestamps on the filesystem/embedded in other binaries, especially when an alternative source of that information exists which is more meaningful, feels backwards.

Updated the PRs to take SOURCE_DATE_EPOCH from the source material (as in golang):

SOURCE_DATE_EPOCH="$(find /usr/src/bash -type f -exec stat -c '%Y' {} + | sort -nr | head -n1)"
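As a sketch of how the one-liner above could be reused across several source trees, it can be wrapped in a small function. The name `newest_mtime` is hypothetical, and the sketch assumes GNU stat (`-c '%Y'`), as the original command does.

```shell
#!/bin/sh
# newest_mtime DIR: print the newest file modification time under DIR,
# in seconds since the epoch. Assumes GNU stat, like the command above.
newest_mtime() {
  find "$1" -type f -exec stat -c '%Y' {} + | sort -nr | head -n1
}

# Hypothetical use inside a RUN step (/usr/src/bash is the path from the example above):
#   SOURCE_DATE_EPOCH="$(newest_mtime /usr/src/bash)"
#   export SOURCE_DATE_EPOCH
```

Exporting the value makes it visible to the remaining commands of the same RUN step only; it does not carry over to later instructions.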

~The Dockerfiles still have the global ARG SOURCE_DATE_EPOCH for commands that are not associated with the source material.~ e.g.,

~Let me know whether ARG SOURCE_DATE_EPOCH is acceptable for this condition, or you would rather prefer SOURCE_DATE_EPOCH=0 (1970-01-01) for these commands.~ (EDIT: updated the PRs to use SOURCE_DATE_EPOCH=0)

~The UX issue of the docker history CLI can be probably solved by adding a new flag like docker history --hide-args.~

Regarding enabling rewrite-timestamp=true, are there any side effects?

The current implementation of rewrite-timestamp=true is incompatible with unpack=true, but this limitation is expected to be fixed in future: https://github.com/moby/buildkit/blob/715276d7423f006dab1c96bd6910d4ca55faa230/exporter/containerimage/export.go#L290

In other words, why is the behavior opt-in instead of opt-out or even just enabled by default and/or automatically enabled when an appropriate SOURCE_DATE_EPOCH is set?

I think we can enable it by default when we are confident that the implementation is stable enough.

What are the downsides, and how do we communicate them to our users when they ask us about the metadata of the images we publish?

The SOURCE_DATE_EPOCH metadata is already present in the provenance https://explore.ggcr.dev/?blob=hello-world@sha256:8b338b9c4c5a42cc78090f082d81ed4010c013d8ec03311051a613571f7f988a&mt=application%2Fvnd.in-toto%2Bjson&size=4950

(Which is a thing that's already surprised quite a few people in our images since we've started setting SOURCE_DATE_EPOCH and the timestamps on the metadata of layers of an image were no longer necessarily always linear, which is technically correct, but also surprising behavior, especially after ~10 years of that not being the way this works.)

This issue was already fixed in:

AkihiroSuda commented 5 months ago

Updated the PRs above to completely remove ARG SOURCE_DATE_EPOCH and use ENV SOURCE_DATE_EPOCH 0 instead for commands that are not associated with the source material. PTAL.

AkihiroSuda commented 5 months ago

Aside from reproducible builds, the hooks should also be useful for injecting MITM proxy certs that are needed on enterprise networks where all HTTPS traffic has to be decrypted and monitored.

I'm hoping we can see some progress toward merging the PR:

AkihiroSuda commented 4 months ago

Is there any remaining concern?

AkihiroSuda commented 2 months ago

I'm withdrawing this proposal and am going to implement a standalone translator that consumes a Dockerfile and generates a new Dockerfile:

program-name-to-be-decided translate --hook=hook.json < Dockerfile > Dockerfile.new

The drawback of the new approach is that it can't reproduce the OCI Image Config digest, as it can't retain the docker history object. This drawback might be practically acceptable (although it looks quite ugly), as https://github.com/reproducible-containers/diffoci has an --ignore-history flag to allow comparing OCI Image Configs excluding the docker history object.
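To illustrate the kind of rewriting such a translator would perform, here is a deliberately simplified sketch. It is not the planned tool: the function name `translate` and the two variables are hypothetical, the mount flags and entrypoint are hard-coded from the hook.json example at the top of this issue rather than parsed from JSON, and only single-line, upper-case, shell-form RUN instructions are rewritten.

```shell
#!/bin/sh
# Hypothetical stand-in for the translator. A real tool would parse hook.json
# and the full Dockerfile syntax; the values below mirror the hook.json example.
HOOK_ENTRYPOINT="/dev/.dfhook/entrypoint"
HOOK_MOUNTS="--mount=from=example.com/hook,target=/dev/.dfhook --mount=type=secret,source=something,target=/etc/something"

translate() {
  # Read a Dockerfile from stdin and write the rewritten Dockerfile to stdout.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      # Leave exec-form RUN, RUN with its own flags, heredocs, and RUN
      # instructions whose arguments start on a continuation line untouched.
      "RUN ["*|"RUN --"*|"RUN <<"*|"RUN \\"*)
        printf '%s\n' "$line" ;;
      # Shell-form RUN: prepend the hook mounts and the hook entrypoint.
      "RUN "*)
        printf 'RUN %s %s %s\n' "$HOOK_MOUNTS" "$HOOK_ENTRYPOINT" "${line#RUN }" ;;
      # Every other line is copied through unchanged.
      *)
        printf '%s\n' "$line" ;;
    esac
  done
}

# Usage: translate < Dockerfile > Dockerfile.new
```

Because the entrypoint is simply prepended to the shell-form command, a compound command such as `foo && bar` would have only `foo` wrapped; a real translator would need to wrap the whole command line.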

I'm closing https://github.com/moby/buildkit/pull/4669 but I still want the SOURCE_DATE_EPOCH PRs for DOI (https://github.com/docker-library/official-images/issues/16044#issuecomment-2245981307) to be merged.