genuinetools / img

Standalone, daemon-less, unprivileged Dockerfile and OCI compatible container image builder.
https://blog.jessfraz.com/post/building-container-images-securely-on-kubernetes/
MIT License
3.89k stars, 230 forks

Concurrency issues? #218

Closed ChrisPenner closed 5 years ago

ChrisPenner commented 5 years ago

Hello, I'm hoping to use img in our CI system to replace docker build and speed up CI times, but I ran into an interesting inconsistency between img and docker build.

Expected Behaviour

docker build and img should create the same output image when given the same args and inputs.

In this case I expect all artifacts and files from previous FROM commands to not exist in the output image unless explicitly copied.

Actual Behaviour

In certain cases (it seems to be a race condition of some kind), files from previous build stages end up in the final image built by img. In other cases the build fails entirely with a file-not-found error referencing a file being COPY'd from a previous build stage.

Context

I'm running into an issue with multi-stage builds where artifacts from previous build stages are showing up in the final built image; e.g.

FROM build-image as build-image

FROM bash

COPY --from=build-image /dist/my-exe /bin/my-exe

In this example, when building with docker I get an image based on bash with a copy of my-exe in /bin/my-exe, and /dist/ does not exist. However, when I build with img I find that the entire /dist/ directory still exists in the built image for some reason.

I've tried several times without success to shrink this to a minimal failing test case, and I'm having trouble determining the cause. I think it may be related to BuildKit's concurrent builds; is there a way to turn that feature off? It seems that either the second stage of the build is being run on top of the first stage's FROM image, or the file system itself is somehow being carried along.

Also, the builds are non-deterministic: approximately 50% of the time I get an error about failing to find the folder I'm copying from. Rebuilding without making any changes sometimes succeeds and sometimes fails.

 => ERROR [stage-1 2/3] COPY --from=intermediate /dist/brig-schema /usr/b  0.0s
------
 > [stage-1 2/3] COPY --from=intermediate /dist/brig-schema /usr/bin/brig-schema:
------
failed to solve: failed to walk /tmp/buildkit-mount669332827/dist: lstat /tmp/buildkit-mount669332827/dist: no such file or directory

This also implies to me that there's some sort of concurrency issue.
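To put a number on the flakiness, one option is to run the same build repeatedly and count how often it fails. The following is only a rough sketch: `count_failures` is a hypothetical helper name, not part of img or BuildKit, and the commented usage line assumes the img-build.sh script mentioned in this issue is in the current directory.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Hypothetical helper: run CMD N times and print how many runs exited non-zero.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard build output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example usage (assumes img-build.sh from the linked branch):
# count_failures 20 ./img-build.sh
```

A failure count that varies between otherwise identical runs would be consistent with a race, though it would not by itself say where the race is.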

Reproducing

I've had trouble reproducing the issue in a simple test case, but you're welcome to try building our project (it's open-source).

I've created a branch here; you can clone it down and run docker-build.sh and img-build.sh respectively. img-build.sh runs a copy of the concourse/builder image which simply uses img to build the image and then save it; when running the example it mounts the repo's image folder and dumps an image there which you can docker load.

You should notice that img-build.sh sometimes fails, and when it succeeds the resulting image contains a /dist folder which does NOT exist in the image built by docker. You can run docker run --entrypoint=sh -it <image-name> to poke around.
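For a non-interactive check, the image's flattened file listing can be scanned for the leaked directory. This is a minimal sketch under assumptions: `contains_path` is a hypothetical helper that reads one path per line on stdin, and the commented usage assumes the image has already been loaded with docker load and that `<image-name>` is replaced with its name.

```shell
#!/bin/sh
# contains_path PREFIX
# Hypothetical helper: read a file listing on stdin (one path per line) and
# succeed if any entry is PREFIX itself or lies underneath it.
contains_path() {
    prefix=${1#/}        # drop a leading slash, if any
    prefix=${prefix%/}   # drop a trailing slash, if any
    # Normalise entries by stripping an optional leading "./" or "/",
    # then look for the prefix as a whole path component.
    sed -e 's|^\./||' -e 's|^/||' | grep -q -e "^$prefix\$" -e "^$prefix/"
}

# Example usage (requires docker; left commented out on purpose):
# cid=$(docker create <image-name>)
# docker export "$cid" | tar -tf - | contains_path /dist && echo "/dist leaked"
# docker rm "$cid"
```

Matching on a whole path component keeps unrelated entries that merely begin with the same letters from being reported.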

Sorry the reproducible case is so complicated, I suspect the race condition only occurs when the situation is sufficiently complex.

Let me know if there's more context I can provide; or anything I can do to clarify. I understand this isn't such an easy thing to investigate; but let me know if you have any ideas or if there's some way to run img without concurrency to determine if that's part of the issue.

AkihiroSuda commented 5 years ago

cc @tonistiigi

tonistiigi commented 5 years ago

Can you open this in buildkit, please? I tested with Docker with BuildKit enabled and didn't hit it, but the shell script that runs img did. I wonder if this is schema1 related. Have you ever hit it without these specific quay base images?

ChrisPenner commented 5 years ago

Oh great! I'm glad this is something you were already aware of. Thanks for the quick response! Is there an easy way to get a docker image of a new build of img which uses the new buildkit version?

Thanks for your time @tonistiigi

AkihiroSuda commented 5 years ago

I'll open a PR for revendoring BuildKit after the release of BuildKit v0.4 (soon).

ChrisPenner commented 5 years ago

Thanks!