containers / build

another build tool for container images (archived, see https://github.com/rkt/rkt/issues/4024)
Apache License 2.0
342 stars 80 forks source link

Store DiffIDs correctly in OCI config #307

Closed mwuertinger closed 7 years ago

mwuertinger commented 7 years ago

Disable gzip compression for OCI image layers in order to get Docker to run it properly. If gzip compression is turned on, Docker complains about a layer checksum mismatch because it calculates the checksum after decompressing it.

Here's how to reproduce it:

  1. Build an OCI image (containing a trivial hello world application) and push it to Docker Hub using skopeo:
    $ ./acbuild begin --build-mode=oci
    $ ./acbuild copy hello /hello
    $ ./acbuild set-exec /hello
    $ ./acbuild write myimage.oci
    $ mkdir myimage
    $ tar xf myimage.oci -C myimage
    $ ./skopeo copy oci:myimage docker://mwuertinger/oci-test
  2. Trying to pull the image with Docker yields an error:
    
    $ docker pull mwuertinger/oci-test
    Using default tag: latest
    latest: Pulling from mwuertinger/oci-test

d55c176aaf3d: Pull complete layers from manifest don't match image configuration



I traced this error back to https://github.com/moby/moby/blob/master/distribution/pull_v2.go#L638 where the calculated checksum of the downloaded layer is compared to the layer specified in the image manifest. As it turns out, Docker decompresses the layer before calculating the expected checksum which yields an unexpected result and the check fails.

This can be solved by skipping layer compression. However, I do not know whether this is a bug in Docker or in acbuild. Therefore I do not expect this pull request to get merged but see it more as a proof of concept of how it could work.
cgonyeo commented 7 years ago

I'd be fine merging this if the OCI spec doesn't say that layers may be compressed. If this is however a bug in docker, it should be fixed there.

vbatts commented 7 years ago

This could also be handled in skopeo. Ping @runcom

On Thu, May 4, 2017, 16:55 Derek Gonyeo notifications@github.com wrote:

I'd be fine merging this if the OCI spec doesn't say that layers may be compressed. If this is however a bug in docker, it should be fixed there.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/containers/build/pull/307#issuecomment-299306536, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEF6RkxFodwuBqtF4HiQlWnAAwVrgENks5r2jtWgaJpZM4NRK8v .

mwuertinger commented 7 years ago

Handling this in skopeo might be complex because the layer checksums change which requires rewriting the image config and manifest and then rehashing them as well.

mwuertinger commented 7 years ago

This part of the OCI specification clearly says that gzipped layers are supported: https://github.com/opencontainers/image-spec/blob/master/layer.md

Therefore I think that this must be a bug or simply a lack of support on Docker's side.

For the time being I suggest to modify my PR to add a flag to acbuild begin allowing to disable layer compression, eg.: acbuild begin --build-mode=oci --layer-compression=false.

runcom commented 7 years ago

This is by Docker design, we could, however, support this in some way since we should have ways to do this. @mtrmac wdyt?

lucab commented 7 years ago

Just my 2¢, I would personally:

mwuertinger commented 7 years ago

I pushed a new commit which adds the flag as suggested by @lucab

mtrmac commented 7 years ago

This part of the OCI specification clearly says that gzipped layers are supported: https://github.com/opencontainers/image-spec/blob/master/layer.md

That part says that layer files may be gzipped, and the manifest may contain digests of gzipped layers.

It is fine, and I guess, by default, desirable, to gzip the layers.

The mismatch reported by moby, however, refers to layer references in the config.json, per https://github.com/opencontainers/image-spec/blob/master/config.md , which is explicitly defined to use the uncompressed form.

So, whatever is creating an “OCI” image where the config.json rootfs.diff_ids array contains digests of the compressed versions is doing that incorrectly.

AFAICT this is seems to be a fairly straightforward bug in containers/build, something like this:

(skopeo could be taught to correct the diff_ids values, (we have already written that ugly code for converting from schema1), but IMHO, if that should ever happen, that should be a special operation / option, not a default. There isn’t an obvious indication that the input image is OCI-noncompliant, and uncompressing the layers to compute the digests is fairly expensive, too wasteful to do that by default. Besides, skopeo is not the only possible consumer of OCI images, and other tools would likewise have to add similar workarounds; it seems to me much simpler not to generate the invalid images than to have a fixup in skopeo.)

mwuertinger commented 7 years ago

Thanks for the hint @mtrmac! I changed the code accordingly and renamed the fields to match the OCI spec. It works for me now. Please let me know what you think.

lucab commented 7 years ago

Thanks everybody for investigating this, and sorry for the initial misdirections. Current changes in this PR makes sense given the last hints. @mwuertinger mind squashing this in a single commit?

mwuertinger commented 7 years ago

@lucab thanks for the update. I squashed it into a single commit.

cgonyeo commented 7 years ago

This LGTM, I'll merge if @lucab agrees

mwuertinger commented 7 years ago

I just found a problem: the acbuild run command now tries to use the DiffID to locate the layer blob which doesn't work. I will look at it again tomorrow and push a fix.

mwuertinger commented 7 years ago

I fixed the acbuild run issue and I wrote an integration test which ensures that acbuild stays compatible with Docker: https://github.com/mwuertinger/acbuild-oci-integration-test I'd like to have a test like that in this repo but I'm unsure what your strategy is on that. Please let me know what you think.

mtrmac commented 7 years ago

I changed the code accordingly and renamed the fields to match the OCI spec. It works for me now. Please let me know what you think.

Thanks! The changes do look correct to me, though I am not familiar in this repo and I can't speak to impact on the rest of the code (like the acbuild run thing you have run into.)

lucab commented 7 years ago

@mwuertinger I think your integration test relies on too many moving parts to be directly fitting here (I know, that's the point of integration testing :). I guess we could instead have some functional tests to ensure that oci-mode + run/copy + write produce a valid image. Followups are welcome, but this PR is already ok as a bugfix itself.

OCI members may/will have dedicate integration testing at some point, and acbuild can just be plugged in there then.