nlewo / nix2container

An archive-less dockerTools.buildImage implementation
Apache License 2.0

Pull image with only the API manifest hashes #76

Closed mikepurvis closed 1 year ago

mikepurvis commented 1 year ago

It's a bummer right now that pullImage requires giving the fixed output derivation hash, since anything that updates an image specified in Nix source (whether manually or with automation) has to pull the entire image in order to compute that hash.

Would it be possible to implement a pull in Nix that only requires the content hashes already in the image manifest supplied by a registry? A sketch of this would be:

Not sure if the second would have to be implemented from scratch or could just use the skopeo JSON manifests that nix2container already uses for building. If the layers had to be unpacked anyway, it might be possible to hardlink the files from the final derivation to avoid paying the disk cost twice. A bunch of cp -Rl actions is pretty cheap relative to assembling another JSON and calling a separate tool.


Just to put a bit more flesh on these bones: here's how to look up the latest tag for a particular image, whose digest at time of writing is sha256:251a921be086aa489705e31fa5bd59f2dadfa0824aa7f362728dfe264eb6a3d2:

https://hub.docker.com/v2/repositories/nixos/nix/tags/latest

The manifest and blob APIs require a bearer token:

export repo=nixos/nix
export token=$(curl -sSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:${repo}:pull" | jq --raw-output .token)

# Gets us the list of all the layers in the image specified by the digest
curl -H "Authorization: Bearer ${token}" "https://registry.hub.docker.com/v2/${repo}/manifests/sha256:251a921be086aa489705e31fa5bd59f2dadfa0824aa7f362728dfe264eb6a3d2"

# Pull a specific layer's archive
curl -H "Authorization: Bearer ${token}" "https://registry.hub.docker.com/v2/${repo}/blobs/sha256:7eec37a2649230139dc4534fd8b5cec45986588eee6bb30625c5b2bcf87d368c" -L --output layer.bin
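To tie the two calls above together, the layer digests can be read out of a saved manifest with jq. This is only a minimal sketch: it assumes the manifest response was written to manifest.json and has a top-level layers array (as Docker schema 2 and OCI image manifests do), and layer_digests is a hypothetical helper name, not something nix2container provides.

```shell
# Hypothetical helper: print one layer digest per line from a saved manifest.
# Assumes a top-level "layers" array, as in schema 2 / OCI image manifests.
layer_digests() {
  jq --raw-output '.layers[].digest' "$1"
}

# Usage sketch (needs network; reuses ${repo} and ${token} from above):
# layer_digests manifest.json | while read -r digest; do
#   curl -sSL -H "Authorization: Bearer ${token}" \
#     "https://registry.hub.docker.com/v2/${repo}/blobs/${digest}" \
#     --output "${digest#sha256:}"
# done
```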

We can validate that the layer matches the given content hash:

$ sha256sum layer.bin
7eec37a2649230139dc4534fd8b5cec45986588eee6bb30625c5b2bcf87d368c  layer.bin
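The same check can be wrapped up so it works for any blob and any digest string taken from the manifest. A small sketch, assuming bash and coreutils sha256sum; verify_blob is a hypothetical name chosen for illustration:

```shell
# Hypothetical helper: succeed only if the file's SHA-256 matches the
# "sha256:<hex>" digest string as it appears in the manifest.
verify_blob() {
  local file=$1 expected=$2
  local actual
  actual="sha256:$(sha256sum "$file" | cut -d' ' -f1)"
  [ "$actual" = "$expected" ]
}

# Usage sketch:
# verify_blob layer.bin sha256:7eec37a2649230139dc4534fd8b5cec45986588eee6bb30625c5b2bcf87d368c \
#   && echo "layer.bin matches"
```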

And also verify that it's just a normal tarball of files:

$ tar -tvzf layer.bin
tar: Removing leading `/' from member names
dr-xr-xr-x root/root         0 1969-12-31 19:00 /nix/
dr-xr-xr-x root/root         0 1969-12-31 19:00 /nix/store/
dr-xr-xr-x root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/
dr-xr-xr-x root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/
-r--r--r-- root/root    134932 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlicommon-static.a
lrwxrwxrwx root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlicommon.so -> libbrotlicommon.so.1
lrwxrwxrwx root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlicommon.so.1 -> libbrotlicommon.so.1.0.9
-r-xr-xr-x root/root    142960 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlicommon.so.1.0.9
-r--r--r-- root/root     57752 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlidec-static.a
lrwxrwxrwx root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlidec.so -> libbrotlidec.so.1
lrwxrwxrwx root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlidec.so.1 -> libbrotlidec.so.1.0.9
-r-xr-xr-x root/root     55152 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlidec.so.1.0.9
-r--r--r-- root/root    741648 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlienc-static.a
lrwxrwxrwx root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlienc.so -> libbrotlienc.so.1
lrwxrwxrwx root/root         0 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlienc.so.1 -> libbrotlienc.so.1.0.9
-r-xr-xr-x root/root    664120 1969-12-31 19:00 /nix/store/9iy1ng7h1l6jdmjk157jra8n4hkrfdj1-brotli-1.0.9-lib/lib/libbrotlienc.so.1.0.9

So it definitely looks doable to download and unpack the layers into the Nix store using only manifest information. My main questions then would be:

If it's a new implementation, I'm assuming it would make most sense to just have Nix directly parse the manifest JSON, similar to how poetry2nix works. Then the manifest can be stored in the repo and it's trivial for users to update it with:

skopeo inspect docker://docker.io/nixos/nix@sha256:251a921be086aa489705e31fa5bd59f2dadfa0824aa7f362728dfe264eb6a3d2 --raw > manifest.json

Looks like the output of nix2container.pullImage is really just a modified manifest pointing to the layer archives anyway, e.g.:

{
        "version": 1,
        "image-config": {},
        "layers": [
                {
                        "digest": "sha256:9123ac7c32f74759e6283f04dbf571f18246abe5bb2c779efcb32cd50f3ff13c",
                        "size": 0,
                        "diff_ids": "sha256:39db6acceed35328fae0746f9125ee85ea6e3600ed2c35b81fff757783b30209",
                        "mediatype": "application/vnd.oci.image.layer.v1.tar+gzip",
                        "layer-path": "/nix/store/bq9m85r2ylfx1fqmqgjxw9gb9c35nxlf-docker-image-alpine/9123ac7c32f74759e6283f04dbf571f18246abe5bb2c779efcb32cd50f3ff13c"
                }
        ],
        "arch": ""
}

So there's really no need to bother actually unpacking anything: just pull the archives and make up a similar manifest file pointing at them in their individual store paths.
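As a rough illustration of "making up" such a file, here is a jq sketch that mirrors the fields in the example above. Several things here are assumptions rather than confirmed nix2container behaviour: the helper name make_image_json, the idea of passing a single directory holding the pulled blobs, and the exact set of fields the tool would accept. One caveat: the registry manifest only carries the compressed layer digests, so the diff_ids have to come from the image config blob (rootfs.diff_ids), which would need to be fetched as well.

```shell
# Hypothetical helper: emit a nix2container-style JSON from a registry
# manifest, the image config blob, and a directory holding the layer blobs.
# Field names are copied from the example above; everything else is assumed.
make_image_json() {
  local manifest=$1 config=$2 layer_dir=$3
  jq -n \
    --slurpfile m "$manifest" \
    --slurpfile c "$config" \
    --arg dir "$layer_dir" '
    {
      version: 1,
      "image-config": {},
      layers: [
        range(0; $m[0].layers | length) as $i
        | $m[0].layers[$i] as $l
        | {
            digest: $l.digest,
            size: $l.size,
            # diff_ids come from the config blob, in the same order as layers
            diff_ids: $c[0].rootfs.diff_ids[$i],
            mediatype: $l.mediaType,
            "layer-path": ($dir + "/" + ($l.digest | sub("^sha256:"; "")))
          }
      ],
      arch: ($c[0].architecture // "")
    }'
}

# Usage sketch:
# make_image_json manifest.json config.json /path/to/layers > image.json
```

In an actual Nix implementation each layer would presumably be its own fixed-output derivation, so layer-path would point at individual store paths rather than one shared directory.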

mikepurvis commented 1 year ago

Closing this in favour of the concrete proposal in the PR.