google / go-containerregistry

Go library and CLIs for working with container registries

Include manifest in the v1 tarball? Or latest OCI image-spec? #651

Open deitch opened 4 years ago

deitch commented 4 years ago

Per #255 and the referenced PR, support for the v1 tarball was added, which is a step in this direction. However, it does not include multiple blobs, or the original manifest or index, as the spec at https://github.com/opencontainers/image-spec/blob/master/image-layout.md does. This weakens software supply chain verification a little, and makes it hard to see the artifact as a whole.

For now, I have been hacking the v1 tar file by adding the original manifest after pulling, but I would far prefer to follow the spec.

jonjohnsonjr commented 4 years ago

There's a separate package for the OCI image layout: https://godoc.org/github.com/google/go-containerregistry/pkg/v1/layout

The tarball packages are for dealing with the docker save style image tarballs.
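A minimal sketch of the two packages side by side, assuming the pkg/v1 APIs (the image reference, tag, and output paths are placeholders):

package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/layout"
	"github.com/google/go-containerregistry/pkg/v1/remote"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

func main() {
	ref, err := name.ParseReference("docker.io/library/alpine:3.10") // placeholder image
	if err != nil {
		log.Fatal(err)
	}
	img, err := remote.Image(ref)
	if err != nil {
		log.Fatal(err)
	}

	// docker save style tarball: what `docker load` understands.
	tag, err := name.NewTag("alpine:3.10")
	if err != nil {
		log.Fatal(err)
	}
	if err := tarball.WriteToFile("alpine.tar", tag, img); err != nil {
		log.Fatal(err)
	}

	// OCI image layout on disk: oci-layout, index.json, blobs/sha256/...
	p, err := layout.Write("./alpine-oci", empty.Index) // initialize the directory
	if err != nil {
		log.Fatal(err)
	}
	if err := p.AppendImage(img); err != nil { // append the image and update index.json
		log.Fatal(err)
	}
}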

deitch commented 4 years ago

I had used that for the on-disk layout; I was thinking more of a tarball. I guess I always could lay it out to disk and then tar it up, or turn it into a stream.

Still, it would be nice to be able to save an image to an OCI layout in a tarball. Or did I miss that in the godoc?

If it makes sense, I can try and find a little time to write a PR for it, so v1.layout can work against a Path or a tarball.

docker save style image tarballs

Got it. And to be compatible, you need the manifest.json, which is distinct from how the image layout does it, although I think everything else would work if the paths in the manifest.json were correct.

A few questions:

deitch commented 4 years ago

if we took the v1 layout on disk, tarred it up, and added a manifest.json (with the proper paths under blobs/sha256/ of course), would it not also be compatible with docker save?

I can confirm that it would. It would be a bit misleading, though, as it would have all of the images under the index in it while the manifest.json pointed at just one. Still, you could have a universal format by including just the one you wanted.

jonjohnsonjr commented 4 years ago

If it makes sense, I can try and find a little time to write a PR for it, so v1.layout can work against a Path or a tarball.

I'd be interested to see what this looks like. The main reason I chose not to do this was for efficiency. Avoiding serialization to a tarball avoids a lot of (accidental) quadratic complexity in common access patterns.

I guess I always could lay it out to disk and then tar it up, or turn it into a stream.

This is what I had in mind for how it would be used, but I can understand how that's less than ideal.
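The "tar it up" half is plain archive/tar; a minimal sketch (placeholder paths, abbreviated error handling):

package main

import (
	"archive/tar"
	"io"
	"os"
	"path/filepath"
)

// tarLayout streams the contents of an OCI layout directory as a tar
// archive, preserving relative paths (index.json, oci-layout, blobs/...).
func tarLayout(dir string, w io.Writer) error {
	tw := tar.NewWriter(w)
	defer tw.Close()
	return filepath.Walk(dir, func(p string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(dir, p)
		if err != nil || rel == "." {
			return err // skip the root entry itself
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel)
		if info.IsDir() {
			hdr.Name += "/"
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if info.IsDir() {
			return nil
		}
		f, err := os.Open(p)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
}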

doesn’t the docker save format require the repositories file?

As far as I can tell, no. Docker will load images just fine without a repositories file. There's a pretty good explanation of the two different formats here: https://github.com/containers/skopeo/issues/425

if we took the v1 layout on disk, tarred it up, and added a manifest.json (with the proper paths under blobs/sha256/ of course), would it not also be compatible with docker save?

This is an interesting idea! I can't believe I haven't thought of it. I don't think we'd always want to do this, just because I'd like to avoid the clutter. How would you feel about adding a method to layout.Path for writing arbitrary files? We already have WriteBlob, so WriteFile wouldn't be too unreasonable. Converting an index.json to a manifest.json wouldn't be too hard, and possibly something we could expose in a layout or tarball package... but then we'd lose the RepoTags info, so maybe tarball should expose something like:

func GenerateManifest(refToImage map[name.Reference]v1.Image) (Manifest, error)

I haven't thought this through all the way but there are definitely some interesting ideas here.
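To make that concrete, here is a hypothetical sketch of such a helper; the descriptor struct, the lower-case name, and the blobs/sha256/ paths are illustrative assumptions (following the idea above), not existing tarball package API:

import (
	"fmt"

	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
)

// descriptor mirrors one entry of a docker-save manifest.json.
type descriptor struct {
	Config   string   `json:"Config"`
	RepoTags []string `json:"RepoTags"`
	Layers   []string `json:"Layers"`
}

// generateManifest builds manifest.json entries whose Config and Layers
// paths point into an OCI layout's blobs/sha256/ tree.
func generateManifest(refToImage map[name.Reference]v1.Image) ([]descriptor, error) {
	var m []descriptor
	for ref, img := range refToImage {
		cfg, err := img.ConfigName()
		if err != nil {
			return nil, err
		}
		layers, err := img.Layers()
		if err != nil {
			return nil, err
		}
		d := descriptor{
			Config:   fmt.Sprintf("blobs/sha256/%s", cfg.Hex),
			RepoTags: []string{ref.String()},
		}
		for _, l := range layers {
			dig, err := l.Digest()
			if err != nil {
				return nil, err
			}
			d.Layers = append(d.Layers, fmt.Sprintf("blobs/sha256/%s", dig.Hex))
		}
		m = append(m, d)
	}
	return m, nil
}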

can we include the registry manifest? docker save or docker load tools ignore any file that doesn't fit their manifest, so it doesn't hurt, but it is helpful for full provenance

I don't have a huge problem with this other than lack of precedent. I think I prefer the approach of making an image layout into a compatible tarball.

deitch commented 4 years ago

I will put something together in the next week or two (depending on client meetings and travel... I am not always master of my own time :-) ).

deitch commented 4 years ago

As far as I can tell, no. Docker will load images just fine without a repositories file. There's a pretty good explanation of the two different formats here: containers/skopeo#425

Yeah, that was a good read. I liked the point about this stuff being mostly reverse-engineered. :-)

You are right about repositories not being required for docker load. All of the necessary data that was in repositories is available in manifest.json: RepoTags provides the repository and tags, and the last hash in Layers provides the necessary hash key. I must be working with tooling that is just old and unmaintained. I am hoping to get rid of it, but until then, I can use the above to generate the repositories data and add it to the tar file (reluctantly).
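For illustration, a sketch of generating that repositories data from classic docker-save manifest entries, reusing the hypothetical descriptor struct from the earlier sketch and assuming layer paths of the form "<id>/layer.tar" (the classic format, not the blobs/ variant):

import (
	"path"

	"github.com/google/go-containerregistry/pkg/name"
)

// repositoriesFromManifest derives the legacy repositories mapping
// (repository -> tag -> top layer ID) from docker-save manifest entries.
func repositoriesFromManifest(m []descriptor) (map[string]map[string]string, error) {
	repos := map[string]map[string]string{}
	for _, d := range m {
		if len(d.Layers) == 0 {
			continue
		}
		topID := path.Dir(d.Layers[len(d.Layers)-1]) // "<id>/layer.tar" -> "<id>"
		for _, rt := range d.RepoTags {
			tag, err := name.NewTag(rt)
			if err != nil {
				return nil, err
			}
			if repos[tag.RepositoryStr()] == nil {
				repos[tag.RepositoryStr()] = map[string]string{}
			}
			repos[tag.RepositoryStr()][tag.TagStr()] = topID
		}
	}
	return repos, nil
}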

deitch commented 4 years ago

How would you feel about adding a method to layout.Path for writing arbitrary files?

I will do this right away.

This is an interesting idea! I can't believe I haven't thought of it.

I ran into several interesting problems related to doing this:

  1. The v1.layout is of the entire repo, i.e. it doesn't resolve anything from an index to a single manifest. This means that the layout dir (or tar file) is potentially many times larger than the one for a single arch, which is what v1.tarball and legacy.tarball produce.
  2. The v1.layout is of the entire repo as above. On the other hand, docker load/save is always a single image (not an index). So if I saved to a v1.layout as a tarball, I could include everything, but for it to work with docker load, I would need to have a manifest.json that indicates which single image to load up. By definition, that would make it incompatible with any other image. It starts to make sense to be able to pack it up as a tarball, but adding the manifest.json - which points to one of (possibly) several images in the archive as related by the index.json - gets confusing.
  3. manifest.json always has RepoTags so docker load knows where it came from. What would we do here? In my sample work, I just included a CLI option to add the name of the source, the kind of thing docker save would add. Does that translate here?

I don't have a huge problem with this other than lack of precedent. I think I prefer the approach of making an image layout into a compatible tarball.

Makes sense, but we have the issues above. How would we handle them sanely?

deitch commented 4 years ago

How would you feel about adding a method to layout.Path for writing arbitrary files?

Turns out you already had one, but it was private. Made it public and added a comment, see #665
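For reference, a usage sketch of the now-exported method, assuming the shape from #665 (a file name, bytes, and a file mode); the path and payload are placeholders:

import (
	"log"

	"github.com/google/go-containerregistry/pkg/v1/layout"
)

func addManifestJSON(manifestJSON []byte) {
	p, err := layout.FromPath("./alpine-oci") // placeholder: an existing layout
	if err != nil {
		log.Fatal(err)
	}
	// Write an arbitrary file alongside index.json and oci-layout.
	if err := p.WriteFile("manifest.json", manifestJSON, 0644); err != nil {
		log.Fatal(err)
	}
}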

jonjohnsonjr commented 4 years ago

The v1.layout is of the entire repo, i.e. it doesn't resolve anything from an index to a single manifest. This means that the layout dir (or tar file) is potentially many times larger than the one for a single arch, which is what v1.tarball and legacy.tarball produce.

Can you do the index -> manifest resolution before writing to the layout?

The v1.layout is of the entire repo as above. On the other hand, docker load/save is always a single image (not an index).

The tarballs can contain multiple images, e.g.:

$ docker save alpine busybox | tar -Oxf - manifest.json | jq .
[
  {
    "Config": "6d5fcfe5ff170471fcc3c8b47631d6d71202a1fd44cf3c147e50c8de21cf0648.json",
    "RepoTags": [
      "busybox:latest"
    ],
    "Layers": [
      "91981208d0d3ad4156b273a78f6cc9de3f032e4a26c5f0f5e39de5abdae8d5b7/layer.tar"
    ]
  },
  {
    "Config": "af341ccd2df8b0e2d67cf8dd32e087bfda4e5756ebd1c76bbf3efa0dc246590e.json",
    "RepoTags": [
      "alpine:3.10"
    ],
    "Layers": [
      "71dba1fabbde4baabcdebcde4895d3f3887e388b09cef162f8159cf7daffa1b8/layer.tar"
    ]
  },
  {
    "Config": "e7d92cdc71feacf90708cb59182d0df1b911f8ae022d29e8e95d75ca6a99776a.json",
    "RepoTags": [
      "alpine:latest"
    ],
    "Layers": [
      "39cb81dcd06e3d4e2b813f56b72da567696fa9a59b652bd477615b31af969239/layer.tar"
    ]
  }
]

So if I saved to a v1.layout as a tarball, I could include everything, but for it to work with docker load, I would need to have a manifest.json that indicates which single image to load up. By definition, that would make it incompatible with any other image.

I'm not sure I understand this.

It starts to make sense to be able to pack it up as a tarball, but adding the manifest.json - which points to one of (possibly) several images in the archive as related by the index.json - gets confusing.

Yeah I can see a mismatch between the index.json and the manifest.json being pretty confusing... you could have an entry for every image or do the index -> manifest resolution before writing anything.

manifest.json always has RepoTags so docker load knows where it came from. What would we do here? In my sample work, I just included a CLI option to add the name of the source, the kind of thing docker save would add. Does that translate here?

There's org.opencontainers.image.ref.name, also see the discussion in https://github.com/opencontainers/image-spec/issues/796 -- it looks like containerd can do something similar to what you're proposing (write a manifest.json file to an OCI image layout).

deitch commented 4 years ago

Can you do the index -> manifest resolution before writing to the layout?

Do you mean at the time we write the layout to disk? Or at the time we convert it into a tar file?

The tarballs can contain multiple images, e.g.:

Very good point. But in the case of a single image ref (which is what we are doing) and an index, is it likely one would want to docker load (or use any other tool that consumes the format on) the images for amd64, arm64, etc.? Or is it likely they just want the specific one for their arch/os combo?

I would be concerned that someone might do:

  1. Save to layout
  2. Convert to tar (or single step)
  3. Ship it somewhere
  4. docker load
  5. End up with lots of useless images, since there is only one in the image index that I really wanted

That having been said, binfmt does let people run images from other archs locally, so it is not quite as simple as that.

Yeah I can see a mismatch between the index.json and the manifest.json being pretty confusing... you could have an entry for every image or do the index -> manifest resolution before writing anything.

We would have to clearly define at what stage that is supposed to happen. For one option, I could see a process like this:

  1. save to layout (includes everything and image index). Optionally, I might select just one or several os/arch out of the index, or possibly even have a way to say, "pick mine", which is what docker pull does.
  2. tar up layout (includes everything and image index). Optionally, I might select just one or several os/arch out of the index, or possibly even have a way to say, "pick mine", like above.
  3. Create a subset from the tar layout. This is the same as the "select just one or several" above, but operating on the tar file.

As soon as you select one or several, you are moving from a layout/tar that is driven by index.json towards one driven by manifest.json.

you could have an entry for every image

That might be an issue. The index itself was generated from a single tag, e.g. docker.io/library/alpine:3.10. This has an entry for each image resolvable via additional data: os+arch. But RepoTags in manifest.json needs something loadable, i.e. post-resolution. So which one does the one and only docker.io/library/alpine:3.10 in RepoTags point to? The linux+arm64? linux+amd64? linux+ppc64le?

Altogether, I really like the layout format, and it would even work as a tar file if we had a manifest.json that pointed to one or more hashes, but no more than one per tag. As long as we have multiple per tag, we appear to be stuck. But as soon as it is no more than one image per tag, we are free to generate a manifest.json. That means a filtering process.

jonjohnsonjr commented 4 years ago

Do you mean at the time we write the layout to disk? Or at the time we convert it into a tar file?

At the time you're writing the layout to disk. Presumably there's some code that fetches an index from somewhere (a registry?), then writes that index to disk. Instead of writing the whole index, you could reduce it down to the image you care about (based on whatever platform you're targeting, or other criteria) and write a singleton index.json instead of the whole thing.
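A sketch of that flow, assuming a placeholder reference and target platform:

package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/layout"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

func main() {
	ref, err := name.ParseReference("docker.io/library/alpine:3.10") // placeholder
	if err != nil {
		log.Fatal(err)
	}
	idx, err := remote.Index(ref)
	if err != nil {
		log.Fatal(err)
	}
	im, err := idx.IndexManifest()
	if err != nil {
		log.Fatal(err)
	}
	want := v1.Platform{OS: "linux", Architecture: "amd64"} // placeholder target
	for _, desc := range im.Manifests {
		if desc.Platform == nil || desc.Platform.OS != want.OS || desc.Platform.Architecture != want.Architecture {
			continue
		}
		img, err := idx.Image(desc.Digest)
		if err != nil {
			log.Fatal(err)
		}
		// Write a singleton index.json containing only the matching image.
		if _, err := layout.Write("./alpine-layout", mutate.AppendManifests(empty.Index, mutate.IndexAddendum{Add: img})); err != nil {
			log.Fatal(err)
		}
		return
	}
	log.Fatal("no matching platform in index")
}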

End up with lots of useless images, since there is only one in the image index that I really wanted

Yeah this is definitely a problem if you're naively saving the whole manifest list to an index.json, but if you have knowledge of your target platform ahead of time, you could omit the other unused images.

We would have to clearly define at what stage that is supposed to happen.

I think it's application specific. We do something like this in pkg/v1/remote: if you ask for an image but what you're pulling is an index, we find an appropriate image by platform.

It also probably depends on what clients are expecting to see. E.g. for something you intend to docker load later, I would probably only include the single image you want docker to load.

There are a bunch of different ways to convert a manifest list to an OCI image layout... I've enumerated the ones I think make sense here (to make it easier to discuss):

[diagram enumerating conversion options A through E]

(dashed outline == digest referenced, but image is not written to disk)

I'm not sure what qualities you're trying to preserve, though. It might help to talk more concretely about what you're doing?

deitch commented 4 years ago

I think we are converging on the same place. v1.layout as it stands now, plus the following:

  • option to save the v1.layout on disk (exists) or in a tarball (new)
  • add a manifest.json so that the v1.layout is loadable
  • an ability to filter down from the entire thing to one or a few images, based on a provided list or the current platform. These are cases B, D, E
  • apply the filter multiple times, e.g. on download to first dir or tarball, or from tarball to another tarball, or dir to another dir, etc.

These can be mixed and matched, e.g. I might save just one arch in v1.layout on disk, or all of them in a tarball, or just one in a tarball with a manifest.json.

I originally thought that cases A & C wouldn't work, as manifest.json would then point to 2 distinct images, but you are right, it does work: it is an array, so you can have multiple entries. I am not sure what consumers will do if the RepoTags for one entry are identical to those of another, but one step at a time, I guess.

I can try and get us a PR for the above, if we are in agreement. I would do them one at a time.

The open issue I see is that v1.layout doesn't work unless the root is an image index. If it is a straight manifest, it fails. Shouldn't it support pulling a single image as well as an index? See layout.Write() which only accepts a v1.ImageIndex.

That shouldn't stop us, though.

jonjohnsonjr commented 4 years ago

option to save the v1.layout on disk (exists) or in a tarball (new) ... I can try and get us a PR for the above, if we are in agreement. I would do them one at a time.

I'm interested to see what a tarball implementation would look like!

add a manifest.json so that the v1.layout is loadable

Looks like containerd includes it by default but has an option to skip it. I think I'd prefer to do the inverse, where we don't include it by default but have an ability to add it.

  • an ability to filter down from the entire thing to one or a few images, based on a provided list or the current platform. These are cases B, D, E
  • apply the filter multiple times, e.g. on download to first dir or tarball, or from tarball to another tarball, or dir to another dir, etc.

I don't think this belongs in the layout package, but we could definitely make it easier to do. I like to think of this library as a collection of sources (top) and sinks (bottom), and this filtering stuff is generic enough that it could live in the middle part of this:

[diagram: sources at the top, sinks at the bottom, with generic transformation logic in the middle]

If we somehow exposed parts of this logic as generic helper functions, I'd be happy, but I haven't sat down and thought through a good API for that yet. We could probably take some design inspiration from what containerd is doing (I haven't looked into it though). My first thought would be to build something that can walk a graph of Descriptors if you give it a v1.Image or v1.ImageIndex, and provide some helpful functions to match by e.g. platform.
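As a straw man, such a helper might look like the following; the name and shape are hypothetical, not an existing API:

package main

import (
	v1 "github.com/google/go-containerregistry/pkg/v1"
)

// filterByPlatform walks an index (recursing into nested indexes) and
// collects the images whose platform satisfies match. The predicate
// must tolerate a nil *v1.Platform, since descriptors may omit it.
func filterByPlatform(idx v1.ImageIndex, match func(*v1.Platform) bool) ([]v1.Image, error) {
	im, err := idx.IndexManifest()
	if err != nil {
		return nil, err
	}
	var imgs []v1.Image
	for _, desc := range im.Manifests {
		switch {
		case desc.MediaType.IsIndex():
			child, err := idx.ImageIndex(desc.Digest)
			if err != nil {
				return nil, err
			}
			sub, err := filterByPlatform(child, match)
			if err != nil {
				return nil, err
			}
			imgs = append(imgs, sub...)
		case desc.MediaType.IsImage():
			if !match(desc.Platform) {
				continue
			}
			img, err := idx.Image(desc.Digest)
			if err != nil {
				return nil, err
			}
			imgs = append(imgs, img)
		}
	}
	return imgs, nil
}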

You can implement most of this yourself like we do in remote by appending the child manifests to an empty.Index and writing that to the layout.

The open issue I see is that v1.layout doesn't work unless the root is an image index. If it is a straight manifest, it fails. Shouldn't it support pulling a single image as well as an index?

From my perspective, it already does; it's just a little bit more janky, see below...

See layout.Write() which only accepts a v1.ImageIndex.

In my head, a layout is just a collection of arbitrary artifacts, which maps directly to an image index. That collection can be of size one, so I don't think it makes sense to support writing a v1.Image, since you can easily do either:

layout.Write(empty.Index)

to initialize it, then append a single image, or:

layout.Write(mutate.AppendManifests(empty.Index, img))

to create a singleton index. It's also a lot easier on the read path to be able to assume index.json is always an image index.
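Spelled out with the path argument that layout.Write actually takes (dir and img are placeholders), the two options look roughly like:

import (
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/layout"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
)

// Option 1: initialize an empty layout, then append the image
// (AppendImage writes the blobs and updates index.json).
func initThenAppend(dir string, img v1.Image) error {
	p, err := layout.Write(dir, empty.Index)
	if err != nil {
		return err
	}
	return p.AppendImage(img)
}

// Option 2: build a singleton index and write it in one shot.
func writeSingleton(dir string, img v1.Image) error {
	_, err := layout.Write(dir, mutate.AppendManifests(empty.Index, mutate.IndexAddendum{Add: img}))
	return err
}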

deitch commented 4 years ago

I'm interested to see what a tarball implementation would look like!

Will get something in.

deitch commented 4 years ago

prefer to do the inverse, where we don't include it by default but have an ability to add it.

Understood. I tend to agree. I still don't understand how we can add it when there is more than one image for a tag, though.

then append a single image to create a singleton index

That would work as a standalone, but it loses the provenance again. If I go to the tag that referenced this and get the hash, the hash matches the single image inside, not the index. Theoretically I could have some "if the hash of the single image manifest matches but not that of the index, go straight for that" logic, but that is messy, and I can see some leaks in that.

I like to think of this library as a collection of sources (top) and sinks (bottom), and this filtering stuff is generic enough that it could live in the middle part of this

I can see that. I like it, actually. I sometimes do conversions (my own strange form) in the ocidist library I maintain, which I use for CLI manipulation of images (and which is built entirely around this library, of course).

jonjohnsonjr commented 4 years ago

I still don't understand how we can add it when there is more than one image for a tag, though.

I think this is one of the limitations of the tarball format, so we're not going to find a great solution. My understanding is that the docker client doesn't really "know" about manifest lists in the underlying data structures, so we are going to have to hack around it to get the right behavior.

This is more or less why tarball.MultiWrite has the signature it does -- to ensure uniqueness of tags within a tarball, and why tarball doesn't support writing an image index.
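For reference, a usage sketch of the file-backed variant, tarball.MultiWriteToFile; tags and images are placeholders:

import (
	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

// saveBoth writes two tagged images into one docker-save style tarball.
// Keying the map by name.Tag is what makes duplicate tags impossible.
func saveBoth(tagA, tagB name.Tag, imgA, imgB v1.Image) error {
	return tarball.MultiWriteToFile("images.tar", map[name.Tag]v1.Image{
		tagA: imgA,
		tagB: imgB,
	})
}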

That would work as a standalone, but it loses the provenance again. If I go to the tag that referenced this and get the hash, the hash matches the single image inside, not the index. Theoretically I could have some "if the hash of the single image manifest matches but not that of the index, go straight for that" logic, but that is messy, and I can see some leaks in that.

Yeah this is painful and different tools have different behavior when they encounter an index.

I think what you really want is B from this comment. It keeps the provenance information available so you can access it with your own tools, but it only writes the data that docker actually needs in order to load the image. I think layout exposes enough stuff to achieve this, but it may be a bit clunky.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.


RedHawker commented 2 years ago

Is there a plan to work on this further, or should we just expect to have to extract tar files containing the OCI structure in order to use crane on them?

jonjohnsonjr commented 2 years ago

@RedHawker what are you trying to do, specifically?

RedHawker commented 2 years ago

@jonjohnsonjr

tar up the OCI folder structure in order to bundle up images for air-gapped storage and sharing -- while being able to support multi-architecture images up front.

It seems like using docker save is fine when you want to do 1 image, but not when you have multi-arch versions of that same image.

I'm not an expert, but from my initial testing, trying to achieve this with just docker save isn't possible, especially since you can't docker save a docker manifest.