anchore / stereoscope

go library for processing container images and simulating a squash filesystem
Apache License 2.0

Support for image indexes with multiple manifests #175

Open matthewpi opened 1 year ago

matthewpi commented 1 year ago

What would you like to be added:

Currently, stereoscope's image providers return an error when they are passed an image index that references multiple manifests. I'd like to request that this behaviour be changed (or a new feature be implemented) to support image indexes that reference multiple manifests.

Why is this needed:

Multi-manifest image indexes are becoming increasingly common, especially with the prominence of multiple supported architectures for containers (primarily amd64 and arm64). If stereoscope is passed one of these multi-manifest indexes, it errors out with no ability to filter or select one of the manifests from the index (at least with the oci-dir provider).

My primary use-case for this change is to be able to use Syft to generate SBOMs for each manifest within a multi-arch oci-dir. While I'm unsure exactly how the SBOMs will be generated for multi-arch images, the changes I am requesting here seem to be a prerequisite to be able to support any type of multi-arch SBOM (or generating SBOMs per image manifest) when given a multi-manifest index.

Additional context:

I have already prototyped the code for supporting this and plan to open a PR, but thought I would open an issue before doing so. My main concern about making this change is the extent to which changing or breaking the existing API is allowed.

My current implementation adds a new Index struct and a new IndexProvider interface with a ProvideIndex method, so providers can optionally add support for multi-manifest indexes. However, this feels very awkward for two reasons.

The first is that providers will still error out if you call Provide with a multi-manifest index. This is because the existing API is not changed at all, and Provide still only supports a single image.

Secondly, there would now be an entirely separate code path to support multi-manifest indexes. While this code path does still fully support single-manifest indexes (or just single images if the format doesn't have the concept of multiple manifests), it does mean users of this library will need to implement support for a different API, and new users may be confused about when to use the Index vs non-index functions.
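For readers following along, here is a minimal sketch of the shape described above. Only the names Index, IndexProvider, ProvideIndex, and Provide come from this issue; the Image stand-in, the method signatures, dirProvider, ProvideAll, and ErrNotSingleManifest are assumptions made for illustration and are not taken from the PR or from stereoscope itself.

```go
package main

import (
	"context"
	"errors"
)

// Image is a simplified stand-in for stereoscope's image type (hypothetical fields).
type Image struct {
	Digest       string
	OS           string
	Architecture string
}

// Provider mirrors the existing single-image API in simplified form (assumed signature).
type Provider interface {
	Provide(ctx context.Context) (*Image, error)
}

// Index groups every image referenced by an image index (hypothetical layout).
type Index struct {
	Images []*Image
}

// IndexProvider is the optional capability a provider may implement in addition to Provider.
type IndexProvider interface {
	ProvideIndex(ctx context.Context) (*Index, error)
}

// ErrNotSingleManifest is a hypothetical error for the single-image code path.
var ErrNotSingleManifest = errors.New("index does not reference exactly one manifest")

// dirProvider is a toy provider backed by an in-memory list of images.
type dirProvider struct {
	images []*Image
}

// Provide keeps the old behaviour: it only succeeds for a single-manifest index.
func (p dirProvider) Provide(_ context.Context) (*Image, error) {
	if len(p.images) != 1 {
		return nil, ErrNotSingleManifest
	}
	return p.images[0], nil
}

// ProvideIndex returns every image, whether the index has one manifest or many.
func (p dirProvider) ProvideIndex(_ context.Context) (*Index, error) {
	return &Index{Images: p.images}, nil
}

// ProvideAll shows the extra branch that consumers of the library would need:
// prefer the index path when available, otherwise fall back to Provide.
func ProvideAll(ctx context.Context, p Provider) ([]*Image, error) {
	if ip, ok := p.(IndexProvider); ok {
		idx, err := ip.ProvideIndex(ctx)
		if err != nil {
			return nil, err
		}
		return idx.Images, nil
	}
	img, err := p.Provide(ctx)
	if err != nil {
		return nil, err
	}
	return []*Image{img}, nil
}
```

The type assertion in ProvideAll is the second code path mentioned above: every consumer has to know about both interfaces, while Provide on its own still fails for multi-manifest input.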

sophiewigmore commented 1 year ago

Hey y'all! My team has a similar desire for support for multiple manifests as we have begun shipping multi-arch images in our project. @matthewpi are you still looking into this?

matthewpi commented 1 year ago

I was, but I had to take time to work on some other things. The PR I opened works correctly, but I am still unsure about its API design. Ignoring any potential changes to the API design, the code works and is ready to be merged. I will look over the code myself later today and ensure it is ready for review.

Following that, we would need to get any consumers of the library to update and change their code-paths to support the new image index provider and types.

sophiewigmore commented 1 year ago

Cool, thanks for the update. In the meantime we've found a workaround of pushing our multi-arch OCI archives to a local docker registry, and then generating the SBOM via syft on the local registry images which seems to work.

kzantow commented 1 year ago

This is related to the analogous issue in Syft: https://github.com/anchore/syft/issues/1683

spiffcs commented 1 month ago

👋 Sorry that this issue has been stale for so long!

We came upon it when doing some review of issues during our live stream: https://www.youtube.com/channel/UC3HczVqyiAqz1aNxBIMS3pQ

Here is the plan we came up with: we want a solution for OCI indexes with multiple manifests that is narrowly scoped to match the behavior of the other stereoscope providers.

This means two things:

1) We focus the solution on stereoscope providing a single answer from the manifest. Returning multiple answers from the index so that syft can generate multiple SBOMs is not in scope for this change.

Below is how the docker daemon provider works for stereoscope when trying to select the correct https://github.com/anchore/stereoscope/blob/e6d086e8bef5fab4fcfbd60c9a759c4cb229decf/pkg/image/containerd/daemon_provider.go#L238-L270

The OCI provider currently selects position 0 from the manifest list: https://github.com/anchore/stereoscope/blob/e6d086e8bef5fab4fcfbd60c9a759c4cb229decf/pkg/image/oci/directory_provider.go#L62-L66

It should be updated to an API where syft can ask for a certain platform/arch from the index and expect to get a single *image.Image answer.

We did not have time to decide whether this input from syft should be required, or whether some default stereoscope behavior should exist to try to "guess" which image to respond with.
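As a rough illustration of asking an index for one platform and getting a single answer back, here is a standard-library-only sketch. The JSON field names follow the OCI image index format; the type names, ParseIndex, and SelectManifest are hypothetical and are not stereoscope's API. The sketch assumes the caller always supplies a platform, so the open question about a default "guess" is deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Platform holds the platform fields of an index entry that this sketch compares.
type Platform struct {
	OS           string `json:"os"`
	Architecture string `json:"architecture"`
	Variant      string `json:"variant,omitempty"`
}

// Descriptor is one entry in the index's "manifests" array.
type Descriptor struct {
	MediaType string    `json:"mediaType"`
	Digest    string    `json:"digest"`
	Size      int64     `json:"size"`
	Platform  *Platform `json:"platform,omitempty"`
}

// ImageIndex is a trimmed-down view of an OCI image index (index.json).
type ImageIndex struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []Descriptor `json:"manifests"`
}

// ParseIndex decodes the raw JSON of an image index.
func ParseIndex(data []byte) (ImageIndex, error) {
	var idx ImageIndex
	if err := json.Unmarshal(data, &idx); err != nil {
		return ImageIndex{}, fmt.Errorf("decoding image index: %w", err)
	}
	return idx, nil
}

// SelectManifest returns the one manifest whose platform matches want.
// The variant is only compared when the caller sets it. Zero matches or
// more than one match is reported as an error rather than guessed at.
func SelectManifest(idx ImageIndex, want Platform) (Descriptor, error) {
	var matches []Descriptor
	for _, m := range idx.Manifests {
		if m.Platform == nil {
			continue // entries without platform information cannot be matched
		}
		if m.Platform.OS != want.OS || m.Platform.Architecture != want.Architecture {
			continue
		}
		if want.Variant != "" && m.Platform.Variant != want.Variant {
			continue
		}
		matches = append(matches, m)
	}
	switch len(matches) {
	case 1:
		return matches[0], nil
	case 0:
		return Descriptor{}, fmt.Errorf("no manifest in index for platform %s/%s", want.OS, want.Architecture)
	default:
		return Descriptor{}, fmt.Errorf("%d manifests match platform %s/%s", len(matches), want.OS, want.Architecture)
	}
}
```

A provider built along these lines would resolve the returned descriptor's digest to a manifest and load that one image, instead of always taking position 0 from the manifest list.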