in-toto / attestation

in-toto Attestation Framework

Guidance on `subject[*].name` uniqueness with Image Indexes #348

Open lcarva opened 5 months ago

lcarva commented 5 months ago

I have a case where I have an OCI Image Index (aka Manifest List) which references two platform-specific OCI Image Manifests. In total, there are three image references. The digest of each image is obviously unique.

I am generating an attestation, of type SLSA Provenance, that captures how those images are built. I want to include all three image references in the subject because 1) it accurately represents that these images were built through the same "pipeline"; and 2) I can use cosign to attach the same attestation to each of the three image references, allowing any of them to be verified independently.

For the subject[*].name, it seems reasonable to me to use the OCI repository, e.g. registry.local/namespace/repo. A tag is not required, so there are cases where it is not available. A common use case is to tag the OCI Image Index, but not the OCI Image Manifests. With this approach, all three of my image references end up with the same subject[*].name. This poses a problem because the spec states that the subject[*].name:

MUST be ... unique within subject.

I could include the digest in the name but that seems redundant since that's what subject[*].digest is for.
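To make the conflict concrete, here is a minimal Python sketch of a Statement whose three subjects share the repository as their name, together with a small check for the uniqueness rule quoted above. The helper `duplicate_subject_names` is hypothetical and not part of any in-toto library, and the digest values are shortened placeholders.

```python
from collections import Counter


def duplicate_subject_names(statement: dict) -> list[str]:
    """Return the subject names that occur more than once (empty if all are unique)."""
    names = [s["name"] for s in statement.get("subject", []) if "name" in s]
    return sorted(name for name, count in Counter(names).items() if count > 1)


repo = "registry.local/namespace/repo"

# One Image Index plus two platform-specific Image Manifests.
# The digests are placeholders, not real sha256 values.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": repo, "digest": {"sha256": "1111"}},  # Image Index
        {"name": repo, "digest": {"sha256": "2222"}},  # Image Manifest
        {"name": repo, "digest": {"sha256": "3333"}},  # Image Manifest
    ],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {},
}

duplicates = duplicate_subject_names(statement)
print(duplicates)
```

A non-empty result means the statement breaks the "unique within subject" rule even though every digest differs.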

Any guidance would be greatly appreciated!

marcelamelara commented 4 months ago

PTAL @in-toto/attestation-maintainers

TomHennen commented 4 months ago

What distinguishes these three images? They are different, but what's different about them? Is it the architecture they support?

You mention the name might reasonably be registry.local/namespace/repo but is there some additional information after that in practice? E.g. registry.local/namespace/repo/foo, registry.local/namespace/repo/bar, registry.local/namespace/repo/baz?

lcarva commented 4 months ago

What distinguishes these three images? They are different, but what's different about them? Is it the architecture they support?

Yes. It's the platform. For example, one of the images is for linux/amd64 while another is for linux/arm64.

You mention the name might reasonably be registry.local/namespace/repo but is there some additional information after that in practice? E.g. registry.local/namespace/repo/foo, registry.local/namespace/repo/bar, registry.local/namespace/repo/baz?

For multi-arch images they usually all live in the same repository. The Image Index references them via digest, not a full image reference. I don't know if it's even possible to reference an Image Manifest from an Image Index across repositories. If it is, it is likely a non-standard, implementation-specific feature of certain registries.

TomHennen commented 4 months ago

Do you happen to know the platform? Could you put the platform in the name field? That would seem to be the most helpful for users? Alternatively, how much do you need to distinguish them using name?

lcarva commented 4 months ago

It's possible to inspect an Image Index and an Image Manifest to pull additional information from it. The platform, however, is an optional attribute. I have seen recent cases where the platform is completely omitted because it was not relevant.

I do have concerns with requiring SLSA Provenance generators to perform these additional queries. It will quickly overwhelm some existing systems. For example, instances of Tekton Chains (which I maintain and use extensively) would require additional resources.

It's also important to note that I could have a process that builds two Image Manifests for the same platform, for whatever reason. I don't necessarily need to tag these (although that would be unusual). This means the only attribute that distinguishes them is the digest.
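As a rough illustration of why the platform alone does not guarantee unique names, the Python sketch below derives a candidate name for each manifest listed in an Image Index. The function `platform_names` and the "repo (os/architecture)" naming format are assumptions made for this example only; the index content is a hand-written placeholder, not fetched from a registry.

```python
def platform_names(repo: str, index: dict) -> dict[str, str]:
    """Map each manifest digest to a candidate name, adding the platform when present."""
    names = {}
    for descriptor in index.get("manifests", []):
        platform = descriptor.get("platform")  # optional, may be missing entirely
        if platform:
            suffix = f" ({platform['os']}/{platform['architecture']})"
        else:
            suffix = ""
        names[descriptor["digest"]] = repo + suffix
    return names


# Placeholder Image Index: two manifests for the same platform, one without a platform.
index = {
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:2222", "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:3333", "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:4444"},
    ],
}

names = platform_names("registry.local/namespace/repo", index)
has_collision = len(set(names.values())) < len(names)
print(names, has_collision)
```

The two linux/amd64 manifests end up with the same candidate name, so only the digest tells them apart.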

TomHennen commented 3 months ago

What's your ideal solution? Having two different entries in subject with the same name? They'd still only be distinguished by digest but I guess users would know what 'repo' they go to?

That's sort of the use case for [resourceUri](https://slsa.dev/spec/v1.0/verification_summary#fields:~:text=the%20verification%20occurred.-,resourceUri,-string%20(ResourceURI)) in the VSA.

The reason, I think, provenance doesn't cover this is that we typically (and perhaps incorrectly) assumed that the builder doesn't actually know the final destination of the artifact. E.g. yes, the builder might build the container image in a registry, but is that the final location or a staging location and it will eventually be promoted elsewhere? Since the VSA is computed when it's published to a specific location it would know the name.

That being said... I don't know how much the uniqueness requirement actually matters in the statement layer.

lcarva commented 3 months ago

What's your ideal solution? Having two different entries in subject with the same name? They'd still only be distinguished by digest but I guess users would know what 'repo' they go to?

I can see a couple of solutions.

One would be to just remove the requirement for the name to be unique. I don't know how much value that actually provides, but I may not have the full context. When I verify the SLSA Provenance, I do not rely on the name. The value is purely informative in the use cases I've dealt with. The digest is the attribute to match.

Another possible solution would be to stop using the OCI repo as the name. Instead, SLSA Provenance generators like Tekton Chains could generate a unique name, e.g. something that includes the Task/Pipeline name and the corresponding result name (build-container-IMAGES-0, build-container-IMAGES-1, ko-FOO-IMAGE). Although this approach does not require a change in the spec, I do think that it obscures the subjects, making them less informative.
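The second option could look roughly like the following Python sketch. It is not how Tekton Chains works today; `subjects_from_result` is a hypothetical helper that takes a task name, a result name, and the digests recorded in that result, and appends an index only when the result holds more than one image.

```python
def subjects_from_result(task: str, result: str, digests: list[str]) -> list[dict]:
    """Build subjects named after the task and result rather than the OCI repository."""
    many = len(digests) > 1
    subjects = []
    for i, digest in enumerate(digests):
        # Split "sha256:abc" into algorithm and value.
        algorithm, _, value = digest.partition(":")
        name = f"{task}-{result}-{i}" if many else f"{task}-{result}"
        subjects.append({"name": name, "digest": {algorithm: value}})
    return subjects


# Placeholder digests, not real sha256 values.
multi = subjects_from_result("build-container", "IMAGES", ["sha256:2222", "sha256:3333"])
single = subjects_from_result("ko", "FOO-IMAGE", ["sha256:5555"])
print([s["name"] for s in multi + single])
```

The names are unique within the statement, but nothing in them tells a reader which repository the images live in.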

TomHennen commented 3 months ago

So, another option is to just leave 'name' blank? It's not required.

lcarva commented 3 months ago

Good point!

I do think there's value in setting the OCI repository name somewhere when dealing with container images. What would be a good place for that? Maybe the uri field? Example:

{
  "uri": "oci://registry.local/namespace/repo@sha256:abc...",
  "digest": {
    "sha256": "abc..."
  }
}

A bit repetitive to include the digest twice, but otherwise it seems reasonable. WDYT?
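A small Python sketch of how a generator might produce such entries for all three image references. The helper `oci_subject` is hypothetical, the `oci://` scheme is simply the one used in the example above, and the digests are placeholders.

```python
def oci_subject(repo: str, digest: str) -> dict:
    """Build a subject that carries the repository in `uri` and omits `name`."""
    algorithm, _, value = digest.partition(":")
    if not algorithm or not value:
        raise ValueError(f"expected '<algorithm>:<value>', got {digest!r}")
    return {
        "uri": f"oci://{repo}@{algorithm}:{value}",
        "digest": {algorithm: value},
    }


repo = "registry.local/namespace/repo"
# Image Index followed by its two Image Manifests (placeholder digests).
subjects = [oci_subject(repo, d) for d in ("sha256:1111", "sha256:2222", "sha256:3333")]
uris = [s["uri"] for s in subjects]
print(uris)
```

Because the digest is part of each URI, the three entries stay distinct without setting `name` at all.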

TomHennen commented 3 months ago

Seems very reasonable. I also think you can do this without any changes to the spec?

SantiagoTorres commented 3 months ago

fwiw, Name can be a PURL, as of ITE-4: https://github.com/in-toto/ITE/tree/master/ITE/4. You don't need to change the name to 'uri' but rather treat it as an opaque descriptor, or a URI if it follows the right scheme. In theory you can use a GUN as per Docker/OCI, or a PURL, or really anything else.

ETA: in practice you could refer to the tag, and keep the hash in the digest bit. Instead of:

{
  "uri": "oci://registry.local/namespace/repo@sha256:abc...",
  "digest": {
    "sha256": "abc..."
  }
}

You could:

{
  "name": "oci://registry.local/namespace/repo:my-tag",
  "digest": {
    "sha256": "abc..."
  }
}

The reason this is valuable is that it still allows you to avoid confusion-type attacks (i.e., libraries will check the hash on the digest but not on the name, and vice-versa), and would still allow you to associate different attestations to the same element.
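For context, earlier in the thread the verification side was described as matching on the digest while treating the name as informative. A minimal Python sketch of that kind of lookup, under that assumption, follows; `find_subject_by_digest` is a hypothetical helper and not an in-toto or cosign API.

```python
from typing import Optional


def find_subject_by_digest(statement: dict, algorithm: str, value: str) -> Optional[dict]:
    """Return the first subject whose digest matches, ignoring name and uri."""
    for subject in statement.get("subject", []):
        if subject.get("digest", {}).get(algorithm) == value:
            return subject
    return None


# Placeholder statement: the tagged Image Index and one untagged Image Manifest.
statement = {
    "subject": [
        {"name": "oci://registry.local/namespace/repo:my-tag", "digest": {"sha256": "1111"}},
        {"digest": {"sha256": "2222"}},
    ]
}

match = find_subject_by_digest(statement, "sha256", "2222")
missing = find_subject_by_digest(statement, "sha256", "9999")
print(match, missing)
```

The lookup succeeds for the unnamed subject as well, which is why an optional or shared name does not get in the way of this style of verification.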

TomHennen commented 3 months ago

fwiw, Name can be a PURL, as of ITE-4: https://github.com/in-toto/ITE/tree/master/ITE/4. You don't need to change the name to 'uri' but rather treat it as an opaque descriptor, or a URI if it follows the right scheme. In theory you can use a GUN as per Docker/OCI, or a PURL, or really anything else.

Right! I suppose now that the example includes the digest, there's no longer any issue with the uniqueness requirement, so this could go in name now too.

SantiagoTorres commented 3 months ago

there's no longer any issue with the uniqueness requirement, so this could go in name now too.

Yup, this is also true!

lcarva commented 3 months ago

Regarding oci://registry.local/namespace/repo:my-tag, a tag for an Image Manifest is optional, especially if the Image Manifest is referenced by an Image Index. If we omit the tag, then the name will not be unique.

SantiagoTorres commented 3 months ago

Regarding oci://registry.local/namespace/repo:my-tag, a tag for an Image Manifest is optional, especially if the Image Manifest is referenced by an Image Index. If we omit the tag, then the name will not be unique.

At that point I wonder if it'd be easier to then lean on the descriptor spec from OCI and use that instead?

(all in all, I'm not arguing against the oci://whatever@sha256:whateverelse, I'm just trying to prod at the idea and see what other options there are :))

lcarva commented 3 months ago

At that point I wonder if it'd be easier to then lean on the descriptor spec from OCI and use that instead?

I'm not sure what you mean. Can you clarify?