in-toto / attestation

in-toto Attestation Framework

Add a ResourceDescriptor predicate #179

Open marcelamelara opened 1 year ago

marcelamelara commented 1 year ago

There are some use cases that need to communicate that a document or artifact exists and don't necessarily need to reference it from within some pre-existing in-toto statement (per https://github.com/in-toto/attestation/issues/117#issuecomment-1333815831).

The ResourceDescriptor is designed for describing and pointing to other artifacts within an in-toto attestation, so it can be used to implement this attestation predicate. The main usage of this predicate type is to have a signature over the referenced document or artifact without having to sign the entire document directly.
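
As a rough illustration of signing a pointer rather than the document, the sketch below builds a ResourceDescriptor-shaped dictionary for an SBOM. The field names (`name`, `uri`, `digest`, `mediaType`) follow the ResourceDescriptor definition. The helper name `describe_resource`, the SBOM bytes, the URL, and the media type value are hypothetical and only there for illustration.

```python
import hashlib
import json


def describe_resource(data: bytes, name: str, uri: str, media_type: str) -> dict:
    """Build a ResourceDescriptor-shaped dict that points at `data` by digest."""
    return {
        "name": name,
        "uri": uri,
        "digest": {"sha256": hashlib.sha256(data).hexdigest()},
        "mediaType": media_type,
    }


# Hypothetical SBOM content and storage location, for illustration only.
sbom_bytes = b'{"spdxVersion": "SPDX-2.3"}'
descriptor = describe_resource(
    sbom_bytes,
    name="sbom.spdx.json",
    uri="https://example.com/sbom_storage/sbom.spdx.json",
    media_type="application/spdx+json",
)

# A signer would sign a serialization of the descriptor (inside a statement),
# not the SBOM itself; the digest ties the signature back to the document.
payload = json.dumps(descriptor, sort_keys=True).encode()
print(payload.decode())
```

The signed payload stays small regardless of how large the referenced document is, which is the point of the predicate described above.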

Tasks:

adityasaky commented 1 year ago

I think this may overlap with #124. I am a bit worried about fragmentation though.

marcelamelara commented 1 year ago

> I think this may overlap with #124. I am a bit worried about fragmentation though.

Can you elaborate what you mean? I'm not immediately seeing the overlap.

adityasaky commented 1 year ago

So the other thread is about a "source" attestation recording artifact sources. Each source is then going to be a resource descriptor. As it's proposed, I think we're essentially talking there about the link predicate. But if we do want to carve out a subset predicate that describes sources, cutting other information like environment and so on out, I think we end up with 1+ resource descriptors. This is for just the one RD instance. The question is how different all of these predicate options are, and whether we should collect them as one predicate type. If we have a single RD type and a source predicate type, I imagine either would be usable in single-source scenarios.

marcelamelara commented 1 year ago

Ah, thanks for clarifying. I guess there are two schemas in that issue, and you're right that the schema proposed by @trishankatdatadog and this RD predicate do have significant overlap. I was looking at the schema that @colek42 had posted, and that one contains a lot more info than just a list of RDs. I tend to agree that if source and RD predicates will mostly be the same, we should develop a single predicate that can meet the needs of both use cases.

I should also add that the motivation for this RD predicate was attesting to an SBOM without including the full SBOM in the attestation, but I can also imagine using a SCAI predicate for this where the attribute field is "HAS_SBOM" and the evidence field contains an RD to the SBOM. So, I don't think it's immediately clear if the super generic RD predicate adds much in this use case.
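
A minimal sketch of the SCAI alternative mentioned here, assuming the SCAI attribute report layout of an `attributes` list whose entries carry an `attribute` string and an `evidence` ResourceDescriptor. The predicate type URI below is an assumption and should be checked against the SCAI spec; the helper name, artifact bytes, and URL are hypothetical.

```python
import hashlib
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
# Assumption: verify the current SCAI predicate type URI against the spec.
SCAI_PREDICATE_TYPE = "https://in-toto.io/attestation/scai/attribute-report/v0.2"


def has_sbom_statement(subject_name: str, subject_sha256: str,
                       sbom_uri: str, sbom_sha256: str) -> dict:
    """Build a statement asserting HAS_SBOM, with an RD to the SBOM as evidence."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        "predicateType": SCAI_PREDICATE_TYPE,
        "predicate": {
            "attributes": [
                {
                    "attribute": "HAS_SBOM",
                    # The evidence field is a ResourceDescriptor for the SBOM.
                    "evidence": {"uri": sbom_uri, "digest": {"sha256": sbom_sha256}},
                }
            ]
        },
    }


# Hypothetical artifact and SBOM bytes, for illustration only.
artifact_digest = hashlib.sha256(b"artifact bytes").hexdigest()
sbom_digest = hashlib.sha256(b"sbom bytes").hexdigest()
scai_statement = has_sbom_statement(
    "foo.jar", artifact_digest,
    "https://example.com/sbom_storage/foo.spdx.json", sbom_digest,
)
print(json.dumps(scai_statement, indent=2))
```

Here the artifact is the subject and the SBOM only appears as evidence, whereas a generic RD predicate would carry the descriptor directly as the predicate body.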

trishankatdatadog commented 1 year ago

Please let me know how we decide to record source trees (my biggest use case).

P.S. The link to the ResourceDescriptor seems to be broken: this seems to be the correct one.

adityasaky commented 1 year ago

There's also https://github.com/testifysec/witness/blob/main/docs/attestors/material.md to consider. cc @colek42 @mikhailswift

colek42 commented 1 year ago

We are actually moving away from this container. Fred is working on omniTrail. OmniTrail generates hashes for all the subtrees. This should reduce search space for many matching problems.
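
To illustrate the general idea of hashing every subtree (this is not OmniTrail's actual algorithm or format, only a sketch under simplified assumptions), the example below hashes an in-memory directory tree bottom-up and records a digest per path. The function name `hash_tree`, the `blob`/`tree` prefixes, and the sample trees are all hypothetical.

```python
import hashlib


def hash_tree(tree, path="", out=None):
    """Hash a nested dict (directories) of bytes (files) bottom-up.

    Returns (digest_of_this_node, {path: digest}) covering every subtree.
    """
    if out is None:
        out = {}
    if isinstance(tree, bytes):
        # Leaf: hash the file content with a type prefix.
        digest = hashlib.sha256(b"blob\0" + tree).hexdigest()
    else:
        # Directory: hash the sorted (name, child digest) pairs.
        h = hashlib.sha256(b"tree\0")
        for name in sorted(tree):
            child_digest, _ = hash_tree(tree[name], f"{path}/{name}", out)
            h.update(name.encode() + b"\0" + child_digest.encode() + b"\n")
        digest = h.hexdigest()
    out[path or "/"] = digest
    return digest, out


# Two hypothetical source trees that differ only in the README.
tree_a = {"src": {"main.c": b"int main(){}"}, "README": b"a"}
tree_b = {"src": {"main.c": b"int main(){}"}, "README": b"b"}
root_a, index_a = hash_tree(tree_a)
root_b, index_b = hash_tree(tree_b)

# The roots differ, but the shared subtree can be matched by a single digest.
print(root_a == root_b, index_a["/src"] == index_b["/src"])
```

With a digest per subtree, a matcher can compare whole directories in one step instead of walking every file, which is one way the search space can shrink.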


TomHennen commented 9 months ago

Would something like this work?

{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "foo.jar",
      "digest": {"sha256": "fe4fe40ac7250263c5dbe1cf3138912f3f416140aa248637a60d65fe22c47da4"}
    }
  ],
  // Predicate:
  "predicateType": "https://in-toto.io/attestation/reference/v0.1",
  "predicate": {
    "referrer": {
      "id": "http://example.com/sbom_generator"
    },
    "references": [ // Array of ResourceDescriptors
      {
        "digest": "abc123",
        "downloadLocation": "http://example.com/sbom_storage/abc123.spdx.json",
        "mediaType": ...
        ...
      }
    ]
  }
...
}

TomHennen commented 9 months ago

Would it make any sense to rename this issue to 'Add a reference predicate' with the goal of "allowing an attestor to point to out-of-band data" or something like that?

(At the very least that's what we're looking for, not sure if that conflicts with any other goals)
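
A minimal sketch of how a producer and a consumer might use such a reference predicate, assuming the shape from the JSON proposal earlier in this thread (`referrer` plus a `references` array of ResourceDescriptors). It assumes each reference's `digest` is a map from algorithm name to hex value, as in the ResourceDescriptor definition. The helper names, sample bytes, and media type value are hypothetical.

```python
import hashlib

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
# Predicate type taken from the proposal in this thread; not a published type.
REFERENCE_PREDICATE_TYPE = "https://in-toto.io/attestation/reference/v0.1"


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_reference_statement(subject_name, subject_bytes, referrer_id, references):
    """Producer side: point from a subject to out-of-band documents."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": [{"name": subject_name,
                     "digest": {"sha256": sha256_hex(subject_bytes)}}],
        "predicateType": REFERENCE_PREDICATE_TYPE,
        "predicate": {"referrer": {"id": referrer_id}, "references": references},
    }


def find_reference(statement, data: bytes):
    """Consumer side: return the reference whose sha256 matches `data`, else None."""
    wanted = sha256_hex(data)
    for ref in statement["predicate"]["references"]:
        if ref.get("digest", {}).get("sha256") == wanted:
            return ref
    return None


# Hypothetical artifact and SBOM content, for illustration only.
sbom = b'{"spdxVersion": "SPDX-2.3"}'
ref_statement = make_reference_statement(
    "foo.jar", b"jar bytes", "http://example.com/sbom_generator",
    [{"digest": {"sha256": sha256_hex(sbom)},
      "downloadLocation": "http://example.com/sbom_storage/abc123.spdx.json",
      "mediaType": "application/spdx+json"}],
)

print(find_reference(ref_statement, sbom) is not None)   # downloaded SBOM matches
print(find_reference(ref_statement, b"tampered") is None)  # altered content does not
```

The consumer verifies the signature on the statement first, then downloads the referenced document and checks it against the digest, so the out-of-band data never needs to be embedded in the attestation.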

mdeicas commented 4 months ago

I’m happy to create the proposal for the new predicate type. Before that, though, I’d like us to agree on whether the use case (referring to and signing out-of-band metadata) is best served by a new predicate or can be handled by SCAI, as @marcelamelara mentioned. Using SCAI would involve defining a new attribute for each type of referred metadata.

SCAI could technically be used, but I’m not sure if it is the best fit. My main concern is where the SCAI attributes would be defined, or even if there is a place to do so. Beyond that, the SCAI specification is a larger prerequisite and is more geared towards functional attributes of artifacts, and not all metadata can be considered “functional”.

On the other hand, defining the new predicate type does lead to some duplication on how to address the use case, given that SCAI could technically be used.

lumjjb commented 4 months ago

To add another data point, we presented (https://sched.co/1YeQW) at KubeCon EU this year about such a predicate that we used for SBOMs, which seems to echo the exact use case requested here. We use something similar to what @TomHennen has proposed here.

I think this is a slightly different framing from the witness/provenance materials, and more closely resembles an in-toto link predicate (although the link predicate is a bit generic, so having a defined predicate will help consumption and adoption by being opinionated about how fields should be used).

I can see where folks are coming from, and indeed there is "encodability" overlap between the different predicates, but I agree with the sentiment that the intent here is different. The fields required for supporting documents vs. source code attestations vs. security statements are different, and I believe it is helpful to be intentional in defining each of them. Otherwise, we may see issues with too many optional fields (which also have different uses depending on context), and that can cause confusion. I'll use SPDX as an example (since I am a contributor and thus an offender), where a "Package" can refer to many things (sources, application bundles) and has many optional fields depending on what it represents (which 3.0 does try to solve by providing a bit more specificity through subclassing, but it still suffers from legacy convention).