sigstore / cosign

Code signing and transparency for containers and binaries
Apache License 2.0

Spec: Attestations & keyless mode for binaries #1743

Open asraa opened 2 years ago

asraa commented 2 years ago

Description

This is a proposal to discuss the general Sigstore story and functionality around binaries and blobs. Many package managers are planning on adopting Sigstore tooling, especially using keyless mode as they spin up their own identity providers.

Problem: How should devs who sign packages and create attestations with Sigstore keyless mode distribute (1) the binary, (2) the signed provenance or signatures, and (3) cosign bundles, signing certs, etc.?

Ideas: Right now, devs need to provide all the required material separately to the tools that do verification. cosign already defines a format, derived from the OCI spec, for attaching a .att file that describes the verification material needed. For blobs, however, we don't have a digest "reference" to retrieve the blob. Otherwise, this would be a great "manifest" to pass in to verifiers, and one that already has a specification.

If we can optionally include attestations inline in the annotations of each layer (e.g. dev.sigstore.cosign/envelope), then we can supply only this manifest file instead of a collection of Sigstore "stuff" plus the attestation. For basic signatures, we already include dev.cosignproject.cosign/signature in the manifest.

This also has the benefit of supporting multiple attestations and sigs.
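A minimal sketch of what such a self-contained manifest could look like, assuming one layer per DSSE envelope with the envelope carried inline in a layer annotation. The annotation key is the one floated in the proposal, not an existing cosign convention, and the helper names are hypothetical. A real OCI manifest would also carry a config descriptor, omitted here.

```python
import base64
import hashlib
import json

# Annotation key suggested in the proposal above; hypothetical, not a shipped key.
ENVELOPE_ANNOTATION = "dev.sigstore.cosign/envelope"


def envelope_layer(envelope: dict) -> dict:
    """Describe one attestation envelope and carry its bytes inline."""
    raw = json.dumps(envelope, sort_keys=True).encode()
    return {
        "mediaType": "application/vnd.dsse.envelope.v1+json",
        "digest": "sha256:" + hashlib.sha256(raw).hexdigest(),
        "size": len(raw),
        "annotations": {ENVELOPE_ANNOTATION: base64.b64encode(raw).decode()},
    }


def build_manifest(envelopes: list) -> dict:
    """One manifest file listing any number of attestations."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "layers": [envelope_layer(e) for e in envelopes],
    }


def inline_envelopes(manifest: dict) -> list:
    """Recover the envelopes, checking each against its layer digest."""
    found = []
    for layer in manifest["layers"]:
        raw = base64.b64decode(layer["annotations"][ENVELOPE_ANNOTATION])
        if "sha256:" + hashlib.sha256(raw).hexdigest() != layer["digest"]:
            raise ValueError("inline envelope does not match layer digest")
        found.append(json.loads(raw))
    return found
```

A verifier handed only this one file can pull out every envelope without talking to a registry, which is the point of the inline option.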

cc @laurentsimon

dlorenc commented 2 years ago

+1 on speccing this out! I think we had another issue somewhere for it, but I'm not sure where.

asraa commented 2 years ago

It's also somewhat relevant to that issue (which I also can't track down anywhere) on having wrapper envelopes around DSSE to provide PKI. The idea would be to provide the cosign-defined manifest, which includes all of the above plus the offline verification bundle.

lumjjb commented 2 years ago

Would it be a stretch to include SBOM in the discussion as well? Probably will be an orthogonal problem later on, and may be helpful to have common mechanisms to influence SBOM provenance as well.

dlorenc commented 2 years ago

SBOM is somewhat orthogonal - it's a thing that can be attested just like a binary blob. I think what Asra proposed here would just work with SBOMs as well.

laurentsimon commented 2 years ago

> Would it be a stretch to include SBOM in the discussion as well? Probably will be an orthogonal problem later on, and may be helpful to have common mechanisms to influence SBOM provenance as well.

+1. SBOMs are just one type of attestation, and they should be discoverable/accessible from the main .att like other attestations (if we go that route).

Adding some links I was provided and may be relevant for the discussion: https://github.com/in-toto/attestation/blob/main/spec/predicates/spdx.md https://github.com/in-toto/attestation/blob/main/spec/bundle.md

laurentsimon commented 2 years ago

ah, Dan beat me to it :-)

laurentsimon commented 2 years ago

Something else I wanted to add. While you may be able to include SBOM inline, I see a reason for not doing this in practice. A lot of users will start generating SBOMs that are not part of a provenance, and users typically copy the file next to an artifact as artifact.sbom - or dockerimage:sha256-xxxx.sbom.

Users may want to have easy access to these human-readable SBOMs (or other scans) without going through the pain of parsing attestations or using a tool to do it.

In addition, SBOMs will gain traction on their own, so we ought not to break practices and tools that build upon them today. So I think being able to support a "reference" to an SBOM (or other artifacts) is useful, even in the context of file storage. This way, when provenance is more widely adopted, we can simply refer to SBOMs as a separate file, instead of asking users to adopt yet another set of tools/practices to read them... unless they want to verify the signature. (I agree we want them to do that, but there are cases where I just want to download an SBOM from a GH release and use that.)

The .att could contain all the references... but I don't know if there's a way to indicate that the reference is not on a registry.

Not saying we should follow what I said, just an additional thing to think about. There's clearly value to have everything inline as well :-)

mlieberman85 commented 2 years ago

> Something else I wanted to add. While you may be able to include SBOM inline, I see a reason for not doing this in practice. A lot of users will start generating SBOMs that are not part of a provenance, and users typically copy the file next to an artifact as artifact.sbom - or dockerimage:sha256-xxxx.sbom.
>
> Users may want to have easy access to these human-readable SBOMs (or other scans) without going through the pain of parsing attestations or using a tool to do it.

There are also a few other reasons from my perspective:

What I have done in the past is view the SBOM like any other artifact: I attest to that artifact. The areas I think could use some work are around how to follow that attestation to

tiziano88 commented 2 years ago

I'm not 100% sure it falls under this same feature request, but I would like to be able to use a hybrid between cosign attest and cosign sign-blob; I want to provide an arbitrary in-toto statement, have it signed by cosign (e.g. via keyless signing), and upload to rekor in the same way as rekor upload --type=intoto would.

Something similar to what this e2e test is doing: https://github.com/sigstore/rekor/blob/ed3b98f749436002ff7b17c4de16ac4b6afbebe3/tests/e2e_test.go#L443 , but with keyless signing. Not sure whether this falls under rekor or cosign though.
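To make the request concrete, here is a rough sketch of the data involved: an in-toto statement about a blob, wrapped in a DSSE envelope. The statement layout and the pre-authentication encoding follow the in-toto and DSSE specs as I understand them; the `sign` callable is a stand-in for whatever keyless signing would provide, and `placeholder_sign` is NOT a real signature. None of this is an existing cosign command or API.

```python
import base64
import hashlib
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def make_statement(blob: bytes, name: str, predicate_type: str, predicate: dict) -> dict:
    """Build an in-toto statement whose subject is the blob's sha256 digest."""
    return {
        "_type": "https://in-toto.io/Statement/v0.1",
        "subject": [{"name": name, "digest": {"sha256": hashlib.sha256(blob).hexdigest()}}],
        "predicateType": predicate_type,
        "predicate": predicate,
    }


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode()
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def wrap(statement: dict, sign) -> dict:
    """Serialize the statement and wrap it in a DSSE envelope."""
    payload = json.dumps(statement).encode()
    signature = sign(pae(PAYLOAD_TYPE, payload))
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode(),
        "signatures": [{"sig": base64.b64encode(signature).decode()}],
    }


def placeholder_sign(message: bytes) -> bytes:
    """Stand-in only: a plain hash, offering no security whatsoever."""
    return hashlib.sha256(b"not-a-real-key" + message).digest()
```

For example, `wrap(make_statement(data, "my-binary", "https://example.com/custom/v1", {"ok": True}), placeholder_sign)` yields an envelope of the shape a transparency log entry of type intoto would carry; the predicate type URL here is made up.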

mlieberman85 commented 2 years ago

> I'm not 100% sure it falls under this same feature request, but I would like to be able to use a hybrid between cosign attest and cosign sign-blob; I want to provide an arbitrary in-toto statement, have it signed by cosign (e.g. via keyless signing), and upload to rekor in the same way as rekor upload --type=intoto would.
>
> Something similar to what this e2e test is doing: https://github.com/sigstore/rekor/blob/ed3b98f749436002ff7b17c4de16ac4b6afbebe3/tests/e2e_test.go#L443 , but with keyless signing. Not sure whether this falls under rekor or cosign though.

So, does it not allow you to do that now? I think you can just use a custom attestation e.g. cosign attest --type=custom ... I might be misunderstanding though.

tiziano88 commented 2 years ago

Thanks, I think your suggestion would indeed at least allow specifying a custom in toto attestation format, which is a good starting point.

But I wasn't able to do that for binary blobs (only for containers), nor in keyless mode.

dlorenc commented 2 years ago

I think there's some confusion in the upload flow today that needs to get resolved as we remove the COSIGN_EXPERIMENTAL variable. Right now that implies both uploading to the log and keyless. Uploads need to be controlled separately from keyless. I think we do this right today, but it's confusing and misleading.
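As a toy illustration of the distinction (not cosign's actual option handling), the two behaviours can be modelled as independent switches, with the single environment variable shown as a legacy shim that flips both. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignOptions:
    keyless: bool = False      # get a short-lived cert instead of using a local key
    upload_tlog: bool = False  # record the signature in the transparency log


def from_legacy_env(env: dict) -> SignOptions:
    """Model of the coupling described above: one variable implies both."""
    enabled = env.get("COSIGN_EXPERIMENTAL") == "1"
    return SignOptions(keyless=enabled, upload_tlog=enabled)


# With separate switches, a key-based signature can still be logged.
key_based_but_logged = SignOptions(keyless=False, upload_tlog=True)
```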

asraa commented 2 years ago

> Thanks, I think your suggestion would indeed at least allow specifying a custom in toto attestation format, which is a good starting point.
>
> But I wasn't able to do that for binary blobs (only for containers), nor in keyless mode.

Right, currently you can't sign and create an attestation to a blob that you haven't uploaded to an OCI registry. We've had to implement this outside of sigstore/cosign.

Just an update on the format for an on-disk bundle for binary blob signing: I think we're converging on either adding the info inline to an OCI image manifest JSON format, or a custom in-toto predicate with materials referencing objects (e.g. attestations) and material like signing certs. I'll talk more at the community meeting.

asraa commented 2 years ago

Update: After talking to lots of folks about this problem, I think the path forward is to tackle these problems separately:

  1. Where do we store or attach verification material for attestations (esp relevant for keyless mode)?
    • Implement something like ITE 7: add x509 info and timestamp fields to the in-toto Go implementation, to store signing certs and Rekor bundles respectively
    • Add support for keyless signing of predicates and for outputting the DSSE to disk with the cert and Rekor info inside, per the above
  2. Where do we discover or reference attestations stored elsewhere? (e.g. large SBOMs)
    • TBD, still needs thought
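A sketch of the first item, assuming the verification material rides alongside each signature inside the DSSE envelope. The field names `cert` and `rekor_bundle` are placeholders chosen for illustration; the actual names would be whatever the ITE and the in-toto implementation settle on.

```python
import copy


def attach_verification_material(envelope: dict, cert_pem: str,
                                 rekor_bundle: dict, index: int = 0) -> dict:
    """Return a copy of the envelope with cert and log info on one signature.

    The signed bytes cover only payloadType and payload, so adding sibling
    fields next to "sig" leaves the existing signature untouched.
    """
    out = copy.deepcopy(envelope)
    signature = out["signatures"][index]
    signature["cert"] = cert_pem              # hypothetical field name
    signature["rekor_bundle"] = rekor_bundle  # hypothetical field name
    return out
```

Written to disk as a single JSON file, such an envelope would give an offline verifier the payload, the signature, the signing cert, and the log inclusion data in one place.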

@lumjjb @laurentsimon @MarkLodato

tiziano88 commented 2 years ago
>   2. Where do we discover or reference attestations stored elsewhere? (e.g. large SBOMs)
>     • TBD, still needs thought

IMO at this level we should just refer to attestations (or, in fact, literally anything else, including artifacts) by their hash (in particular, never URLs or file names). Then mapping a hash to where to find the thing with that hash at a particular point in space/time becomes a separate problem that can be handled independently (plug for https://github.com/google/ent ). i.e. everything should be content-addressed, instead of location-addressed.
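Roughly what that separation could look like, as a sketch under my own assumptions (this is not the API of the linked project): references are bare digests, and a resolver tries any number of locations, accepting bytes only if they hash to the requested digest.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


class ContentResolver:
    """Maps a digest to candidate fetchers; where the bytes live is untrusted."""

    def __init__(self):
        self._fetchers = {}  # digest -> list of zero-argument callables

    def register(self, digest: str, fetch) -> None:
        self._fetchers.setdefault(digest, []).append(fetch)

    def get(self, digest: str) -> bytes:
        for fetch in self._fetchers.get(digest, []):
            data = fetch()
            # Accept a location's answer only if the content matches the hash.
            if data is not None and sha256_digest(data) == digest:
                return data
        raise KeyError(digest)
```

Because the check is on content, a stale or malicious location can at worst fail to deliver; it cannot substitute different bytes for the referenced attestation.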

laurentsimon commented 2 years ago

Mapping only by content hash is not always reliable, because it's easy to get a builder to output any hash. So I could attach a predicate to an openssl build claiming its SBOM has plenty of unpatched dependencies, for example. You need some indexing using the source (commit hash), right? Ah, you seem to be proposing something similar: https://github.com/sigstore/rekor/issues/792

laurentsimon commented 2 years ago

Another possible feature to discuss: use of the Rekor UUID in the attestation. This could be a custom client field that is not included in the final signed payload, but is used as a hint to look up the actual provenance content. The benefit is that it avoids hitting Redis (which has been having issues recently) and only requires the Trillian log, which appears to be more reliable since it's been running in the CT for several years. There's a question about canonicalization of the payload after removing the custom field. This may create issues...?

laurentsimon commented 2 years ago

@asraa told me offline that the certificate would be enough to query Trillian directly, since Trillian_entry=content+certificate, and we can therefore query Trillian without hitting Redis. Please ignore my suggestion above :-)