sigstore / sigstore-python

A Sigstore client written in Python
https://pypi.org/p/sigstore

Support DSSE-style enveloped signatures #628

Closed woodruffw closed 6 months ago

woodruffw commented 1 year ago

This is part of Sigstore bundle support: we currently only support "raw" signatures, while some users of Sigstore may choose to use enveloped DSSE-style signatures.

From protobuf-specs:

// An authenticated message of arbitrary type.
message Envelope {
  // Message to be signed. (In JSON, this is encoded as base64.)
  // REQUIRED.
  bytes payload = 1;

  // String unambiguously identifying how to interpret payload.
  // REQUIRED.
  string payloadType = 2;

  // Signature over:
  //     PAE(type, payload)
  // Where PAE is defined as:
  // PAE(type, payload) = "DSSEv1" + SP + LEN(type) + SP + type + SP + LEN(payload) + SP + payload
  // +               = concatenation
  // SP              = ASCII space [0x20]
  // "DSSEv1"        = ASCII [0x44, 0x53, 0x53, 0x45, 0x76, 0x31]
  // LEN(s)          = ASCII decimal encoding of the byte length of s, with no leading zeros
  // REQUIRED (length >= 1).
  repeated Signature signatures = 3;
}

message Signature {
  // Signature itself. (In JSON, this is encoded as base64.)
  // REQUIRED.
  bytes sig = 1;

  // *Unauthenticated* hint identifying which public key was used.
  // OPTIONAL.
  string keyid = 2;
}

...where payload is the to-be-signed attestation statement.
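For readers who want to see the PAE step concretely, here is a minimal sketch of the encoding described in the comment on `signatures` above. The function name `pae` is just an illustrative choice, and the sketch assumes the payload type is encoded as UTF-8 before its byte length is taken.

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """Pre-Authentication Encoding as described in the Envelope comment.

    PAE(type, payload) = "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload
    """
    # Assumption: the type string is encoded as UTF-8 before measuring it.
    type_bytes = payload_type.encode("utf-8")
    return b" ".join(
        [
            b"DSSEv1",
            # LEN(s): ASCII decimal byte length, no leading zeros.
            str(len(type_bytes)).encode("ascii"),
            type_bytes,
            str(len(payload)).encode("ascii"),
            payload,
        ]
    )


if __name__ == "__main__":
    print(pae("http://example.com/HelloWorld", b"hello world"))
    # b'DSSEv1 29 http://example.com/HelloWorld 11 hello world'
```

The signature in each `Signature.sig` is computed over these bytes rather than over `payload` directly, which is what binds `payloadType` to the signed content.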

xref https://github.com/sigstore/fulcio/issues/1131

Work tracker:

woodruffw commented 1 year ago

xref https://github.com/sigstore/fulcio/issues/1131 for original motivating context.

woodruffw commented 1 year ago

Starting on this now -- we'll need it for uploading and retrieving build provenance attestations for Homebrew.

woodruffw commented 1 year ago

Thinking about this some more, I think the "right" way to do this is similar to what sigstore-js has done:

  1. We need a new sigstore-rekor-types (or similar) package that exposes Rekor's OpenAPI types as Python types
  2. We should use sigstore-rekor-types here to standardize our interactions with Rekor's REST API (maybe replacing our homespun client entirely?)

TomHennen commented 1 year ago

Will this be based on https://github.com/secure-systems-lab/dsse/pull/61? (Apologies if I should have known that already, the change is spread out across a number of places)

woodruffw commented 1 year ago

Will this be based on https://github.com/secure-systems-lab/dsse/pull/61? (Apologies if I should have known that already, the change is spread out across a number of places)

I haven't had a chance to look at that recently, so I'm not sure 🙂. I'll try and take another look this afternoon/evening.

As a roundabout way of answering: the Sigstore clients are generally producing/consuming Sigstore bundles as their top level "unit of work," so signing is likely to continue doing that. For verification, we'll want to support a Sigstore bundle containing a DSSE-formatted signature, but we may also support the inverse form (DSSE containing a Sigstore bundle).

(But all of that is entirely up in the air -- first I need to actually get the data models working, followed by adding client-side support for Rekor's DSSE type, etc.)

woodruffw commented 1 year ago

I've created https://github.com/trailofbits/sigstore-rekor-types to supply auto-generated models for Rekor's types.

jku commented 10 months ago

xref https://github.com/sigstore/fulcio/issues/1131 for original motivating context.

Could you expand on this and the use case in general -- I assume you want to create attestations over the same content but with different predicate data? I assume we want to implement this inside sigstore-python since rekor supports in-toto attestations, correct?

How does this work on the verifying side (e.g. how does the predicate content get verified)? Do you expect this to be done by some in-toto verification system? I think cosign has something built in (cosign verify-attestation --policy policy.rego ...)

Am I correct that this is more accurately "Support intoto attestation signing" than DSSE -- in-toto just happens to use DSSE? Non-intoto use cases don't make sense here, right?

jku commented 10 months ago

Will this be based on https://github.com/secure-systems-lab/dsse/pull/61? (Apologies if I should have known that already, the change is spread out across a number of places)

...

the Sigstore clients are generally producing/consuming Sigstore bundles as their top level "unit of work," so signing is likely to continue doing that. For verification, we'll want to support a Sigstore bundle containing a DSSE-formatted signature, but we may also support the inverse form (DSSE containing a Sigstore bundle).

I don't have a dog in the race but it feels like embedding DSSE within the bundle and making bundles embeddable in DSSE sound like competing proposals in practice (from an interoperability perspective at the very least). Has this been discussed anywhere?

woodruffw commented 10 months ago

Could you expand on this and the use case in general -- I assume you want to create attestations over the same content but with different predicate data? I assume we want to implement this inside sigstore-python since rekor supports in-toto attestations, correct?

Yep, exactly -- I originally opened sigstore/fulcio#1131 because I didn't really understand which "layer" those predicates (I think of them as additional "relying party" metadata) should appear at.

My understanding is that Rekor supports in-toto attestations in two forms: there are intoto entries (which are deprecated(?)/discouraged since they include arbitrary metadata in the log entry itself) and dsse entries, which are essentially hashedrekord entries but with a formally structured (in-toto attestation/statement) input format rather than arbitrary bytes.

How does this work on the verifying side (e.g. how does the predicate content get verified)? Do you expect this to be done by some in-toto verification system? I think cosign has something built in (cosign verify-attestation --policy policy.rego ...)

Yeah, something like that is what I was thinking. This connects to my larger conceptual concerns around DSSE/in-toto, i.e. that there's an unsolved "policy" layer that end users will need to grapple with.

(This is part of why I'm only exposing this at the API level for now -- I think a user-friendly CLI for this is going to require a larger design effort/more thinking.)

Am I correct that this is more accurately "Support intoto attestation signing" than DSSE -- in-toto just happens to use DSSE? Non-intoto use cases don't make sense here, right?

I think it's accurate to say DSSE here, since the work (at least in #804) will produce and verify dsse entries, rather than intoto entries. Those DSSE entries can technically be arbitrary bytes, but so far I've set it up to only accept an in-toto statement (and I believe Rekor itself will reject the proposed entry if it isn't formatted as a statement).

(That being said, I'm low confidence on the terminology here.)
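To make the layering concrete, here is a rough sketch of an in-toto statement wrapped in a DSSE envelope, using the JSON field names from the protobuf message quoted at the top of this issue. The helper names are hypothetical, the `_type` and `payloadType` values are the in-toto Statement v1 type and media type as I understand them, and the signature bytes are a placeholder: no signing happens here, and this is not how sigstore-python builds its envelopes.

```python
import base64
import hashlib
import json

# Values taken from the in-toto specification as I understand it (assumption).
STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def make_statement(name: str, content: bytes, predicate_type: str, predicate: dict) -> dict:
    """Build a statement binding one subject (by SHA-256 digest) to a predicate."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": [
            {"name": name, "digest": {"sha256": hashlib.sha256(content).hexdigest()}}
        ],
        "predicateType": predicate_type,
        "predicate": predicate,
    }


def make_envelope(statement: dict, sig: bytes) -> dict:
    """Wrap a statement in a DSSE envelope (JSON form, bytes fields base64-encoded).

    `sig` is a placeholder here; a real signature is computed over
    PAE(payloadType, payload), not over the payload alone.
    """
    payload = json.dumps(statement).encode("utf-8")
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": IN_TOTO_PAYLOAD_TYPE,
        "signatures": [{"sig": base64.b64encode(sig).decode("ascii")}],
    }


if __name__ == "__main__":
    stmt = make_statement(
        "example.tar.gz", b"artifact bytes", "https://example.com/predicate/v1", {"built_by": "ci"}
    )
    print(json.dumps(make_envelope(stmt, b"placeholder-signature"), indent=2))
```

The envelope itself carries no constraint on what `payload` contains; the restriction to statements is a choice made by whoever produces or accepts the envelope.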

I don't have a dog in the race but it feels like embedding DSSE within the bundle and making bundles embeddable in DSSE sound like competing proposals in practice (from an interoperability perspective at the very least). Has this been discussed anywhere?

It also feels that way to me. I don't think it's been discussed anywhere directly, other than a (brief) discussion to confirm that doing this isn't a priori incompatible with either format.

I think it'd be really nice to have just one format here -- having the two be mutually nestable means more conditions/error states for implementing clients.

woodruffw commented 9 months ago

Signing is now in, via #804. I'll work on verification next.

woodruffw commented 9 months ago

For verification, I'm realizing that VerificationMaterials will also need to be removed, similar to how we removed SigningResult. Everything will just take a Bundle instead. I'll make that a separate PR.

laurentsimon commented 9 months ago

I'm interested in this feature! I've tested the signing part of the API already.

Any idea about the timeline for verification?

jku commented 9 months ago

For verification, I'm realizing that VerificationMaterials will also need to be removed, similar to how we removed SigningResult. Everything will just take a Bundle instead.

FWIW that matches 100% how we ended up using the API in securesystemslib SigstoreSigner: VerificationMaterials and SigningResult are not used in any way except as intermediate steps to/from Bundle.

woodruffw commented 9 months ago

Any idea about the timeline for verification?

I'll have some time today and next week to look at it, but no hard promises 🙂

The main complexity here is removing VerificationMaterials, which will likely imply the need to remove --certificate and a bunch of the other legacy non-bundle CLI bits (or pre-transform them into a Bundle, although I'm not a gigantic fan of that).

laurentsimon commented 9 months ago

Any idea about the timeline for verification?

I'll have some time today and next week to look at it, but no hard promises 🙂

The main complexity here is removing VerificationMaterials, which will likely imply the need to remove --certificate and a bunch of the other legacy non-bundle CLI bits (or pre-transform them into a Bundle, although I'm not a gigantic fan of that).

Thanks. No urgent rush, just want to have a rough idea. So in ~1 month it should be landed

woodruffw commented 9 months ago

Thanks. No urgent rush, just want to have a rough idea. So in ~1 month it should be landed

Yeah, I'll have a draft PR up well before that.

laurentsimon commented 9 months ago

Please tag me on the draft PR when you send it. Thank you!

woodruffw commented 8 months ago

Will do! Sorry, this got popped off the stack; I'll try and slice some hours for it this coming week.

mihaimaruseac commented 8 months ago

Can you also tag me, please? Interested in this for signing ML models (https://github.com/google/model-transparency)

woodruffw commented 8 months ago

Just merged https://github.com/sigstore/sigstore-python/pull/904, which gets the verification APIs into shape for DSSE sigs. I have one more churn/refactor change to work through before actually adding DSSE verification support, but once that's done it should be pretty straightforward.

woodruffw commented 8 months ago

After some discussion in the clients channel, I've come to the conclusion that my original design here (fitting DSSE into the current sign and verify APIs) is probably a bad idea: DSSE signing mostly fits, but verification of a Sigstore bundle containing a DSSE envelope is fundamentally different (since the bundle is fully self-contained, and isn't compared to an external "ground truth" like an input file). DSSE verification needs to return the extracted statement for subsequent policy verification, which also means that it doesn't cleanly fit into the current verify(...) -> VerificationResult API.

Rather than introduce user confusion by trying to mash the two together, I'm going to separate the DSSE counterparts into sign_dsse and verify_dsse, with appropriate signatures.
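As a rough illustration of the "subsequent policy verification" step mentioned above: once a verifier hands back the extracted statement, the caller still has to decide whether that statement is about the right artifact and carries an acceptable predicate. The function below is a hypothetical sketch that treats the statement as a plain dict with in-toto-style `subject` and `predicateType` fields; it is not part of any sigstore-python API.

```python
import hashlib


def check_statement_policy(
    statement: dict, artifact: bytes, allowed_predicate_types: set[str]
) -> bool:
    """Toy policy: the statement must name the artifact and use an allowed predicate type.

    Assumes subjects carry a "sha256" digest; other digest algorithms are ignored.
    """
    digest = hashlib.sha256(artifact).hexdigest()
    subjects = statement.get("subject", [])
    names_artifact = any(
        subject.get("digest", {}).get("sha256") == digest for subject in subjects
    )
    return names_artifact and statement.get("predicateType") in allowed_predicate_types


if __name__ == "__main__":
    statement = {
        "subject": [
            {"name": "pkg.whl", "digest": {"sha256": hashlib.sha256(b"wheel").hexdigest()}}
        ],
        "predicateType": "https://example.com/predicate/v1",
    }
    print(check_statement_policy(statement, b"wheel", {"https://example.com/predicate/v1"}))
```

This is the part that a "raw" signature check never needs, because there the input file itself is the ground truth being compared against.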

woodruffw commented 8 months ago

To round it out, here's what I'm thinking:

  1. sign and verify in the current API become sign_contents and verify_contents, with their signatures otherwise unchanged.
  2. sign_dsse and verify_dsse get added, with the following signatures:

    sign_dsse(self, stmt: dsse.Statement) -> Bundle
    verify_dsse(self, bundle: Bundle) -> dsse.Statement

    This will somewhat ossify the assumption that DSSE payloads are always in-toto Statements, which is maybe hasty. So it could be this instead:

    verify_dsse(self, bundle: Bundle) -> [PayloadType, bytes]

    ...but I don't have a clear intuition for whether this is preferable.
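One way to weigh the two return shapes is to look at what the caller has to write. With the second shape, the caller receives the payload type and raw bytes and must decide how to interpret them. The sketch below is hypothetical: it models `[PayloadType, bytes]` as a `(str, bytes)` pair, and the helper name and media type constant are my own assumptions rather than anything proposed in this issue.

```python
import json

# in-toto media type as I understand it (assumption).
IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def interpret_payload(payload_type: str, payload: bytes) -> dict:
    """Caller-side handling for a verify_dsse that returns (payload type, bytes).

    Every caller that only understands in-toto statements needs a check like
    this one; a verify_dsse that returns a Statement would do it internally.
    """
    if payload_type != IN_TOTO_PAYLOAD_TYPE:
        raise ValueError(f"unsupported DSSE payload type: {payload_type!r}")
    return json.loads(payload)


if __name__ == "__main__":
    statement = interpret_payload(
        IN_TOTO_PAYLOAD_TYPE, b'{"_type": "https://in-toto.io/Statement/v1"}'
    )
    print(statement["_type"])
```

The statement-returning shape keeps this dispatch out of user code, at the cost of baking the in-toto assumption into the API.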

CC @jku @laurentsimon @mihaimaruseac for any opinions 🙂

jku commented 8 months ago

General idea makes sense to me: I was also wondering how the verification API can stay consistent -- since it likely can't, the sign APIs can/should be separated as well.

On your question: if adding alternatives to intoto statements is going to mean an API change in practice anyway (for verification you can hide the API change behind [payloadType, bytes] but realistically it's an API change for both signers and verifiers), then maybe we just make it simple and call this intoto signing or statement signing -- that way new alternatives are likely to have just their own new methods instead of changing API.

laurentsimon commented 8 months ago

LGTM as well to have separate APIs.

This will somewhat ossify the assumption that DSSE payloads are always in-toto Statements, which is maybe hasty.

I'm interested in being able to use DSSE with non-intoto statements, e.g. for model manifest signing https://github.com/google/model-transparency/issues/111 (intoto does not fit the use case very well). @mihaimaruseac wdyt?

This may also allow us to sign other AI types like dataset's croissant in DSSE, without the need to store signatures separately, which means the signature would travel along the content - the flip side is that it's a breaking change for existing users of these data types

woodruffw commented 8 months ago

On your question: if adding alternatives to intoto statements is going to mean an API change in practice anyway (for verification you can hide the API change behind [payloadType, bytes] but realistically it's an API change for both signers and verifiers), then maybe we just make it simple and call this intoto signing or statement signing -- that way new alternatives are likely to have just their own new methods instead of changing API.

Good point. Given that @laurentsimon has a non-intoto use case, I think I'm going to borrow a technique from PyCA cryptography here and do the following:

mihaimaruseac commented 8 months ago

I think this makes sense. A higher-level API for most use cases and a lower-level one for users that want more flexibility and know what they're doing.

laurentsimon commented 7 months ago

Is there a chance this feature may be landed for OSS@NA? By "landed", I mean a PR and a non-official dev release for testing purposes. We'd like to demo the model signing work. This is not blocking, so if it's not possible no worries.

woodruffw commented 7 months ago

Is there a chance this feature may be landed for OSS@NA? By "landed", I mean a PR and a non-official dev release for testing purposes. We'd like to demo the model signing work. This is not blocking, so if it's not possible no worries.

Yes, but only if I can get someone's approving review on #937 and the subsequent PRs that'll need to be built on that 😅 -- I don't have the power to merge my own work, so I'll need someone else to have a relatively tight review cycle with me.

laurentsimon commented 7 months ago

Is there a chance this feature may be landed for OSS@NA? By "landed", I mean a PR and a non-official dev release for testing purposes. We'd like to demo the model signing work. This is not blocking, so if it's not possible no worries.

Yes, but only if I can get someone's approving review on #937 and the subsequent PRs that'll need to be built on that 😅 -- I don't have the power to merge my own work, so I'll need someone else to have a relatively tight review cycle with me.

SG. Let's see if @jku is up for the challenge :)

jku commented 7 months ago

sure, working on it. 937 seems good so far, it's just a big patch.

woodruffw commented 7 months ago

Just to summarize/update here, here's the current stack:

woodruffw commented 7 months ago

@laurentsimon @mihaimaruseac: FYI: #962 contains the initial verify APIs for DSSE, and there's a sample script for you to try locally in the comments!

woodruffw commented 6 months ago

962 was the last piece of this, so closing!

mihaimaruseac commented 6 months ago

Thank you very much for all of this! This is great!

laurentsimon commented 6 months ago

Thank you so much! I think there are some limitations on what digest types are allowed in the subject. I'll check the merged PR and if that's still the case, I will create a tracking issue for it. Thanks again for landing this feature!

woodruffw commented 6 months ago

Thank you so much! I think there are some limitations on what digest types are allowed in the subject. I'll check the merged PR and if that's still the case, I will create a tracking issue for it. Thanks again for landing this feature!

Yeah, this wouldn't surprise me 🙂 -- I suspect that SHA2-512, etc. won't currently work.

laurentsimon commented 6 months ago

Thank you so much! I think there are some limitations on what digest types are allowed in the subject. I'll check the merged PR and if that's still the case, I will create a tracking issue for it. Thanks again for landing this feature!

Yeah, this wouldn't surprise me 🙂 -- I suspect that SHA2-512, etc. won't currently work.

In our case we use a custom merkle tree hash. That's fine for the demo zo