sigstore / rekor

Software Supply Chain Transparency Log
https://sigstore.dev
Apache License 2.0
894 stars 164 forks source link

Checkpoint spec compliance: Incorrect key ID for non-ECDSA keys #2062

Open haydentherapper opened 7 months ago

haydentherapper commented 7 months ago

Background

https://github.com/C2SP/C2SP/blob/main/signed-note.md defines the specification for a signed note, which is the format of a checkpoint. Note that this specification was only recently created. When checkpoints were added to Rekor, we followed the code for Go's checksum db note format, which is the signed note format - docs, code

The signature is the following format:

— <key name> base64(32-bit key ID || signature)

where key ID is the following format, roughly the truncated hash of the key name, signature scheme, and public key:

key ID = SHA-256(key name || 0x0A || signature type || public key)[:4]

Go's checksum db note code only defines a verifier for ed25519 keys (code), again aligning with the spec, where signature type = 0x01. It does not support ecdsa keys.

For historical reasons (citation needed) and as noted in the spec, ecdsa key IDs are the truncated hash of the DER encoded public key in SPKI format (code, with the truncation happening here).

Issue

When Rekor generates a signed note, it generates the key ID as a truncated DER encoded public key hash regardless of signature type. ecdsa is spec compliant, but ed25519 and RSA is not. The impact of this is if we were to switch to ed25519 or RSA signing keys, Rekor checkpoints would not be verifiable by omniwitness, armored witness or any other upcoming witness implementations.

The other issue is that the c2sp spec doesn't define support for RSA. There is an issue tracking this request, but general consensus is a lack of interest in supporting RSA due to additional signature support for witnesses, RSA vulnerability concerns and key/signature size. For public instances that could join a witness network, we should strongly recommend using ecdsa or ed25519 keys. For private instances, RSA can still be supported using the 0xff identifier type as per the spec.

Fixes

Given the last time I made a change to the checkpoint format, dropping the timestamp, it impacted clients, I want to make sure we thoughtfully roll this out.

Clients will need to add support for parsing signed notes following the spec, particularly computing the key ID differently for ed25519 and RSA keys. If you want to avoid any chance of breakage for private deployments that are currently using these key types, you could fallback to the DER encoded public key ID.

However, I don't know if clients were using the key ID to verify checkpoints. If the client was simply dropping the 32-bit key ID from base64(32-bit key ID || signature) before verifying the signature, this change seems quite safe to do. For example, for sigstore-go (which uses the Rekor library), we're doing just that - Base64 is the signature without the key ID (Hash). Technically this isn't following the spec, as the key ID should be compared against a known key. That doesn't seem like a requirement for Sigstore though, more for witnesses, since we have a way of distributing trusted keys. (Edit: this is not true for other clients, so we will need client changes)

For server side changes, given the public instance has only used ecdsa, we could fix the format for ed25519 and RSA signing keys without any impact to clients. If we want to do this in a non-breaking way, we would need to introduce the new checkpoint format in another response field for the /log API, though I don't like that approach as much as it'll complicate client logic for fetching a checkpoint and prolong this fix.

haydentherapper commented 7 months ago

cc @bdehamer @loosebazooka @woodruffw (also @codysoyland, but this'll be handled for Go with the fix in Rekor, since sigstore-go uses Rekor's verifier)

Sorry for the long wall of text! The tl;dr is that I want to change the checkpoint key ID format for ed25519 and RSA keys. If your client is simply dropping the key ID, which is the prefix for the checkpoint signature, then this change is really just a no-op for your client. If you are parsing the key ID, then you could need to update to construct the key ID differently based on the signature type.

Let me know if this is going to be an issue and if we should go with a non-breaking slower rollout. This is not something we have to rush out, just something to fix before we have more public logs.

@woodruffw: For sigstore-python, it looks like you use the log ID (which is the sha256 hash of the DER encoded SPKI pub key) in https://github.com/sigstore/sigstore-python/blob/main/sigstore/_internal/rekor/checkpoint.py#L217-L219, and then compare here - https://github.com/sigstore/sigstore-python/blob/main/sigstore/_internal/rekor/checkpoint.py#L173-L174. so this would need to be changed

@bdehamer: For sigstore-js, https://github.com/sigstore/sigstore-js/blob/main/packages/verify/src/timestamp/checkpoint.ts#L91-L93 - looks like you do something similar, where you find the correct key based on the key ID/key hint and compare it to the log ID. So this would need to be updated

@loosebazooka - For sigstore-java, looks like you are comparing key IDs in https://github.com/sigstore/sigstore-java/blob/9ec514d10544f93083471f23c44c828b2ba3f9f4/sigstore-java/src/main/java/dev/sigstore/rekor/client/RekorVerifier.java#L158-L166, so this would need to be updated.

bdehamer commented 7 months ago

As @haydentherapper already noted, sigstore-js is parsing the key ID from the signature and using that to identify the appropriate key. My vote is definitely for a VERY slow rollout of any change which is going to break the current implementation.

The last change to the checkpoint format caused a lot of grief in npm land and I'm afraid that if we drop another breaking change again so soon after the last one we run the risk of package developers deciding that it's not worth the hassle.

haydentherapper commented 7 months ago

Sorry about that rollout, we had assumed the other content in the signed note was considered optional.

The good news with this change is that there would be no changes to metadata produced from the public instance, since it uses ecdsa and there would be no changes there. I am unaware of any other public logs.

To make sure we don't break anyone, let's do the client changes (I can file bugs in each repo to track the change), and after a few months, we can roll out the change here. We'll also document in Rekor that an ecdsa key should be preferred and must be chosen if you will host a public log. How does that sound?

loosebazooka commented 7 months ago

Wasn't there also something about using a different style of logid for rekor? I can't seem to remember details, but there was no guarantee it was going to continue to be a hash of the public key like CT.

woodruffw commented 7 months ago

Summarizing to make sure I understand:

  1. For non-ECDSA keys, the key ID is defined as SHA-256(key name || 0x0A || signature type || public key)[:4], where public key is DER(SPKI(raw public key))
  2. For ECDSA keys, the key ID is not defined, but Rekor chooses to use SHA-256(public key)[:4]
  3. sigstore-python currently only expects ECDSA-based tlogs, and so does (2) unconditionally

Is that correct?

haydentherapper commented 7 months ago

Wasn't there also something about using a different style of logid for rekor? I can't seem to remember details, but there was no guarantee it was going to continue to be a hash of the public key like CT.

Yea, the CT v2 RFC defined log IDs as OIDs rather than a hash of a public key. The idea being that you might want to reuse a public key across logs but have each log have its own identity. I don't think we'll move away from the log ID being the hash though as most tools expect this, so instead we'll rotate the key the next time we create a new log.

For non-ECDSA keys, the key ID is defined as SHA-256(key name || 0x0A || signature type || public key)[:4], where public key is DER(SPKI(raw public key)) For ECDSA keys, the key ID is not defined, but Rekor chooses to use SHA-256(public key)[:4]

Unfortunately, reading more, the public key encoding is a bit messy.

For ECDSA, it is defined in the spec. It wasn't Rekor that decided the different format for ecdsa, I believe it was either RFC 6962 or some of the original witness work, and the spec didn't want to break existing logs using ecdsa. Rekor's key ID is currently SHA256(DER(SPKI(raw pubkey))) - code

sigstore-python currently only expects ECDSA-based tlogs, and so does (2) unconditionally

If you aren't handling RSA or ed25519, then this is even better, as it wouldn't be a breaking client change.

haydentherapper commented 7 months ago

I just reread what I wrote, and it's kinda messy. so tldr

RSA is up for debate if anyone disagrees, since the spec doesn't define RSA keys.

Edit: Also, consider we will add PQ keys at some point, and by the spec, we'll need to reuse 0xff. If you want to do something else, let's decide now.

woodruffw commented 7 months ago

Wow, what a mess 😄

So, to round it off: clients that want to handle key IDs correctly will need to:

  1. Stop assuming that everything is ECDSA
  2. Propagate the key type from its representation in TrustedRoot.tlogs[].public_key
  3. Use the key type to determine the key ID algorithm to apply

Did I get that right? If so, that'll be a slightly involved patch to sigstore-python, but not too bad 🙂

RSA is up for debate if anyone disagrees, since the spec doesn't define RSA keys.

Please direct me to another issue if this is the wrong place, but: is there a demonstrated (public or private) desire for RSA-based tlogs? Given that RSA sigs are much larger and that RSA signing is generally slower, I was under the impression that it wasn't commonly used for tlogs (despite RFC 6962 allowing it).

With that being said, if RSA support is already decided, I have no objection to that key ID format.

haydentherapper commented 7 months ago

Stop assuming that everything is ECDSA

Correct, particularly adding support for ed25519.

Propagate the key type from its representation in TrustedRoot.tlogs[].public_key Use the key type to determine the key ID algorithm to apply

Correct. Note that at some point, we'll also want to handle 0x04 co-signed checkpoints, signed with ed25519 as per the spec. So the key ID gives you both the signature algorithm and the structure that you're verifying, a checkpoint vs witnessed checkpoint. But that's forward looking

Please direct me to another issue if this is the wrong place, but: is there a demonstrated (public or private) desire for RSA-based tlogs? Given that RSA sigs are much larger and that RSA signing is generally slower, I was under the impression that it wasn't commonly used for tlogs (despite RFC 6962 allowing it).

Actually the opposite, RSA is not preferred. I brought this up in https://github.com/C2SP/C2SP/issues/63, and the consensus was a) concerns about RSA vulns particularly for PKCS#1v1.5, b) key/sig size, c) another alg to support. I did point out RFC 6962 allows for it, but since RFC 6962 doesn't have support for checkpoints and C2SP is focused on improving Certificate Transparency, they're fine going off-spec.

So if we want to say no RSA for Rekor, I'm OK with that. Rekor does support it currently so we might get some pushback from private instances, so I'd want to ask around. But if we're already in the state where most clients are just supporting ecdsa, then it's not really a breaking change.

loosebazooka commented 7 months ago

I vote no rsa support unless it's defined in spec, we'd just be setting ourselves up for future incompatibility

bdehamer commented 7 months ago

I was just looking through the implementation in sigstore-js trying to understand what may need to change.

Currently, it grabs the first four bytes of the checkpoint signature and uses that to find the correct key from the TrustedRoot by looking at the first four bytes of the listed key IDs.

Image

At no point does it attempt to calculate the key ID.

Would this change if Rekor were to use an ed25519 key? Or could we assume that the key ID listed in the TrustedRoot would always reflect the appropriate format?

loosebazooka commented 7 months ago

Interesting, yeah so are logId and keyHint going to use the same transformation over the key? Is it safe to use logId? Or should it always be calculated?

haydentherapper commented 7 months ago

@bdehamer, that's a good point. For context, the reason we chose the log ID to be the SHA-256 hash of the DER encoded SPKI public key was to follow RFC 6962. Fulcio's CT log already set the log ID to be that, so we followed the same practice for Rekor. To one of @loosebazooka's questions earlier on, the log ID can be whatever we choose it to be. It probably shouldn't be the key, because otherwise you are forced to rotate the key with every log sharding to maintain unique IDs. But we were trying to follow an existing related spec.

For ed25519, we can either:

Impacting the SET verification is what concerns me the most. I believe there are assumptions that the log ID always is SHA256(DER(SPKI(pubkey))) in most clients. I'd lean towards the second, making this change only for checkpoints.

Tangential, for RSA, I'm planning to continue to follow what we're doing for ECDSA for computing the key ID. I don't want to drop RSA support entirely in case someone is using it and I think this means existing client implementations will continue to work without any changes. I'll just discourage RSA for public logs, and we'll make no guarantees that clients implement RSA support. Lemme know if anyone disagrees.

haydentherapper commented 7 months ago

Also, there is another option, which is simply to state that publicly witnessed Rekor logs must use ecdsa.

haydentherapper commented 7 months ago

@loosebazooka and I were chatting today about this, and came up with a good compromise that avoids clients needing to know this computation - let's include the checkpoint key ID (the truncated hash) as an optional field in the trust root TransparencyLogInstance. Clients can then treat the checkpoint key ID as an arbitrary string rather than a constructed identity.

I would vote that we make it optional to avoid making this a breaking change to the trust root. If the checkpoint key ID is not set, then the client should assume checkpoint key ID == log ID (which is the case for ecdsa and RSA). For new ed25519 logs, the checkpoint key ID must be set.

What do you think @bdehamer @woodruffw?

woodruffw commented 7 months ago

I like that idea! That makes things pretty simple on the clients side 🙂

bdehamer commented 7 months ago

Sounds great, love it!