Should we include or only bind public keys in assertions

bwesterb commented 1 year ago

The size of assertions is dominated by the public key. Classical public keys are small: ~32/64 bytes for ECC, ~256/512 bytes for RSA. Post-quantum public keys are larger: 1312/1952 B for Dilithium2/3 and 897 B for Falcon512. There are also some unbalanced schemes (smaller signatures, but larger public keys) on the horizon with even larger public keys: 5 kB for Mayo_2, ≥500 kB for UOV.

Instead of storing the public key in the assertion, we could store H(public key) [1]. The relying party would still need to send the public key. We could include it in the bikeshed certificate.

The downsides are:

1. It slightly complicates the protocol.
2. We send 32 bytes extra in the TLS connection.
3. We can't check for weak public keys.

[1] We might actually want to do H(domain sep || public key || claim_type || claims)
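A minimal sketch of the binding in [1], assuming SHA-256 as H. The domain-separation label, the 2-byte `claim_type`, and the length prefixes are assumptions made for illustration; the thread does not specify an encoding.

```python
import hashlib

# Hypothetical domain-separation label; not defined anywhere in the thread.
DOMAIN_SEP = b"example assertion key binding v1"


def _len_prefixed(field: bytes) -> bytes:
    # 4-byte big-endian length prefix so field boundaries are unambiguous.
    return len(field).to_bytes(4, "big") + field


def bind_public_key(public_key: bytes, claim_type: int, claims: bytes) -> bytes:
    """Return H(domain sep || public key || claim_type || claims) with SHA-256."""
    h = hashlib.sha256()
    h.update(_len_prefixed(DOMAIN_SEP))
    h.update(_len_prefixed(public_key))
    h.update(claim_type.to_bytes(2, "big"))  # assumed 2-byte claim type
    h.update(_len_prefixed(claims))
    return h.digest()
```

Whatever the key size (1312 bytes or 500 kB), the assertion would then carry a fixed 32-byte value.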

davidben commented 1 year ago

We don't necessarily need to send 32 bytes extra in TLS. It's a little gross, but the RP could reconstruct the assertion with the hashed value. (Arguably it's better to make them do that, so they don't accidentally forget to check H(pubkey) matches.)
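A rough sketch of that reconstruction, assuming SHA-256 and a made-up assertion encoding. `committed_digest` stands in for whatever the RP ultimately checks the assertion against (for example, a leaf in an inclusion proof); all names here are hypothetical.

```python
import hashlib
import hmac


def hashed_assertion_bytes(public_key: bytes, claims: bytes) -> bytes:
    # Hypothetical encoding: 32-byte key hash, then length-prefixed claims.
    key_hash = hashlib.sha256(public_key).digest()
    return key_hash + len(claims).to_bytes(2, "big") + claims


def rp_accepts(received_public_key: bytes, claims: bytes,
               committed_digest: bytes) -> bool:
    # The RP never receives H(pubkey) on the wire. It rebuilds the hashed
    # assertion from the full key, so a mismatched key fails the check
    # without any separate "does the hash match" step to forget.
    rebuilt = hashed_assertion_bytes(received_public_key, claims)
    return hmac.compare_digest(hashlib.sha256(rebuilt).digest(), committed_digest)
```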

I kept thinking this would impede transparency somewhat, but I think that's just faulty intuition. If you know the set of expected pubkeys, you can check for an unexpected hash just as easily as an unexpected pubkey. And if you don't know the set, it's not like the unhashed pubkey is any more checkable (point 3 aside) than the hashed one.

And yeah if serving obligations are a problem for CAs and TSs, a 30x size decrease sounds like a good way to clear it! :-) Though we'd have to be crystal clear that any unexpected pubkey hash counts as an unauthorized certificate, of equal severity, whether monitors can produce a preimage or not. That is, a CA can't say "that's weird, but it's not a security issue because we promise no pubkey hashes to it". (Relatedly, #5.)

Another oddity: I think CAs typically check proof of possession of the private key, so they would need to see the preimage during issuance. We also expect CAs to have audit logs of everything they do. I think those together mean the CA must store the public key one way or another. If so, this would only offset serving costs, rather than storage costs. And if the CA is required to store it, they probably should still be able to produce it on demand, but I suppose that needn't necessarily be served online / mirrored by everyone.

@devonobrien for thoughts on this.

bwesterb commented 1 year ago

> It's a little gross, but the RP could reconstruct the assertion with the hashed value.

Oh, yes, of course :facepalm:, that's the obvious thing to do.

> any unexpected pubkey hash counts as an unauthorized certificate

Yes, definitely. (You could see this as a transformation of the underlying signature scheme: we're changing pk into H(pk) and sig into pk || sig.)
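To make the transformation concrete, here is a sketch that wraps an arbitrary verification function, assuming SHA-256 as H. The `verify` callable and the 4-byte length prefix in front of pk are assumptions for illustration; with fixed-size keys the prefix would be unnecessary.

```python
import hashlib
import hmac
from typing import Callable

# Underlying scheme's verifier: (pk, msg, sig) -> bool. Supplied by the caller.
Verify = Callable[[bytes, bytes, bytes], bool]


def transform_public_key(pk: bytes) -> bytes:
    # New public key: pk' = H(pk).
    return hashlib.sha256(pk).digest()


def transform_signature(pk: bytes, sig: bytes) -> bytes:
    # New signature: sig' = pk || sig (length-prefixed so it can be split again).
    return len(pk).to_bytes(4, "big") + pk + sig


def verify_transformed(verify: Verify, pk_hash: bytes, msg: bytes,
                       sig_prime: bytes) -> bool:
    if len(sig_prime) < 4:
        return False
    n = int.from_bytes(sig_prime[:4], "big")
    if len(sig_prime) < 4 + n:
        return False
    pk, sig = sig_prime[4:4 + n], sig_prime[4 + n:]
    # First check that the carried key is the preimage of pk', then verify as usual.
    if not hmac.compare_digest(hashlib.sha256(pk).digest(), pk_hash):
        return False
    return verify(pk, msg, sig)
```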

> If so, this would only offset serving costs, rather than storage costs.

Aside from material cost, there is also the effort of keeping it running under stress or recovering from issues: it's easier to quickly fix something in a 100 GB database than in a 3 TB database.

bwesterb commented 1 year ago

> It's a little gross, but the RP could reconstruct the assertion with the hashed value.

We should include the hash because otherwise the TS can't generate the Merkle tree.
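As an illustration of why the TS needs the hash: it only ever sees the hashed assertions, and those are the leaves it builds the tree from. The sketch below uses the RFC 6962 Merkle tree hash as a stand-in; it is not the tree construction defined by this draft, and the function name is hypothetical.

```python
import hashlib
from typing import Sequence


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """RFC 6962-style Merkle tree hash over already-encoded hashed assertions."""
    if len(leaves) == 0:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        # Leaf hash, domain-separated with 0x00.
        return hashlib.sha256(b"\x00" + leaves[0]).digest()
    # Split at the largest power of two strictly less than len(leaves).
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    # Interior node hash, domain-separated with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()
```

Each leaf here would be an encoded assertion containing H(public key), so the TS can compute the root without ever holding the full keys.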

davidben commented 1 year ago

Oh yeah, the CA-to-TS format definitely needs to include the hash. And I suppose that's the main nuisance here. The ideal formats from CA to TS and from subscriber to RP now differ, and we need to keep them all straight, or pay some costs.

bwesterb commented 1 year ago

We can't leave out the hash from the assertion as served by the CA to the TS, but we could leave out the hash from the BikeshedCertificate. That would complicate its definition a bit: we'd need two different Assertion structs.
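One way the two structs could look, as a sketch only. The type names, field names, and the use of SHA-256 are assumptions; the thread does not fix any of them.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """Hypothetical form carried in the BikeshedCertificate: full key, no hash."""
    public_key: bytes
    claims: bytes


@dataclass(frozen=True)
class AbridgedAssertion:
    """Hypothetical form served from CA to TS: hash only, no full key."""
    public_key_hash: bytes
    claims: bytes


def abridge(assertion: Assertion) -> AbridgedAssertion:
    # Anyone holding the full form can derive the abridged form, which is
    # what lets the certificate omit the hash.
    return AbridgedAssertion(
        public_key_hash=hashlib.sha256(assertion.public_key).digest(),
        claims=assertion.claims,
    )
```

The conversion only goes one way: the abridged form cannot be expanded back without the public key.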

davidben / merkle-tree-certs

Should we include or only bind public keys in assertions #6