davidben / merkle-tree-certs

Other
9 stars 2 forks source link

Revocation #41

Open bwesterb opened 1 year ago

bwesterb commented 1 year ago

[[TODO: Is it worth defining an API for Merkle Tree CAs to publish a revocation list? That would allow automatically populating CRLite and CRLSets. Maybe that's a separate document.]]

bwesterb commented 1 year ago

It's tempting to include revocations in the batches, by having either (Abridged)Assertions or Revocations as leafs. So we'd have, say:

enum {
    assertion(0),
    revocation(1),
    (2^8-1)
} LeafType;

struct {
    LeafType leaf_type;
    opaque leaf<0..2^24-1>;
} Leaf;

struct {
    uint32 batch_number;
    uint64 index;
} Revocation;

There are downsides:

An alternative would be to have parallel batches/tree/signed windows where leafs are revocations instead of assertions. That only has the revocation delay as downside.

Finally, a bespoke HTTP interface could be created that would allow for quicker revocation. That'll force us to use a clock, which we don't really need so far (see #17.)

Other questions are:

  1. Do we need to include extra information, such as revocation reason? I'd say the full list of reasons is excessive. Perhaps just a "key compromised" trit (yes, no, unknown), would be ok.
devonobrien commented 1 year ago

I'm still not confident that building in CA-initiated revocation is the best or most preferable design for e.g. 10-14 day certificates. I'd rather include explicit certificate lifetime restrictions than integrate this kind of revocation (solutions like browser-made revocation lists can still be delivered out of band). At a policy level, the CABF currently appears poised to remove the need for revocation information certificates for <= 10 day X.509 certs, which aligns with the fastest reliable revocation given CRL and OCSP caching.

Regarding your downsides

bwesterb commented 1 year ago

Even if every certificate in a batch were revoked, it should only increase proof size by ~one extra hash, right?

Yes. I was thinking of the case where you want to revoke million of certificates over multiple batches.

bwesterb commented 1 month ago

Some more thoughts on this.

  1. Another downside to using the main batches for revocations: a client is only presented with a proof of inclusion — not of non-inclusion of later revocations. The transparency service would have to build something on top anyway. Together with the existing downside that it couples issuance and revocation latency, I'd say this is dead on arrival.

  2. That today revocation is slow, means that we don't need revocation in MTC to match the status quo. I wouldn't oppose adding back in better revocation later on — as David suggested a later document.

  3. CRLite and variants solve a strictly harder problem than we're dealing with here: the input is a hash, and there the universe of possible hashes is unknown to the client (but known to the service generating the CRLite filters thanks to CT.) On the other hand, here, the universe is known: we know the number of assertions in a batch. So the task is to encode a subset (of k revoked indices) of {0, ..., n}. We can reach the theoretical minimum of lg (n choose k) for a random subset within 5% using fancy coding, and ~20% using simpler methods, beating CRLite which is about ~100%. Still, if we revoke 16 million out of 725 million, we need ~15MB, which is a lot.

  4. We'll need to understand how much will be revoked with shorter lifetimes. I'll look into this. I also do not believe the current high rate of revocation is security critical. Could we have clients fetch the optional compressed revocation lists directly from the CAs, which incentivises them to keep them short?

davidben commented 1 month ago

I suspect revocation can always be added later, yeah. Ultimately, all we need to revoke individual entries in a batch is:

But yeah I agree with Devon that the revocation story should broadly be that the assertions are short-lived enough anyway. I wouldn't want to close the door on being able to push out faster revocations, but I think the above means we can safely defer this to later.

bwesterb commented 1 month ago

I had a quick look at revocation behaviour of Let's Encrypt users to get some sense of what we're dealing with. July 18th (2024) I downloaded Let's Encrypt's CRLs (E1, E5, R4, R10 and R11). They contain a total of 523,178 revocations. I looked up all certificates with a notAfter between July 18th 00:00:00 and August 26th 14:44:59 in Certificate Transparency (that's a bit more than 39 days). I found 166 million certificates. Of those 306,321 (0.18%) appeared in the CRLs. For each of those revoked certificates I determined how many days into its lifetime it was revoked. This is the cumulative distribution function for that:

Screenshot 2024-07-23 at 13 31 55

Note:

Screenshot 2024-07-23 at 14 32 12

Rough preliminary conclusion: at least 60% of Let's Encrypt revocations don't seem security related. The remaining revocations are close to uniformly distributed within a certificate's lifetime.

So, shorter lived certificates makes the initial download of revocations smaller. It makes the daily updates much worse, if we do not filter out the revocations on the first day and renewals. If we do, then the daily updates should be unchanged. Similar conclusions should hold for CRLite.

bwesterb commented 1 month ago

Pulled remaining certs from CT. The fraction of first-day revocations is even higher than with the preliminary data: more than half the Let's Encrypt revocations happen within the first day.

Screenshot 2024-07-25 at 23 50 08
bwesterb commented 1 month ago

Encoding the full list of 512,652 Let's Encrypt revocations (out of 382,584,264 ) requires 1.14MB using CRLite. With Rice coding it requires 707.3kB. Using Huffman codes for the bitlength, it's 706.5kB. The information theoretic lower bound is 703.9kB.