Open bifurcation opened 6 months ago
First off, I'm entirely with you on the sadness regarding the DH/commutativity dependency, I've been dipping my toes into this problem for a while as well…
As far as I can tell though, your suggestion would break the requirement that third parties ought not to be able to enumerate messages stored on a server at any given point (also over time - #43 discusses this to a certain degree), by re-sharing CT_ac
as-is every time clients would poll for new messages. CT_ac
would have to be randomisable by a third-party (without changing the encapsulated shared secret) to make this work. I don't understand enough of ML-KEM's math around ciphertexts though so I asked a couple folks if they had ideas how to solve this, but no luck so far. Got any ideas?
PS: Cool work on the ml-kem
crate, I've been following (and using it) from afar :sparkles:
I'm confused about the enumeration point. It seems like in the current scheme, g^ab
is (a) stable per-message and (b) passed on unmodified S->SD->R
. So if CT_ac
having that property is bad, isn't the current scheme also bad in that way?
I'm also unclear on what you would be trying to achieve by rerandomizing. In the current scheme, a receiver without SK_c
learns only list of opaque random values (CT_ac, CT_abc)
. Repeated queries over time would reveal that some messages were no longer on the server (by opaque ID). In a rerandomized scheme, one could still learn the overall quantity of messages on the server over time. So the major thing that rerandomization buys you is that you can distinguish between the case where no messages are moving and the case where input and output are equal (both of which leave the stored message volume constant). Is that the idea here?
Hi Richard, as per the email exchange, thank you for taking a look into this, we really appreciate it. I think the issue is that this does not represent what we are trying to do originally.
DH-based:
S->SD: g^a, g^ab SD->R: g^ab, E(g^abc, msgid) R: (g^ab)^c => msgid
It should be instead:
S->SD: g^a, g^ab
SD->R: g^ac, E(g^abc, msgid)
R: (g^ac)^b => msgid
Where R knows the secret exponent b
, and c
is randomly generated per request per message.
As per the quantity of messages, we are always returning a fixed number (and hopefully mitigate for side channels there), which is the upper limit of the protocol in its current form (in the low thousands).
Ah, thanks @giulio, that makes more sense. I think in my notation, b
was held by the SD
and c
by R
, so just aligning your notation to that and elaborating a little:
S->SD: g^a, g^ac
SD: b = rand(), g^ab = (g^a)^b, g^abc = (g^ac)^b
SD->R: g^ab, E(g^abc, msgid)
R: g^abc = (g^ab)^c
=> msgid
While still retaining the message-fetching part not quantum resistant, https://github.com/freedomofpress/securedrop-protocol/issues/55 is a proposed version of the protocol using PQXDH.
It makes me sad to see things these days that depend on DH as opposed to KEM, since only KEMs. FWIW, I think you could refactor the clue scheme to be KEM-based, if you have a DeriveKeyPair function (as in RFC 9180).
DH-based:
KEM-based:
That shares the properties that (1) the clue is only intelligible by the intended recipient and random otherwise (2) the SecureDrop server can encrypt the msgid to the recipient without knowing the recipient's public key. (In principle, you could also just have msgid=SS_abc and save a symmetric encryption step.) Definitely possible I'm missing some other requirements, though.