freedomofpress / securedrop-protocol

Research and proof of concept to develop the next SecureDrop with end to end encryption.
GNU Affero General Public License v3.0

Bucketing proposal to drop the message limit #49

Open lsd-cat opened 6 months ago

lsd-cat commented 6 months ago

The PoC is currently not scalable: there is a fixed maximum number of messages, and above that the system is technically full. Furthermore, the effective limit is probably lower than MAX_MESSAGES anyway: as @ayende correctly states in https://github.com/freedomofpress/securedrop-protocol/issues/43, an attacker can just flood the system up to MAX_MESSAGES to discover how many messages it already holds. One way to mitigate this could be to stop accepting new messages at MAX_MESSAGES-rand(0,n), so that an attacker would not learn the exact count even in that case. However, it should be trivial for a server to detect a message flooding attack and notify its administrator.
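A minimal sketch of the MAX_MESSAGES-rand(0,n) idea, in Python. The constant values and the function names `pick_effective_limit` and `accepts_new_message` are hypothetical and not taken from the PoC.

```python
import secrets

MAX_MESSAGES = 1000  # hypothetical hard cap, not the PoC's real value
JITTER = 100         # hypothetical n in MAX_MESSAGES - rand(0, n)


def pick_effective_limit(max_messages: int = MAX_MESSAGES, jitter: int = JITTER) -> int:
    """Return a secret limit in [max_messages - jitter, max_messages]."""
    # secrets.randbelow(k) returns a uniform integer in [0, k).
    return max_messages - secrets.randbelow(jitter + 1)


def accepts_new_message(current_count: int, effective_limit: int) -> bool:
    """The server refuses new messages once the secret limit is reached."""
    return current_count < effective_limit


limit = pick_effective_limit()
print(accepts_new_message(0, limit))  # an empty server always accepts
```

The sketch assumes the limit is drawn once and kept secret. If it were redrawn on every request, an attacker could probably average the jitter out by retrying.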

Attack scenarios aside, here is a sample idea for handling unlimited messages: 1) The sender buckets first-contact messages by bits of the recipient's public key. 2) The sender buckets replies by bits of an agreed, pre-shared, time-based key.

Example: A wants to send a first-contact message to B. B's public key starts with the 2 bits 01. A encrypts the message, attaches a plaintext secret kotpA, prepares the clue, and sends the clue and the ciphertext to /message/bucket/01.
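As a rough illustration of the first-contact step, the bucket can be derived from the leading bits of the recipient's public key bytes. The helper names are hypothetical, and the assumption that the key is available as raw bytes (most significant bit first) is mine.

```python
def bucket_from_prefix(data: bytes, bits: int = 2) -> str:
    """Return the first `bits` bits of `data` as a string of 0s and 1s."""
    total_bits = 8 * len(data)
    if not 1 <= bits <= total_bits:
        raise ValueError("bits must be between 1 and the length of data in bits")
    # Read the bytes as one big-endian integer and keep only the top bits.
    value = int.from_bytes(data, "big") >> (total_bits - bits)
    return format(value, f"0{bits}b")


def submit_path(recipient_public_key: bytes, bits: int = 2) -> str:
    """Build the submission path used in the example above."""
    return f"/message/bucket/{bucket_from_prefix(recipient_public_key, bits)}"


# Toy key whose first byte is 0b01101100, so its first two bits are 01.
toy_key = bytes([0b01101100, 0xFF])
print(submit_path(toy_key))  # /message/bucket/01
```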

B wants to fetch a message. B can fetch both bucket 01 and another random one, say bucket 11 (so requests go to /fetch/bucket/01 and /fetch/bucket/11). B then discovers A's message in bucket 01 and decrypts it, obtaining the plaintext and kotpA.
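The fetch step with a decoy bucket could look roughly like this. `buckets_to_fetch` and `fetch_paths` are hypothetical names, and shuffling the request order is my own assumption (so the real bucket is not always requested first).

```python
import secrets


def buckets_to_fetch(own_bucket: str, decoys: int = 1) -> list:
    """Return the recipient's own bucket plus `decoys` random other buckets."""
    bits = len(own_bucket)
    all_buckets = [format(i, f"0{bits}b") for i in range(2 ** bits)]
    others = [b for b in all_buckets if b != own_bucket]
    if not 0 <= decoys <= len(others):
        raise ValueError("not enough distinct buckets for that many decoys")
    rng = secrets.SystemRandom()
    chosen = [own_bucket] + rng.sample(others, decoys)
    rng.shuffle(chosen)  # assumption: hide which request is the real one
    return chosen


def fetch_paths(buckets: list) -> list:
    """Map bucket labels to the fetch paths used in the example above."""
    return [f"/fetch/bucket/{b}" for b in buckets]


print(fetch_paths(buckets_to_fetch("01")))
```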

Now B wants to reply to A: they encrypt the message and prepare the clue, also attaching a plaintext secret kotpB. B then calculates sha256(kotpA|current_week) and obtains a hash. B then takes the first bits of this hash, say 11, and sends the message to A in that bucket.
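A small sketch of the reply bucket derivation. The proposal does not specify how current_week is encoded or what `|` means exactly; here I assume an ISO year-week label and a literal `|` separator byte, and the function names are hypothetical.

```python
import hashlib
from datetime import date


def week_label(day: date) -> str:
    """Assumed encoding of current_week: ISO year and week, e.g. 2024-W01."""
    iso = day.isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def reply_bucket(kotp: bytes, day: date, bits: int = 2) -> str:
    """Return the first `bits` bits of sha256(kotp | current_week)."""
    if not 1 <= bits <= 8:
        raise ValueError("this sketch only reads bits from the first digest byte")
    digest = hashlib.sha256(kotp + b"|" + week_label(day).encode()).digest()
    return format(digest[0] >> (8 - bits), f"0{bits}b")


# Both parties know kotpA, so both derive the same bucket within a given week.
kotp_a = b"example-secret-from-A"
print(reply_bucket(kotp_a, date(2024, 1, 1)))
```

Since A and B compute the label independently, they would also need to agree on a time zone for the week boundary.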

A knows that if B has sent a reply, they would have used the bucket calculated from kotpA, so A fetches both bucket 11 and the bucket whose bits correspond to their own public key.

This is just a general idea, with parameters that can be adjusted. Using a byte for bucketing would already give 256 buckets, each holding thousands of messages. Of course, this byte is a partial leak of the recipient's identity: it would need to be checked whether, given the reply and OTP scheme, the leak is still statistically significant.
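To get a feel for the trade-off, here is a back-of-the-envelope calculation. It assumes recipients and messages spread uniformly across buckets, and the recipient count and per-bucket capacity below are made-up numbers, not measurements.

```python
def bucket_stats(bits: int, recipients: int, total_messages: int) -> tuple:
    """Return (bucket count, average recipients per bucket, average messages per bucket)."""
    buckets = 2 ** bits
    return buckets, recipients / buckets, total_messages / buckets


# One byte of bucketing, 10 recipients, 1000 messages per bucket on average.
buckets, recipients_per_bucket, messages_per_bucket = bucket_stats(8, 10, 256_000)
print(buckets, recipients_per_bucket, messages_per_bucket)
```

With fewer recipients than buckets, the average number of recipients per bucket falls below one, so a bucket would often point at a single recipient.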

This is probably not required for SecureDrop, but I'm dumping the idea here since it could be a fun exercise.

ayende commented 6 months ago

How many recipients do you actually expect to have? Assuming that each organization runs this on its own, it is likely that you'll have just a few of them.

So any traffic there would tell who you are talking to, no?

lsd-cat commented 6 months ago

You are right, it's outside of the original requirements: we know that there are few participants and little traffic per organization, so we are fine with keeping the original upper limit and nothing more. Any scalability scheme that trades off some recipient information will need a lot of traffic and recipients to make sense in the first place. Maybe there is a way to have the server split up and scale only when needed, though.