scidsg / hushline

Hush Line connects whistleblowers with organizations and people who can help.
https://hushline.app
GNU Affero General Public License v3.0
77 stars 21 forks source link

Use UUIDs instead of sequential IDs for messages, to avoid information leakage #528

Open micahflee opened 2 months ago

micahflee commented 2 months ago

As it stands, anyone is able to determine how many messages are sent over a given Hush Line server by looking at the message IDs.

For example, I just submitted a test message to myself on tips.hushline.app and then looked at my inbox. Every message has a delete button. Here's the form for this message's delete button:

<form action="/delete_message/168" method="POST">

The message ID is 168, which means that's the number of messages that have been sent through this Hush Line server so far. (It would actually be pretty easy to automate submitting a test message to yourself every 24 hours to build a graph of a given Hush Line server's daily message traffic.)

To solve this problem, we can use universally unique identifiers (UUIDs). Python has a built-in UUID module:

>>> import uuid
>>> str(uuid.uuid4())
'5904c566-930e-4e1f-89c6-e094d4448cd1'

We can add a uuid field to the message table, and every message can get assigned a random UUID when it's created. Whenever messages are exposed to the user (like in the inbox template), it can use the UUID instead of the ID to reference them. So in the example above, the delete form could POST to /delete_message/5904c566-930e-4e1f-89c6-e094d4448cd1 instead.

Thanks @himori123 for reporting this.

brassy-endomorph commented 2 months ago

UUIDs should be created in postgres on insert by using DEFAULT uuid_generate_v4(), not in python IMO. This will lead to cleaner app code and fewer bugs. If the UUID is needed pre-insert, we can generate it on the app side, or we can do a db.flush() within a transaction (slower, but idk might make sense in some cases).

micahflee commented 2 months ago

You can do that?!

Also, do you know of a way to switch to UUIDs with a migration? This would be required to make this switch.

brassy-endomorph commented 2 months ago

Yeah, I always use that function in defaults and have for many years. At least 7 if not longer.

I think there has to be some manual work done to ensure that the PK lines up in FK fields in related tables, and that might require some python looping in the alembic migrations. It might be possible to have some slick update pure SQL we exec that does all of it (or at least I'm rather sure we can be slick about it and avoid a lot of back and forth between the app and PG for this). I'd have to test. This relates to #555 as it would let us be relatively certain that our migrations are doing the right thing.