alephdata / aleph

Search and browse documents and data; find the people and companies you look for.
http://docs.aleph.occrp.org
MIT License

Hash API keys #3842

Open tillprochaska opened 1 month ago

tillprochaska commented 1 month ago

This is based on #3094. Review and merge #3094 first!


Aleph used to store user API keys as plaintext in the database. This commit changes that to store only a hash of the API key.

API keys are generated using the built-in secrets.token_urlsafe function, which returns a random 256-bit token. In contrast to passwords, API keys are not provided by users, have high entropy, and need to be validated on every request. It seems to be generally accepted that, given 256-bit tokens, salting or using an expensive key derivation function isn't necessary. (But please challenge this!) For this reason, we’re storing an unsalted SHA-256 hash of the API key, which also makes it easy to look up and verify a given API key.
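To make the lookup argument concrete, here is a minimal sketch of the generate, hash, and verify flow. It is not the code from this PR: the function names and the in-memory dict standing in for the database are hypothetical, and only secrets.token_urlsafe and SHA-256 come from the description above.

```python
import hashlib
import secrets


def generate_api_key() -> str:
    # 32 random bytes (256 bits), encoded as URL-safe Base64 text.
    return secrets.token_urlsafe(32)


def hash_api_key(api_key: str) -> str:
    # Unsalted SHA-256 digest, hex-encoded (64 characters).
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


# Hypothetical stand-in for the database: maps digest -> user ID.
users_by_digest = {}

# Issue a key: the plaintext is shown to the user once, only the digest is kept.
api_key = generate_api_key()
users_by_digest[hash_api_key(api_key)] = 1


def authenticate(presented_key: str):
    # Because the hash is deterministic (no salt), the digest of the
    # presented key can be used directly as a lookup key.
    return users_by_digest.get(hash_api_key(presented_key))
```

With a salted hash, the lookup in authenticate would not be possible without first identifying the user by some other means, which is the trade-off the paragraph above describes.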

I've added a separate column for the hashed API key rather than reusing the existing column. This allows us to batch-hash all existing plaintext keys without having to differentiate between keys that have already been hashed and those that haven't. Once all existing plaintext API keys have been hashed, the old api_key column can simply be dropped.
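The batch-hashing step could look roughly like the sketch below. It uses an in-memory SQLite database so it runs on its own. The table name role, the column name api_key_digest, and the choice to clear the plaintext value after hashing are assumptions for illustration, not details taken from this PR.

```python
import hashlib
import sqlite3


def hash_api_key(api_key: str) -> str:
    # Unsalted SHA-256 digest, hex-encoded.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def hash_plaintext_api_keys(conn: sqlite3.Connection) -> int:
    """Hash all remaining plaintext keys; return the number of rows migrated."""
    rows = conn.execute(
        "SELECT id, api_key FROM role WHERE api_key IS NOT NULL"
    ).fetchall()
    for role_id, api_key in rows:
        # Write the digest to the new column and clear the plaintext
        # (clearing is an assumption of this sketch).
        conn.execute(
            "UPDATE role SET api_key_digest = ?, api_key = NULL WHERE id = ?",
            (hash_api_key(api_key), role_id),
        )
    conn.commit()
    return len(rows)


# Hypothetical schema with the old and the new column side by side.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE role (id INTEGER PRIMARY KEY, api_key TEXT, api_key_digest TEXT)"
)
conn.executemany(
    "INSERT INTO role (id, api_key) VALUES (?, ?)",
    [(1, "legacy-key-1"), (2, "legacy-key-2"), (3, None)],
)
conn.commit()

migrated = hash_plaintext_api_keys(conn)
```

Because the plaintext and hashed values live in different columns, the migration never has to guess whether a stored value is already a hash, and running it a second time is a no-op.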

This is a breaking change. After deployment, admins need to run the aleph hash-plaintext-api-keys CLI command to hash legacy plaintext API keys. Alternatively, users can regenerate their API key.