Open toddatpurism opened 8 years ago
If the data is decryptable with only a password, it's not strongly encrypted. Anyone who can access the encrypted data can run a password cracker against it and eventually decrypt it. A strong password and many key-stretching rounds can slow this down, but not prevent it.
It would be better to return a gpg-encrypted file. But then we need to do gpg secret key management on the client side. That includes backing up the gpg secret key in some way that cannot use this service, perhaps to a USB key. But what if it gets lost or breaks? The user's laptop would have a copy too, but perhaps both break at once, and then they're SOL.
How about a hybrid approach?
Storage:
Retrieval:
(Note that steps 1-4 are only needed when the client doesn't have the encryption key cached locally, which it normally would.)
The server does not let anyone access eshard1. It only provides the correct decrypted shard1 when presented with the correct pw1. It can rate limit to prevent brute forcing.
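The rate limiting could be as simple as a per-shard attempt counter. A minimal sketch, assuming a single server process with in-memory state; the class name, the limits, and the choice to count every attempt (not just failed ones) are illustrative assumptions, not part of the design above:

```python
import time


class AttemptLimiter:
    """Allow at most max_attempts password guesses per shard within a time window."""

    def __init__(self, max_attempts=5, window=3600.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window = window          # seconds
        self.clock = clock            # injectable so the limiter can be tested
        self.attempts = {}            # shard name -> timestamps of recent attempts

    def allow(self, shard_name):
        now = self.clock()
        # Forget attempts that have aged out of the window.
        recent = [t for t in self.attempts.get(shard_name, [])
                  if now - t < self.window]
        allowed = len(recent) < self.max_attempts
        if allowed:
            recent.append(now)
        self.attempts[shard_name] = recent
        return allowed
```

The server would call `allow(name)` before testing a submitted pw1 and refuse the request when it returns `False`. With 5 guesses per hour, even a weak password takes a very long time to guess online.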
Since pw2 is never sent to the server, the server can only get at the whole data by brute forcing eshard2. (It will also need to brute force eshard1 unless key recovery has been performed and it knows pw1.)
The encryption method should be such that decrypting with the wrong password generates random-looking data rather than an error. This makes brute-forcing quite difficult: each shard is itself an unstructured, high-entropy piece of data, so it's hard to know when the right password has been guessed. An attacker would have to brute-force all necessary shards together, combine them, and check if the result is a valid gpg keyring.
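To illustrate the "wrong password yields random data" property, here is a rough sketch using an unauthenticated XOR against a PBKDF2-derived keystream. This construction is an assumption for illustration only, not a vetted cipher and not necessarily what a real implementation should use; the function names and round count are made up:

```python
import hashlib
import os


def keystream(password: str, salt: bytes, length: int, rounds: int = 100_000) -> bytes:
    # Stretch the password into exactly as many bytes as the shard has.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, rounds, dklen=length)


def xor_crypt(data: bytes, password: str, salt: bytes) -> bytes:
    # Encryption and decryption are the same operation. There is no MAC or
    # padding check, so a wrong password cannot be detected from the output.
    # A (password, salt) pair must never be reused for two different shards.
    ks = keystream(password, salt, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


shard = os.urandom(32)   # a shard is already high-entropy data
salt = os.urandom(16)
eshard = xor_crypt(shard, "correct horse", salt)

right = xor_crypt(eshard, "correct horse", salt)   # equals shard
wrong = xor_crypt(eshard, "wrong guess", salt)     # no error, just other bytes
```

Because `wrong` is just another 32 bytes with no structure, an attacker holding only `eshard` has nothing to test a guess against.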
Also, there could be more than 2 shards. Another shard could be stored on, e.g., a USB key, or a server run by a different entity. Then even if one server manages to guess a password, it would not have enough information to reconstruct the encryption key.
It would be good to have Purism run one server, and have two other servers run by two independent, trusted third parties (like riseup.net or the EFF).
Then, 6 shards could be generated, with only 4 needed to reconstruct the encryption key. Each server stores 2 shards. So any 1 server could go down without causing a disruption, since the other two still hold 4 shards. And 2 of the servers would need to collude to crack the key.
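A 4-of-6 split like this can be built with a threshold scheme. The sketch below uses Shamir's secret sharing over a prime field; picking Shamir is my assumption, since the text above does not name a scheme, and the function names are illustrative. It requires Python 3.8+ for the modular inverse via `pow`:

```python
import secrets

PRIME = 2**521 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret: int, needed: int, total: int):
    """Split secret into `total` shares, any `needed` of which recover it."""
    # Random polynomial of degree needed-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(needed - 1)]
    shares = []
    for x in range(1, total + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbits(256)
shares = split(key, needed=4, total=6)
servers = [shares[0:2], shares[2:4], shares[4:6]]   # 2 shards per server
```

Any two entries of `servers` together hold 4 shares, which is enough for `combine`; a single server's 2 shares reveal nothing about `key`.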
Background reading:
A Taxonomy for Key Recovery Encryption Systems http://faculty.nps.edu/dedennin/publications/TaxonomyKeyRecovery.htm
Verifiable Partial Key Escrow http://groups.csail.mit.edu/cis/pubs/shafi/1997-ccs.pdf
It would be good for there to be no direct way for servers to correlate related shards. If the server doesn't know which shards to combine, it can't target particular ones for brute forcing.
Ways the server could draw correlations include:
The client can also do chaffing, by storing additional junk shards. If the key storage service is usable (i.e., not too slow) for files as large as the gpg-encrypted data files, then mixing those in with the gpg key shard files could also provide useful chaffing, as long as all files are chunked to the same size.
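Uniform chunking plus junk shards might look roughly like the following. This is only a sketch: `CHUNK_SIZE`, the 8-byte length prefix, and the function names are assumptions of mine, and the chunks are assumed to be encrypted before upload, since otherwise the length prefix would distinguish real chunks from chaff:

```python
import os

CHUNK_SIZE = 4096  # hypothetical uniform object size


def to_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Frame data with its length, pad with random bytes, cut into equal chunks."""
    framed = len(data).to_bytes(8, "big") + data
    framed += os.urandom((-len(framed)) % size)   # pad up to a multiple of size
    return [framed[i:i + size] for i in range(0, len(framed), size)]


def from_chunks(chunks):
    """Reassemble chunks and strip the padding using the length prefix."""
    framed = b"".join(chunks)
    length = int.from_bytes(framed[:8], "big")
    return framed[8:8 + length]


def chaff(count: int, size: int = CHUNK_SIZE):
    """Junk objects of the same size as real chunks."""
    return [os.urandom(size) for _ in range(count)]
```

After encryption, real chunks and chaff are all `CHUNK_SIZE` bytes of high-entropy data, so object size gives the server no hint about which objects belong together.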
How to generate names for the shards? Add a pw0 generated from the user's passphrase, and then:
hash(pw0 + username + shardN)
This can be attacked: The server can try to brute-force the pw0 of a user, and if they manage to generate a hash that they've seen, they now know that a shard belongs to that user and can proceed to brute forcing that shard.
It's important that knowing pw0 not make it easier to brute force pw1..6.
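The naming formula could be sketched like this. SHA-256 and the length-prefixed encoding are my assumptions; the formula above only says `hash(...)`, and the function name is illustrative. The sketch does not address the brute-force concern, which depends on how pw0 itself is derived:

```python
import hashlib


def shard_name(pw0: bytes, username: str, shard_n: int) -> str:
    """Name under which shard number shard_n is stored on the server."""
    h = hashlib.sha256()
    for part in (pw0, username.encode(), str(shard_n).encode()):
        # Length-prefix each field so ("alice1", 2) and ("alice", 12)
        # can never feed identical bytes into the hash.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()
```

The client can recompute every name from pw0 and the username alone, so it never needs to store an index of its shards, while the server sees only unrelated-looking hex strings.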
I've written up a more detailed design at https://joeyh.name/code/keysafe/
Ability to store encrypted encryption keys, so that new client software can be set up easily.
Something like a gpg-encrypted file, stored in the cloud storage or LDAP, that can be decrypted on the client with a passphrase and included in the client app.