Encrypted Key Storage/Exchange

toddatpurism commented 8 years ago

Ability to store encrypted encryption keys to be able to easily setup new client software.

Something like gpg encrypted file, stored in the cloud storage or ldap, that can be decrypted with a passphrase from the client to include in the client app.

joeyh-purism commented 8 years ago

If the data is decryptable with only a password, it's not strongly encrypted. Anyone who is able to access the encrypted data can use a password cracker to eventually decrypt it. A strong password and using many rounds can slow this down, but not prevent it.

It would be better to return a gpg encrypted file. But then we need to do gpg secret key management on the client side. Including backing up the gpg secret key in some way, which cannot use this service. Perhaps to a USB key. But what if it gets lost or breaks? The user's laptop would have a copy too, but perhaps both break at once, and then they're SOL.

joeyh-purism commented 8 years ago

How about a hybrid approach?

Storage:

Client generates an encryption key, and encrypts the data using the key.
Client sends the encrypted data to somewhere (could be anywhere, does not need to be same server used for the rest of this).
Client uses shamir secret sharing to split the encryption key into two shards.
Client uses the password to generate pw1, pw2, etc. (Each should be contain a subset of the password's entropy, probably overlapping.)
shard1 is encrypted using pw1, generating eshard1, which is stored on the server.
shard2 is encrypted using pw2, generating eshard2, which is stored on the server.

Retrieval:

Client uses the password to regenerate pw1, pw2, etc.
Client sends pw1 to the server, and requests it be used to decrypt eshard1.
Client downloads eshard2 from the server, and locally uses pw2 to decrypt it. (password2 is never exposed to any server)
Client uses shamir secret sharing to combine shard1 and shard2, resulting in the original encryption key.
Client downloads the encrypted data, and decrypts it.

(Note that steps 1-4 are only needed when the client doesn't have the encryption key locally cached which it normally would.)

The server does not let anyone access eshard1. It only provides the correct decrypted shard1 when presented with the correct pw1. It can rate limit to prevent brute forcing.

Since pw2 is never sent to the server, the server can only get at the whole data by brute forcing eshard2. (It will also need to brute force eshard1 unless key recovery has been performed and it knows pw1.)

The encryption method should be such that decrypting with the wrong password generates random data. This makes brute-forcing quite difficult because each shard is itself an unstructured, high-entropy piece of data. Makes it hard to know when the right password has been guessed. An attacker would have to brute-force all necessary shards together, combine them, and check if the result is a valid gpg keyring.

Also, there could be more than 2 shards. Another shard could be stored on eg, a USB key, or a server run by a different entity. Then even if one server manages to guess a password, it would not have enough information to reconstruct the encryption key.

joeyh-purism commented 8 years ago

It would be good to have purism run one server, and have two other servers run by two independent, trusted third parties (like riseup.net or the EFF).

Then, 6 shards could be generated, and only 4 be needed to reconstruct the encryption key. Each server stores 2 shards. So any 1 server could go down without causing a disruption. And 2 of the servers would need to collude to crack the key.

joeyh-purism commented 8 years ago

Background reading:

A Taxonomy for Key Recovery Encryption Systems http://faculty.nps.edu/dedennin/publications/TaxonomyKeyRecovery.htm

Verifiable Partial Key Escrow http://groups.csail.mit.edu/cis/pubs/shafi/1997-ccs.pdf

https://en.wikipedia.org/wiki/Secret_sharing

joeyh-purism commented 8 years ago

It would be good for there to be no direct way for servers to correlate related shards. If the server doesn't know which shards to combine, it can't target particular ones for brute forcing.

Ways the server could draw correlations include:

When the data was stored. (Avoid by waiting some period of time between uploading various shards, probably on the order of days.)
IP address used to store the data. (Could be avoided by eg using tor.)
When the data was retrieved.
Any user ID associated with the data. (Avoid by allowing anyone to store data and don't associate data with any user ID.)

The client can also do chaffing, by storing additional junk shards. If the key storage service is usable (ie, not too slow) for storing large enough files to be used to store the gpg-encrypted data files, then mixing those up with the gpg key shard files could also provide useful chaffing, as long as all files are chunked to the same size.

joeyh-purism commented 8 years ago

How to generate names for the shards? Add a pw0 generated from the user's passphrase, and then:

 hash(pw0 + username + shardN)

This can be attacked: The server can try to brute-force the pw0 of a user, and if they manage to generate a hash that they've seen, they now know that a shard belongs to that user and can proceed to brute forcing that shard.

It's important that knowing pw0 not make it easier to brute force pw1..6.

joeyh-purism commented 8 years ago

I've written up a more detailed design at https://joeyh.name/code/keysafe/

purism / purist

Encrypted Key Storage/Exchange #12