cryptee / web-client

Cryptee's web client source code for all platforms.
https://crypt.ee

[Question] How does Ghosting work on Cryptee? #142

Closed ghost closed 2 years ago

ghost commented 2 years ago

Hola! Thank you for creating such a beautiful application.

While reading about Ghost album, I got curious about the design. The website says that

because cryptee is unable to determine whether if you have any ghost folders or not, the only way to export them is to first summon (bring back) your ghost folders / albums.

Question: The create ghost album API sends the hashed album name and album-id to the crypt.ee server. Does this not leak the ghost album name (as it's hashed, not encrypted)? Also, the back-end must be storing a mapping of album_id -> hashedTitle, so crypt.ee should be able to identify whether I have any ghost folders or not.

johnozbay commented 2 years ago

Hi there!

Thanks a lot for the kind compliments! Glad to hear you're enjoying Cryptee 🙏🏻

The mechanism is a bit more complicated than that. I wrote a lengthy answer about how it all works here: https://github.com/cryptee/web-client/issues/33#issuecomment-563244601

TL;DR:

Step 1) You click "Ghost a folder"

Step 2) You confirm its name by typing it.

Step 3) To return a ghosted folder to you by its name, the server needs to know the name of the folder. BUT remember, we don't want to know the names of your folders, especially your ghost ones. So what do we do? To keep the server from learning your folder's name, the name is hashed on your device.

Step 4) Your folder name's hash + folder id + user id is passed to the server.

Step 5) [ server-side ] The server takes the folder name's hash (FNH) and the user id (UID), and hashes the two together to create a unique ghost ID (GID). So the server still doesn't know your folder name, nor can we look at a GID after it's hashed and tell which user id it belongs to. All of your folder's contents are copied to this ghost folder with the unique GID, and then the original folder is removed from your account.

You've now successfully ghosted something. Even if a govt comes knocking on our door to say, hey, hand over all the ghost folders of this user id, we can't do it. We can't find out which ghost id (GID) belongs to which user id, because we don't know the hash of the folder name to hash with the user id and compare. So it's impossible to tell which ghost folder belongs to which user by looking at our server.
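Steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not Cryptee's actual code: the thread does not say which hash function is used or how the two values are combined, so SHA-256, the `|` separator, and the function names `client_hash_folder_name` and `server_make_ghost_id` are all assumptions made for the example.

```python
import hashlib


def client_hash_folder_name(folder_name: str) -> str:
    """Client side: hash the folder name so the plaintext never leaves the device."""
    # SHA-256 is an assumption; the thread does not name the hash function.
    return hashlib.sha256(folder_name.encode("utf-8")).hexdigest()


def server_make_ghost_id(folder_name_hash: str, user_id: str) -> str:
    """Server side: hash FNH and UID together to get the ghost ID (GID)."""
    # The "|" separator is an illustrative choice to keep the two inputs apart.
    combined = f"{folder_name_hash}|{user_id}"
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


fnh = client_hash_folder_name("divorce")      # Step 3, on the device
gid = server_make_ghost_id(fnh, "user-123")   # Step 5, on the server
same_name_other_user = server_make_ghost_id(fnh, "user-456")

# Same folder name, different users: the resulting GIDs differ.
print(gid != same_name_other_user)
```

In this sketch the server would store the folder contents under `gid` only, without keeping the FNH or a UID-to-GID mapping next to it.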


To address your specific point / question – there are two important things to keep in mind, and both sort of boil down to your threat model.

1) When you ghost something (folder, album etc) which you'll need to bring back by typing its name again, you're effectively changing your trust / threat model, because you no longer trust your own device / account. For example, let's say an abusive and overly-curious partner knows your username, password and encryption key and can access your account, so you want to hide stuff from them. So in the interest of not keeping the folder in your account (which you don't trust), you tell Cryptee's servers: "hey, safe-keep this folder/album for me until I tell you I need the folder/album named xyz again".

So the server must necessarily know either the name of the folder or its hash to be able to bring it back to you when you need it. Because the servers don't know your encryption key, if the ghost folder's name were stored encrypted somehow, the servers wouldn't be able to bring back the folder/album. Since we don't want to know your folder names, we can't store them in plaintext either, so the next best thing is to store them hashed.

2) But hashing alone would mean that if your folder is named "divorce" we could easily figure the name out (and perhaps its contents too) using a precomputed table of hashes. So to get ahead of this, we re-hash the hash with your user id on the server side at the moment of ghosting it. This has to be done in a trusted environment on the server side, since we can't trust client-side code: a malicious actor who modifies the client-side code could otherwise send forged user-id + folder hash combinations.
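To make the "divorce" example concrete, here is a small sketch of such a table lookup. It assumes SHA-256 and a `|` separator, neither of which is stated in the thread, and `sha256_hex`, `common_names` and `lookup` are hypothetical names. It only shows that a table built from folder names alone matches the bare hash but not the value stored after re-hashing with the user id.

```python
import hashlib


def sha256_hex(text: str) -> str:
    # SHA-256 is an assumption; the thread does not name the hash function.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# A precomputed table mapping hashes of common folder names back to the names.
common_names = ["divorce", "taxes", "photos"]
lookup = {sha256_hex(name): name for name in common_names}

fnh = sha256_hex("divorce")          # what the client sends
gid = sha256_hex(f"{fnh}|user-123")  # what is stored after re-hashing with the UID

print(lookup.get(fnh))  # divorce  (the bare hash is found in the table)
print(lookup.get(gid))  # None     (the stored GID is not)
```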


The 'theoretically' better option is actually to use HMACs instead of regular hashes. But upon consideration, we realized this also has disadvantages.

a) We'd need to derive unique HMAC keys from your own key. But remember, in this threat model you no longer trust your own device/account, because someone (let's say your abusive partner) has access to your key. So from the perspective of your own security, HMAC doesn't provide any additional advantage, as this abusive partner already knows your key.

And while an HMAC would make it impossible for Cryptee's servers to look at the hash and try to figure out what your folder is named, it opens the door to some other types of edge-cases / errors like :

b) What happens if you change your encryption key? All your ghosts' derived HMAC keys would need to be re-encrypted. The client app would therefore need to re-download all of those derived HMAC keys and re-HMAC your folder names, which would reveal that you have ghosts, including how many. On top of that, these derived HMAC keys would need to be stored somewhere, which again reveals that you have ghosts and how many.
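A rough sketch of the key-change problem with an HMAC-based design, using Python's standard `hmac` module. This is a simplified variant in which the HMAC key is derived directly from the encryption key rather than stored encrypted; the derivation in `derive_hmac_key` and the name `folder_tag` are hypothetical and only serve the illustration.

```python
import hashlib
import hmac


def derive_hmac_key(encryption_key: bytes) -> bytes:
    """Hypothetical derivation of a per-user HMAC key from the encryption key."""
    return hashlib.sha256(b"ghost-hmac-key|" + encryption_key).digest()


def folder_tag(encryption_key: bytes, folder_name: str) -> str:
    """HMAC-SHA256 of the folder name under the derived key."""
    key = derive_hmac_key(encryption_key)
    return hmac.new(key, folder_name.encode("utf-8"), hashlib.sha256).hexdigest()


old_tag = folder_tag(b"old encryption key", "divorce")
new_tag = folder_tag(b"new encryption key", "divorce")

# After a key change the same folder name maps to a different tag, so a ghost
# stored under old_tag cannot be found by typing its name unless something is
# re-keyed, and that re-keying step is what would reveal the ghosts.
print(old_tag == new_tag)  # False
```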


So to sum things up, as counter-intuitive as it may sound, regular hashes (combined with your user id on the server side) provide the best of both worlds. It's not 100% perfect, nor can it ever be, since the threat model of ghosting inherently requires trusting us over your own device/account. The current mechanism we have in place makes it pretty much impossible for us to take any guesses by looking at hashes. So while it's not ideal, we're of the opinion that it's still the best option.

I hope this makes sense, and that I could clarify a few things and answer your question! ✌🏻

Best,

John

ghost commented 2 years ago

Hey John, that was a very thorough explanation. Thank you for sharing it. I only have two minor follow-up questions.

Best, saruuto

johnozbay commented 2 years ago

Hey Saruuto!

Excellent questions!

– If you ghost / summon two folders with the same name, they are summoned one at a time in the reverse order in which they were ghosted. The server checks if the GID exists, and adds a -1, -2, -3 to the end. Like gid-1, gid-2, gid-3 etc... So they're kept in ghost-time order, and not messed up in any way.
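The suffix scheme for same-named ghosts might look something like the sketch below. It is a toy in-memory model based on one reading of the description above: the first ghost keeps the bare GID, later ones get `-1`, `-2`, and so on, and summoning pops the newest one first. `GhostStore` and its methods are hypothetical names, not Cryptee's server code.

```python
from typing import Dict, Optional


class GhostStore:
    """Toy in-memory stand-in for the server's ghost storage."""

    def __init__(self) -> None:
        self._ghosts: Dict[str, str] = {}

    def ghost(self, gid: str, contents: str) -> str:
        """Store contents under gid, or under gid-1, gid-2, ... if gid is taken."""
        key, n = gid, 0
        while key in self._ghosts:
            n += 1
            key = f"{gid}-{n}"
        self._ghosts[key] = contents
        return key

    def summon(self, gid: str) -> Optional[str]:
        """Return the most recently ghosted folder for gid, or None if none exist."""
        if gid not in self._ghosts:
            return None
        key, n = gid, 1
        # Walk forward to the highest suffix that exists.
        while f"{gid}-{n}" in self._ghosts:
            key = f"{gid}-{n}"
            n += 1
        return self._ghosts.pop(key)


store = GhostStore()
for contents in ["first", "second", "third"]:
    store.ghost("abc", contents)

order = [store.summon("abc") for _ in range(3)]
print(order)  # ['third', 'second', 'first']
```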

– When you delete an account, the ghost folders and the files in them are lost for eternity, as we lose all pointers to them on the server side. They basically become yet another grain of sand on a beach, making it even harder to spot what belongs to who. It has a storage cost impact on us, but not a significant enough one to make a dent in our wallet.

In case you're getting ready to ask the next question – what happens if some day there is a UID collision? Couldn't a fresh user who gets a collided UID summon these lost-for-eternity ghost folders?

Setting aside how unlikely that is, given that there are something like 36^28 possibilities for a UID (37,711,171,281,396,032,013,366,321,198,900,157,303,750,656 to be specific), not only would the two UIDs need to be exactly the same, but the folder name and the users' encryption keys would need to be identical as well. So that's as good as it gets; it's beyond improbable.
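The size of that UID space is easy to check with exact integer arithmetic. The snippet below only does the arithmetic for 36^28; it makes no claim about how UIDs are actually generated.

```python
# 36 ** 28 is computed exactly, since Python integers have arbitrary precision.
uid_space = 36 ** 28

print(f"{uid_space:,}")      # the full number with thousands separators
print(len(str(uid_space)))   # number of decimal digits
```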

– Similarly, what happens if a user exceeds their storage quota etc and you guys need to delete their accounts for not paying their bills etc? Same. Ghost folders / albums would be lost for eternity.


Also, to clarify one thing, if some day, we find an even better / even more private solution, we may change or improve this mechanism even more in a backwards compatible way.


Hoping this helps and makes sense,

Best,

J