signalapp / Flock

Private contact and calendar sync for Android.
https://signal.org/blog/flock

cryptography documentation #75

Open igoralmeida opened 9 years ago

igoralmeida commented 9 years ago

I was looking for some kind of document explaining your use of cryptography but could not find any, and the wiki seems empty. So, how are you implementing end-to-end encryption?

rhodey commented 9 years ago

hello @igoralmeida, thanks for the reminder about the wiki, I wasn't aware I had it enabled for this project. I'll add the wiki documentation to my TODO list but can provide a short overview of Flock cryptography here for the time being.

All cryptography-related code can be found in the org.anhonesteffort.flock.crypto package. KeyUtil.java contains all the basic functions and KeyHelper.java ties all of them together.

USERNAME - user supplied username.
PASSWORD - user supplied password.
CIPHER_KEY - generated from random on first install.
MAC_KEY - generated from random on first install.
AUTH_TOKEN - PBKDF2 of PASSWORD with 20050 iterations.
KEY_MATERIAL_KEY - PBKDF2 of PASSWORD with 20000 iterations.

CIPHER_KEY and MAC_KEY are used to encrypt, verify, and decrypt all user data. These keys never change.
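
To illustrate the "verify" step mentioned above, here is a minimal Python sketch of checking a MAC before any decryption is attempted. It is not taken from KeyUtil.java: HMAC-SHA-256, the ciphertext-followed-by-tag layout, and the function names `attach_tag` and `verify_and_strip` are all assumptions made for illustration. The encryption step itself is left out because Python's standard library does not include AES, so the ciphertext is treated as opaque bytes.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of an HMAC-SHA-256 tag in bytes (assumed MAC algorithm)


def attach_tag(ciphertext: bytes, mac_key: bytes) -> bytes:
    """Append an HMAC tag computed over the ciphertext (assumed layout)."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def verify_and_strip(blob: bytes, mac_key: bytes) -> bytes:
    """Check the tag and return the ciphertext, or raise if it does not match."""
    if len(blob) < TAG_LEN:
        raise ValueError("blob too short to contain a tag")
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC mismatch: data was modified or the key is wrong")
    return ciphertext
```

In this sketch a client would only pass the returned ciphertext on to decryption with CIPHER_KEY after `verify_and_strip` succeeds with MAC_KEY.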

USERNAME and AUTH_TOKEN serve as the username and password for authentication to all Flock Sync services. When using Flock with your own WebDAV server, AUTH_TOKEN is simply the password for your WebDAV account, without any PBKDF2.

KEY_MATERIAL_KEY is derived from a PBKDF2 of PASSWORD and used to encrypt, verify, and decrypt CIPHER_KEY and MAC_KEY. CIPHER_KEY and MAC_KEY need to be shared between all devices using the same WebDAV account, so they are encrypted with KEY_MATERIAL_KEY and stored on the WebDAV server. This makes bootstrapping of new clients possible.

So when using Flock with Flock Sync the servers only ever see PBKDF2(PASSWORD, 20050) and because 20050 > 20000 it is not possible for the servers to derive KEY_MATERIAL_KEY from AUTH_TOKEN.
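
The two derivations described above can be sketched in Python as follows. Only the iteration counts (20050 and 20000) come from this comment. The PRF (HMAC-SHA-256), the 32-byte output length, the use of one shared salt, and the function names are assumptions for illustration and may differ from what KeyUtil.java actually does.

```python
import hashlib
from typing import Tuple

AUTH_TOKEN_ITERATIONS = 20050        # from the overview above
KEY_MATERIAL_KEY_ITERATIONS = 20000  # from the overview above


def pbkdf2(password: str, salt: bytes, iterations: int, length: int = 32) -> bytes:
    """PBKDF2 with HMAC-SHA-256; hash choice and length are assumptions."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


def derive_flock_secrets(password: str, salt: bytes) -> Tuple[bytes, bytes]:
    """Return (AUTH_TOKEN, KEY_MATERIAL_KEY) derived from the same PASSWORD."""
    auth_token = pbkdf2(password, salt, AUTH_TOKEN_ITERATIONS)
    key_material_key = pbkdf2(password, salt, KEY_MATERIAL_KEY_ITERATIONS)
    return auth_token, key_material_key
```

In this sketch only `auth_token` would ever be sent to a Flock Sync server, while `key_material_key` stays on the device and is used to protect CIPHER_KEY and MAC_KEY.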

marciomr commented 9 years ago

@rhodey, do you (on WhisperSystems) intend to develop desktop clients for Linux, Windows and/or Mac? Maybe we can help with this, but we would need more documentation.

generalmanager commented 9 years ago

@marciomr As far as I'm aware they don't want to build graphical desktop clients but a kind of proxy which acts as a client to the actual webdav server, decrypts the content and then provides the decrypted content locally, acting as a server.

marciomr commented 9 years ago

Thanks @generalmanager, do you know if I can contribute to this project?

generalmanager commented 9 years ago

Sure! Bigger contributions like this should be announced on the mailing list first to clear up misconceptions and discuss strategic decisions: whispersystems@lists.riseup.net https://lists.riseup.net/www/info/whispersystems

It would be great if you had the time to get this off the ground!

patcon commented 9 years ago

Hey guys, I was just starting to look into requirements for this. I'm definitely not at the point of making any mailing list announcements, as I'm still gauging whether it's within my skills. You on IRC @marciomr, so we can talk or at least make sure we're not stepping on each other's toes at this point? (I'll be in the #whispersystems channel)

(I'm patrick.c.connolly on skype, and @patconnolly on twitter, but IRC is better.)

patcon commented 9 years ago

Was thinking about using argo, a project that apparently drives a decent chunk of the backend at Apigee. It's essentially an extensible reverse proxy for APIs that allows plugins to transform headers and data. Haven't had a chance to use it yet, but saw a presentation a few years ago and was really impressed.

marciomr commented 9 years ago

@patcon nice! Let's talk. I think I need the IRC server address. Let me just explain that I'm a computer science teacher in Brazil and I was thinking of suggesting this as a project for a student, but we can talk on IRC. I think it is more appropriate.

patcon commented 9 years ago

Great! I'm "patcon" on Freenode, and you can find me with /whois patcon when I'm online, but I should be in #guardianproject or #whispersystems

I'm going to try to meet up with @rhodey this weekend to get a reference webdav server available as a Vagrant/docker box

rhodey commented 9 years ago

hey all, sorry for not checking in with this thread sooner! thanks for the support :)

as @patcon mentioned we're gonna try and meet up this weekend, after that I'll check back in with this thread and maybe we can set up some resources to help facilitate and organize collaboration on this project.

cheers,

marciomr commented 9 years ago

Hi guys, I subscribed to the discussion list and I am trying to visit the IRC once in a while. Have you guys managed to meet? I talked to a student and he seemed interested in the project.

I think the idea of having a proxy to do the encryption/decryption process on the desktop is very smart, but I don't think I understand how argo would solve that.

untitaker commented 9 years ago

I am interested in extending my own synchronization tool with Flock-compatible encryption capabilities. I do think a proxy for CalDAV and CardDAV is the most practicable solution, but I'd still like to see Flock's protocol get documented in an implementation- and language-independent way.

rhodey commented 9 years ago

Hello all,

Checking in to let you know I've started documenting the Flock sync protocol. You can find the most current revision in the "Sync Protocol" section of this document.

My hope is to finish the protocol documentation in the next week. At that time I will move the document from that repo of mine to a wiki on the OWS Flock repo.

untitaker commented 9 years ago

Awesome!

untitaker commented 9 years ago

@rhodey What I don't understand is why Flock is using a CardDAV/CalDAV server if the end-to-end-encryption prevents the server from understanding the item contents. IMO that's the only reason why you would want to use such terrible protocols: To stay compatible with the rest of the infrastructure, but that compatibility is lost with Flock. In that case you could use any arbitrary protocol.

patcon commented 9 years ago

@untitaker See https://github.com/WhisperSystems/Flock/issues/16#issuecomment-49681527 for the rationale :)

untitaker commented 9 years ago

I don't think that explains the choice of WebDAV. Yes, those are popular protocols, but Flock isn't compatible at all, it merely uses them as a way to store its encrypted data.

patcon commented 9 years ago

Sorry, to summarize, tools like thunderbird know how to talk and sync with webdav servers. They just don't know how to decrypt/encrypt. If you create a cross-platform local proxy app to route the data through (https://github.com/WhisperSystems/Flock/issues/16), then it can do that part and you can use desktop tools on any platform (osx, windows, linux, etc.) with flock data, using only the proxy as a bridge.

Anyhow, hopefully that explains how webdav makes sense and using a custom comm protocol would be more work for less value :) I'll leave it at that

untitaker commented 9 years ago

I understand that, but I wonder why Flock would go through the trouble of supporting so many different WebDAV servers instead of

  1. Creating its own sync server and protocol
  2. Creating a WebDAV-proxy (for Thunderbird) to that server and protocol.

patcon commented 9 years ago

In theory, there's no "supporting so many different WebDAV servers", as they're using the standard protocol. All servers should work, if they implement the spec correctly. In practice, that's less true. Not sure if they knew that when this was started as a spring break of code project.

Anyhow, I'm in #whispersystems on freenode. (In-depth tangential discussion in issues can make them discouragingly long for later readers :)

rhodey commented 9 years ago

@untitaker building flock on top of WebDAV was a mistake. WebDAV is bloated as a specification and because of the verbosity of XML it is bloated as a protocol on the wire as well.

The cryptography scheme I described in the protocol doc above could easily be adapted to fit other protocols. I'd really like to see someone do this with the Remote Storage spec. Remote Storage has super lean specifications and uses JSON.

The ideal Flock Sync server would be protocol agnostic and be able to serialize contacts or calendars into XML or JSON representations depending upon the protocols supported by the client (WebDAV, RemoteStorage, etc).

untitaker commented 9 years ago

@rhodey Would you adopt a protocol based on Remote Storage for Flock?

untitaker commented 9 years ago

The problem I see with the remote storage spec is that basically it seems to be a filesystem with mimetype annotations. Currently there are a few CalDAV clients which are able to sync only a specific timerange, for performance reasons and also because storage on mobile devices is rather limited. With the CalDAV API, an efficient query for events within a given timerange is possible. I don't see an efficient way to do that with the remoteStorage API, at least not in a way that relies on the client to generate indices.

...just realized that it doesn't matter if one is end-to-end encrypting everything. But if one weren't, using remoteStorage over *DAV would be a major bummer for performance.
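
One way to read the point above: when items are end-to-end encrypted, the server only stores opaque blobs, so it cannot answer a time-range query whichever protocol is used, and any selection has to happen on the client after decryption. A minimal Python sketch of such a client-side filter follows. The `Event` type and the `events_in_range` function are hypothetical and not part of Flock, CalDAV, or remoteStorage.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class Event:
    """Hypothetical decrypted calendar event with a start and end time."""
    uid: str
    start: datetime
    end: datetime


def events_in_range(
    events: Iterable[Event], range_start: datetime, range_end: datetime
) -> List[Event]:
    """Keep events that overlap the half-open interval [range_start, range_end)."""
    # An event overlaps if it starts before the range ends
    # and ends after the range starts.
    return [e for e in events if e.start < range_end and e.end > range_start]
```

The cost in this sketch is that every item has to be downloaded and decrypted before it can be filtered, which is the overhead for very large collections discussed in this thread.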

gusennan commented 9 years ago

@untitaker your last comment about performance, is that because of your thoughts about the necessity of being able to create indexes on the client, or is there another reason?

untitaker commented 9 years ago

@gusennan It's about not having the whole collection on the client, but only a specific range that is relevant for the current time. It's only a performance issue for extremely large collections and also exists in Flock AFAIK.

gusennan commented 9 years ago

@untitaker which protocol do you think would be better than remoteStorage or DAV for Flock?

untitaker commented 9 years ago

@gusennan Note that I'm not affiliated with Flock. I think we'd have to try it out, but remoteStorage is probably a better fit for Flock than DAV.

gusennan commented 9 years ago

@untitaker Thank you for the answers. Neither am I; just have been browsing this codebase and the comments you made caught my attention--was just curious what you thought.

untitaker commented 9 years ago

@gusennan To reiterate, the comments I made about performance are quite irrelevant in the context of end-to-end encryption (and I noticed my mistake only in the last paragraph).

Please continue conversation about DAV alternatives at https://github.com/WhisperSystems/Flock/issues/93

untitaker commented 9 years ago

@rhodey A question from a crypto newbie who hasn't used Flock: It isn't documented how you get the server password and EPASSWORD. Do you actually have two password fields for the user?

Also this:

"If you're using any reports other than the multiget you're doing something weird anyway."

Mobile clients use reports for calendars to only store a relevant timerange of events on the device. This is useful if your calendar collections are truly massive.

patcon commented 9 years ago

@untitaker I think your answer is here: https://github.com/WhisperSystems/Flock/issues/75#issuecomment-61890855

on the flock servers, auth password is the result of using pbkdf2 on your provided password. when self-hosting, you just have two passwords, one for auth and one for encryption.

untitaker commented 9 years ago

Ahh I see. So in the end Flock determines the password I have to set for my WebDAV server.

patcon commented 9 years ago

When you self-host, they are separate, so you decide both. When using the OWS service, flock simplifies things by deriving the auth password from your own password that they never see :) They handle registration when you use their servers, so they could easily do that

untitaker commented 9 years ago

@rhodey Is there any sort of integrity check that verifies the server didn't just delete some of the encrypted items? Was this sort of threat considered when building the threat model? What led to the final decisions?