element-hq / element-web

A glossy Matrix collaboration client for the web.
https://element.io
GNU Affero General Public License v3.0
11.27k stars 2.02k forks source link

Encrypt attachments with E2E #2262

Closed ara4n closed 8 years ago

richvdh commented 8 years ago

https://matrix.org/jira/browse/SPEC-24

ara4n commented 8 years ago

I just had a quick think about this, mainly for curiousity and wondering how much work it might be to do. From a quick inspection it looks like there are two possible routes:

1. add to inbound/outbound_group_session the ability to get the current message key details (aes_key, aes_iv) from the cipher (which would need to be called immediately after decrypting a message, or immediately before encrypting a message, to capture the ratchet at the right point).

The key details can then be called by the app to do AES CBC on the file being uploaded/downloaded using whatever library they like (e.g. webcrypto). When uploading, this would be by handing XMLHttpRequest.send() a wrapper around a File object which wrap its read methods to pass through AES. When downloading, similarly this would be taking XMLHttpRequest's output and passing it through AES and handing it to https://github.com/eligrey/FileSaver.js or something. I'm a bit fuzzy on whether it's possible to nicely do either of these as a stream, but probably not the end of the world to load it into RAM for now given attachments are limited to 20-50MB by default.

2. add to inbound/outbound_group_session the ability to encrypt/decrypt binary streams. They would then use the existing libolm ciphers for doing the honours, passing it back to the app. They would need to be called at the right times (after decrypt, or before encrypt). The advantage is that this uses a consistent cipher implementation for both msg & attachment. The disadvantage is that you needlessly pass through emscripten code (especially if libolm delegated to webcrypto rather than using the brad conte primitives), and you have to keep state whilst streaming through the library and to avoid nasty races when the ratchet advances but you're still encrypting/decrypting file streams. In general, it feels like more C code and a more complicated API, and could slow things down thanks to the emscripten layer.

My hunch would be towards option 1.

ara4n commented 8 years ago

thinking further - we have a comedy race between picking the AES key used to encrypt an attachment and the fact that tue ratchet may move on whilst the attachment is being encrypted. How about we just pick a random AES key for the attachment, and include it in the event metadata? Then we don't need to faff around extracting message keys out of megolm.

Separately from this, we will also need to extend the message format to allow explicit client generated thumbnails to be sent given the server cannot do the thumbnailing. I suggest we do this in a generic manner (like the proposed "human representation of events" idea for putting arbitrary text descriptions on events), so that any type of event could have an optional client-specified thumbnail. This would also solve the problem of thumbnailing SVGs (by making it the sending client's problem).

richvdh commented 8 years ago

agreed on all fronts

NegativeMjark commented 8 years ago

It'd be good if the event included a SHA256 hash of the encrypted content so that the client can check the integrity of the attachment before decrypting it.

eternaleye commented 8 years ago

@NegativeMjark: That would be far better handled by using a proper AEAD rather than using AES-CBC and then gluing a hash on top.

NegativeMjark commented 8 years ago

@NegativeMjark: That would be far better handled by using a proper AEAD rather than using AES-CBC and then gluing a hash on top.

Yeah, something like AES-GCM would be much more sensible.

Having a hash in the event could be useful if you wanted to retrieve it from a content addressed filesystem or similar. But I doubt that you'd be able to use SHA256 directly for that purpose.

richvdh commented 8 years ago

This now works, modulo large attachments -- see #2615. We still need to document it though (https://github.com/matrix-org/matrix-doc/issues/461).