haiwen / seafile

High-performance file syncing and sharing, with Markdown WYSIWYG editing, Wiki, file labels, and other knowledge management features.
http://seafile.com/

Encrypted libraries leak lots of information #350

Closed ef4 closed 4 months ago

ef4 commented 11 years ago

I spent some time auditing the crypto constructs for Seafile's encrypted repos, because I'd like to help make Seafile more secure and trustworthy. I found some significant problems.

An attacker who obtains a copy of the encrypted library without the key can:

Furthermore, since the same initialization vector is reused for all chunks, the library is vulnerable to watermarking and known-plaintext attacks.
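To illustrate the fixed-IV concern, here is a minimal sketch of CBC chaining in Python. It is not Seafile's code: `toy_block_encrypt` is a hypothetical stand-in for AES block encryption (a truncated keyed hash, chosen only because the standard library has no AES), and the key, IV, and chunk contents are made up. It shows that two chunks sharing a prefix produce ciphertexts sharing the same prefix when the IV is reused.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES block encryption (assumption for illustration):
    # a keyed hash truncated to one block. It is not invertible, so it
    # only demonstrates the encryption direction of CBC.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # Padding is omitted: the input must be a whole number of blocks.
    if len(plaintext) % BLOCK != 0:
        raise ValueError("plaintext must be a multiple of the block size")
    previous = iv
    blocks = []
    for i in range(0, len(plaintext), BLOCK):
        block = plaintext[i:i + BLOCK]
        # CBC: XOR the plaintext block with the previous ciphertext block.
        mixed = bytes(a ^ b for a, b in zip(block, previous))
        previous = toy_block_encrypt(key, mixed)
        blocks.append(previous)
    return b"".join(blocks)


key = b"k" * 32                # made-up key
fixed_iv = b"\x00" * BLOCK     # the same IV reused for every chunk
shared_prefix = b"A" * 32      # two identical leading blocks

cipher_a = cbc_encrypt(key, fixed_iv, shared_prefix + b"first ending....")
cipher_b = cbc_encrypt(key, fixed_iv, shared_prefix + b"other ending....")

# The first two ciphertext blocks match, revealing the shared prefix.
print(cipher_a[:32] == cipher_b[:32])
```

The ciphertexts only diverge at the first block where the plaintexts differ, which is what makes watermarking possible: an attacker who knows a file's beginning can recognize it in any chunk.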

The first problem is straightforward to solve: encrypt all file and directory entries, not just the content chunks.

The second problem (predictable IVs) is not as easy to fix. To maintain Seafile's existing deduplication and synchronization capabilities, you want deterministic encryption. But maintaining semantic security with deterministic encryption is probably not possible.

As a practical improvement, you could use an HMAC of each chunk as its IV. This is still deterministic, but it would at least prevent chunks with the same prefix from sharing the same ciphertext prefix.
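A minimal sketch of that idea in Python, assuming a separate secret `iv_key` and HMAC-SHA256 truncated to 16 bytes (both are illustrative choices, not something Seafile specifies). The function name `derive_chunk_iv` is hypothetical.

```python
import hashlib
import hmac

IV_SIZE = 16  # AES block size in bytes


def derive_chunk_iv(iv_key: bytes, chunk: bytes) -> bytes:
    # The IV depends on the whole chunk, so any change anywhere in the
    # chunk yields a different IV, while identical chunks keep the same IV.
    return hmac.new(iv_key, chunk, hashlib.sha256).digest()[:IV_SIZE]


iv_key = b"hypothetical per-library IV key"

iv_one = derive_chunk_iv(iv_key, b"same prefix, ending one")
iv_two = derive_chunk_iv(iv_key, b"same prefix, ending two")
iv_one_again = derive_chunk_iv(iv_key, b"same prefix, ending one")

print(iv_one == iv_one_again)  # deterministic, so deduplication still works
print(iv_one == iv_two)        # a shared prefix no longer implies a shared IV
```

Identical chunks still encrypt identically under this scheme, so an attacker can tell when two chunks are equal. That is the remaining cost of staying deterministic.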

To achieve strong secrecy, Seafile would need to give up deterministic encryption. This can still provide reasonable deduplication and efficient sync, but it would require clients to maintain their own cached mapping from chunk SHA-1 sums to their encrypted identities. You may want to look at how the Tarsnap client does something similar.
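The client-side mapping could look roughly like the following Python sketch. The class `ChunkIndex`, its `store` method, and the random 20-byte identifiers are all hypothetical. This is neither Seafile's nor Tarsnap's implementation, and the actual encryption and upload steps are left as a comment.

```python
import hashlib
import secrets


class ChunkIndex:
    """Client-side cache: SHA-1 of a plaintext chunk -> random encrypted ID."""

    def __init__(self) -> None:
        self._ids: dict[str, str] = {}
        self.uploads = 0  # how many chunks would actually be uploaded

    def store(self, chunk: bytes) -> str:
        digest = hashlib.sha1(chunk).hexdigest()
        if digest in self._ids:
            # Already known locally: deduplicate without the server ever
            # learning that two chunks are equal.
            return self._ids[digest]
        # The ID is random, so it reveals nothing about the content.
        encrypted_id = secrets.token_hex(20)
        # A real client would encrypt the chunk with a fresh random IV
        # and upload it under encrypted_id at this point.
        self._ids[digest] = encrypted_id
        self.uploads += 1
        return encrypted_id


index = ChunkIndex()
first = index.store(b"chunk contents")
second = index.store(b"chunk contents")   # served from the cache
third = index.store(b"different contents")
print(first == second, index.uploads)
```

The trade-off is that deduplication now only works across clients that share this cache, since the server can no longer match chunks by content.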

moschlar commented 5 years ago

@killing

What @ef4 originally stated is that the IV is reused for all the files in the same library.

Isn't it for every chunk, actually?

killing commented 5 years ago

@moschlar Yes, "chunk" is a more accurate description for that.

marcusmysc commented 5 years ago

Glad to hear!

Redsandro commented 5 years ago

@killing said:

We'll fix the salt issue soon and include it in version 7.0 of Seafile server. A new version of the client will also be needed to work with the new encrypted libraries.

This might be only semi-relevant: The Ubuntu 18.04 LTS version of the Seafile client is frozen, not updated for a while when updates to the client are available for Windows and OSX. I'm not sure what the (technical) reason is behind this, but if a client update is mandatory to support the server, will this mean that it is best not to update the server while some clients are still on Ubuntu 18.04?

This might be somewhat annoying, because 18.04 LTS has 10 years of support and some LTS folks are perpetually reluctant to update their system.

shoeper commented 5 years ago

The official PPA linked at https://help.seafile.com/en/syncing_client/install_linux_client.html should deliver the most recent version of the client.

Redsandro commented 5 years ago

@shoeper I just double-checked, and indeed the version from the PPA was updated last year, but it is not actually the most recent version. So it does get updated sometimes. Do the Linux and Windows binaries have alternating versions?

deb http://ppa.launchpad.net/seafile/seafile-client/ubuntu bionic main is at 6.2.9, while Windows and Mac are at 6.2.11.

shoeper commented 5 years ago

Normally not, but I don't know. Looking at the changelog (https://manual.seafile.com/changelog/client-changelog.html), it looks like the most recent changes were not worth an update for Linux.

killing commented 5 years ago

We've been working on using a different salt for each library. Currently there is a PR available: https://github.com/haiwen/seafile-server/pull/221

freeplant commented 5 years ago

This feature is included in version 7.0. To enable new encrypted libraries that use a different salt for each library, you need to add the following configuration to seahub_settings.py:

ENCRYPTED_LIBRARY_VERSION = 3

Currently the mobile clients and desktop clients do not support the new version of encrypted libraries yet. After we upgrade the clients, we will make this setting the default.

Redsandro commented 5 years ago

Thank you for the update @freeplant :+1:

raphmim commented 5 years ago

Hello, I'm only semi-technical so I didn't fully understand the entire thread above. Can someone help me understand: does the fact that Seafile now uses "different salt for each library" mean that the amount of data leaked by encrypted libraries is reduced? Or that they don't leak any information? Thanks

shoeper commented 5 years ago

No

killing commented 5 years ago

@Raph33 Using a different salt for each library makes it harder for an attacker to crack the password or the contents of files in encrypted libraries. But the metadata (e.g. file names, folder names, file sizes) is still stored unencrypted.
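As a rough illustration of why a per-library salt helps, the Python sketch below derives keys with PBKDF2-HMAC-SHA256 from the standard library. The iteration count, salt length, and function name `derive_key` are assumptions chosen for the example and are not taken from Seafile's code.

```python
import hashlib
import os

ITERATIONS = 10_000  # illustrative value only (assumption)


def derive_key(password: str, salt: bytes) -> bytes:
    # Derive a 32-byte key from the password and the library's salt.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


password = "correct horse battery staple"
salt_library_a = os.urandom(32)  # hypothetical per-library salts
salt_library_b = os.urandom(32)

key_a = derive_key(password, salt_library_a)
key_b = derive_key(password, salt_library_b)

# Same password, different salts: the derived keys are unrelated.
print(key_a == key_b)
```

With one salt shared by all libraries, an attacker can compute a table of candidate passwords once and test it against every library. With per-library salts, that work has to be repeated for each library.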

killing commented 5 years ago

Update: desktop clients 7.0 already support the new encrypted library format.

x11x commented 5 years ago

Seafile doesn't use the same IV for every library. The IV is calculated from the library password and the salt. Even though the salt is reused in all libraries, the passwords are usually different.

Now with the new encrypted library format, the salt is different for each library. This mitigates problems with the password possibly being the same for different libraries.

But the IV is still calculated from the library password and the salt. So the IV will be the same for all files/chunks in each library? My understanding of AES-CBC is that the IV should never be reused: there should be a unique IV for each file.

I may be misunderstanding the challenges but it seems like the IV should just be a random string stored in plain text in some kind of mapping from chunk to IV. The IV itself does not need to be kept secret if I understand correctly. It does not (should not?) need to be generated from the password/salt.
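A minimal Python sketch of such a mapping, assuming chunks are addressed by some string ID. The class `IvTable` and the method `iv_for` are hypothetical names, and nothing here reflects how Seafile stores chunks.

```python
import os

IV_SIZE = 16  # AES block size in bytes


class IvTable:
    """Plaintext mapping from chunk ID to a randomly generated IV."""

    def __init__(self) -> None:
        self._ivs: dict[str, bytes] = {}

    def iv_for(self, chunk_id: str) -> bytes:
        # Generate the IV once per chunk and remember it, so the chunk
        # can be decrypted later. The IV itself is not secret.
        if chunk_id not in self._ivs:
            self._ivs[chunk_id] = os.urandom(IV_SIZE)
        return self._ivs[chunk_id]


table = IvTable()
iv_first = table.iv_for("chunk-0001")
iv_second = table.iv_for("chunk-0002")
print(iv_first == table.iv_for("chunk-0001"))  # stable for a given chunk
print(iv_first == iv_second)                   # different chunks, different IVs
```

Because the IV is random, two clients encrypting the same plaintext chunk would produce different ciphertexts, which is why this approach conflicts with content-based deduplication.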

I wonder if the authors have considered libsodium / NaCl, which takes away the need to worry about IVs and salts and whether you're doing it properly. There are a range of browser libraries available too.

@Raph33 the way I see it, this doesn't change much, the attacks described above still apply, but just for files within the same library.

Edit: I've reread the original issue and understand that random IVs might not be possible with the current sync/dedup algorithm and that using a HMAC was recommended. And I get that this would mean changes in the frontend like the inclusion of a crypto library.

moschlar commented 4 years ago

@killing Is there a way to force the Seafile client to only use the new encryption format? At runtime or at compile time (for the Debian packages)?

Redsandro commented 4 years ago

Seafile 7 can use both? How can I see what version/format my libraries use?

andreagrax commented 4 years ago

Update: desktop clients 7.0 already support the new encrypted library format.

@killing any update about mobile clients?

s-bernard commented 3 years ago

Hi @killing, have you considered integrating an external project like Cryptomator to manage the encrypted libraries? It would be faster to develop and more secure in the end.

ghost commented 3 years ago

@killing The changelog for version 8.0 mentions the release of a new version of Seafile's encrypted library format. Can you please elaborate on what the differences between v3 and v4 are?

killing commented 3 years ago

The only change we made in v4 is changing from AES128 to AES256 for data encryption. Due to a mistake in the code, AES128 was chosen (only) in v3 as the encryption algorithm, which is less secure than AES256.

TheQuantumPhysicist commented 2 years ago

Almost a decade... has passed... and this issue is still open. Hard to believe.

stevesbrain commented 1 year ago

Almost a decade... has passed... and this issue is still open. Hard to believe.

Hard to believe that it's almost as long from when this issue opened to when I commented, as it is from my comment until now!

https://github.com/haiwen/seafile/issues/350#issuecomment-390987226

Redsandro commented 1 year ago

Hard to believe

In the developers' defense, they answered seven years ago, and repeated from time to time that no further improvements are planned. We may not like this, but we can't be surprised they don't have a different answer each time we ask. I assume this issue is intentionally left open in case anyone finds it important enough to pick it up themselves.

Currently, file contents are encrypted but metadata is not. This is similar to ecryptfs as used in Linux Mint or password-protected zip files. Apparently they believe this content encryption is good enough to run on your own hardware, and would rather work on different projects like SeaTable.

One frequently used solution for private clouds is to use ZFS storage with native encryption for your data, so when your hardware is stolen, no one can read the storage contents because the key to mount ZFS is on a different machine.

If you still think your metadata is so confidential that you can't trust it on your own hardware, you may want to search for a different file hosting implementation, although you may find that nothing beats Seafile in terms of speed. It's a trade-off you may be willing to make. Keep in mind that there aren't many (FOSS) mature solutions out there. NC has a disclaimer too:

Some features part of the design have not yet been implemented in the client or server code. In particular, as of May 2022, offline recovery, sharing and HSM features are on the roadmap.

NC does have a bigger team with larger resources, so you can be confident that any progress on beta features will be faster there than with Seafile.

To summarize some of your options:

  1. Reconsider what information is actually exposed in your workflow and how far you're willing to go to mitigate this
  2. Use a second layer of offline data protection such as ZFS
  3. Write the code or hire a developer yourself
  4. Use an alternative self-hosted file service

I don't think further "reminders" will yield different results.

andreagrax commented 1 year ago

Use a second layer of offline data protection such as ZFS

If you don't want to do it server side, you can also consider something client side like Cryptomator (but it adds another layer of complexity, because it is another piece of software that has to be installed client side).

killing commented 4 months ago

Thank you all for your contributions here. We have assembled the currently known limitations of the encrypted library feature. The list can be found in our manual: https://manual.seafile.com/security/security_features/ .

Currently we have no plan to further fix these limitations, due to the complexity of implementing the fixes and upgrading existing installations. Looking back at our design, we think it strikes a balance between ease of use and level of security. It's arguably not perfectly secure, but may be good enough for many use cases. It's up to the users to choose whether to use it or not. We'll remain transparent about any new limitations that are found and update the list.

For now we'll close this issue as there is no plan for "fixing" it.

We'll still improve the encrypted library feature from time to time, based on feedback and advances in cryptography. For example, in version 12, we'll support Argon2 as the key derivation algorithm and also allow choosing the key derivation algorithm in the configuration.

unai-ndz commented 4 months ago

For anyone worried about this but still wanting to use Seafile, don't forget you can sync your libraries pre-encrypted. I'm using gocryptfs for the more sensitive ones, although I haven't tested whether you can access single files from a pre-encrypted library without downloading it fully first.