openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Feature: ChaCha20-Poly1305 Encryption #8679

Open cyphar opened 5 years ago

cyphar commented 5 years ago

At the moment, ZFS native encryption only supports AES-CCM and AES-GCM (because that's what Solaris supports and because AES is generally more widely trusted for some enterprise deployments as well as some theoretical FIPS-compliance reasons).

However, ChaCha20-Poly1305 is now a mainstream cipher (it's the recommended default for TLS and OpenSSH) and has proven to be incredibly fast when implemented in software on all hardware (on par with AES-NI and much faster on systems that don't have AES-NI or an equivalent). It's also impervious to timing attacks due to its construction, and its incredibly straightforward implementation has led to much less fragility than AES-GCM.

It appears to me that adding ChaCha20-Poly1305 as a cipher wouldn't be too complicated (because the key and nonce sizes are the same as the AES constructions used). And the implementations of ChaCha20 and Poly1305 are so short that an implementation of the full algorithm is on Wikipedia (though I would suggest asking to use the Zinc implementations for WireGuard, since they are formally verified and have some neat optimisations).
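As a rough illustration of how small the core primitive is, here is a minimal Python sketch of the ChaCha20 quarter round from RFC 8439 (section 2.1), checked against the test vector in section 2.1.1. It is not the Zinc code and not a full cipher, and the function names are mine. Python integers are not constant time, so this shows only the shape of the operations: 32-bit additions, XORs and fixed rotations, with no key-dependent table lookups or branches.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a: int, b: int, c: int, d: int):
    """ChaCha20 quarter round on four 32-bit words (RFC 8439, section 2.1)."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


# Test vector from RFC 8439, section 2.1.1.
result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
expected = (0xEA2A92F4, 0xCB1CF8CE, 0x4581472E, 0x5881C4BB)
print(result == expected)
```

The full block function is 20 rounds of this applied to a 16-word state, which is why complete implementations fit on a page.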

While looking into this, the one question I have is why we have a MAC in the BP at all -- AES-CCM and AES-GCM (and ChaCha20-Poly1305) are all AEAD constructions and thus are already authenticated without the need for an additional HMAC. Is it just for additional security? Doesn't this mean that a user without the keys cannot verify the contents as well as someone who has the keys?

rlaager commented 5 years ago

Is there an HMAC in the BP? I always assumed it was the AEAD authentication tag that was stored in the BP. @tcaputi, can you confirm?

rlaager commented 5 years ago

If we go down this road, it absolutely should be the XChaCha20-Poly1305 variant, where random nonces are safe to use. See: https://libsodium.gitbook.io/doc/secret-key_cryptography/aead/chacha20-poly1305

Hmm, I wonder if the ZFS on-disk format has space for a 192-bit nonce ("IV").

tcaputi commented 5 years ago

> While looking into this, the one question I have is why we have a MAC in the BP at all -- AES-CCM and AES-GCM (and ChaCha20-Poly1305) are all AEAD constructions and thus are already authenticated without the need for an additional HMAC. Is it just for additional security? Doesn't this mean that a user without the keys cannot verify the contents as well as someone who has the keys?

The CCM / GCM tag is what is stored in the last 16 bytes of the bp (where the second half of the checksum would usually go). It is correct that someone without the keys cannot verify the data is cryptographically correct. The data is still protected against non-malicious bit rot / corruption with (truncated) normal checksums stored in the first half of the "normal" checksum space.
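A minimal Python sketch of that split may help. It assumes a 32-byte checksum field made of two 16-byte halves (inferred from the description above), uses SHA-256 as a stand-in for whichever checksum the dataset is configured with, and assumes the checksum covers the on-disk ciphertext, since it has to be checkable without keys. The function names are hypothetical and this is not the actual ZFS code.

```python
import hashlib
import os

HALF = 16  # assumed: the checksum field is two 16-byte halves


def pack_cksum_field(ciphertext: bytes, aead_tag: bytes) -> bytes:
    """Build the field: truncated checksum first, AEAD tag in the last 16 bytes."""
    if len(aead_tag) != HALF:
        raise ValueError("tag must be exactly 16 bytes")
    truncated = hashlib.sha256(ciphertext).digest()[:HALF]  # stand-in checksum
    return truncated + aead_tag


def verify_without_key(ciphertext: bytes, field: bytes) -> bool:
    """Detects accidental corruption only; says nothing about authenticity."""
    return hashlib.sha256(ciphertext).digest()[:HALF] == field[:HALF]


ciphertext = b"\x01\x02\x03\x04" * 8
tag = os.urandom(HALF)  # placeholder for a real CCM/GCM tag
field = pack_cksum_field(ciphertext, tag)

corrupted = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
print(verify_without_key(ciphertext, field), verify_without_key(corrupted, field))
```

Checking the tag in the second half still requires the key, which is the distinction being drawn: keyless readers get bit-rot detection, key holders get cryptographic verification.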

I don't really know anything about ChaCha20, but the encryption implementation is designed to be expandable to more cryptographic algorithms that can meet the following criteria.

1) The algorithm must produce a tag / MAC and store it in no more than 16 bytes.
2) The algorithm must be able to use a 12 byte or fewer, randomly generated nonce. ZFS generates new keys using a 64 bit salt (also stored in the bp), which should allow for many random nonces before a collision occurs.
3) The algorithm must support additional authenticated data.
4) We need to be able to put the algorithm into the ICP in a license-compatible way. Usually I would expect the algorithm to be written for Illumos first and then ported to the ICP.
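The first three criteria are mechanical enough to sketch as a check. The Python below is an illustration only: the `AeadParams` type and `fits_bp` helper are hypothetical, the nonce and tag sizes are the standard ones for each construction, and criterion 4 (licensing and ICP integration) is not something code can test.

```python
from dataclasses import dataclass

MAX_TAG_BYTES = 16    # criterion 1
MAX_NONCE_BYTES = 12  # criterion 2


@dataclass(frozen=True)
class AeadParams:
    name: str
    nonce_bytes: int
    tag_bytes: int
    supports_aad: bool


def fits_bp(p: AeadParams) -> bool:
    """True if the construction meets criteria 1 to 3."""
    return (
        p.tag_bytes <= MAX_TAG_BYTES
        and p.nonce_bytes <= MAX_NONCE_BYTES
        and p.supports_aad
    )


candidates = [
    AeadParams("AES-256-GCM", nonce_bytes=12, tag_bytes=16, supports_aad=True),
    AeadParams("ChaCha20-Poly1305 (RFC 8439)", nonce_bytes=12, tag_bytes=16, supports_aad=True),
    AeadParams("XChaCha20-Poly1305", nonce_bytes=24, tag_bytes=16, supports_aad=True),
]

for c in candidates:
    print(c.name, fits_bp(c))
```

Under these numbers the only candidate that fails is the one with the 24-byte nonce.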

I might be missing a few things, but that should be about it.

rlaager commented 5 years ago

> The algorithm must be able to use a 12 byte or fewer, randomly generated nonce.

The length requirement rules out XChaCha20-Poly1305.

If the random requirement rules out ChaCha20-Poly1305, it also rules out AES-GCM too. Even in the existing algorithms, a counter (starting at a random value, if you prefer) would be safer than random, as it would guarantee the nonce is not repeated. However, that would presumably require locking around the counter, or for the nonce space to be partitioned across the threads.
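To make the partitioning idea concrete, here is a minimal Python sketch in which each worker owns a disjoint slice of the 12-byte nonce space, so no lock is needed around a shared counter. The 4-byte worker id plus 8-byte counter split and the class name are hypothetical choices for illustration, and the sketch says nothing about persisting counters across restarts.

```python
import threading

NONCE_BYTES = 12
WORKER_ID_BYTES = 4  # hypothetical split: 4-byte worker id + 8-byte counter


class PartitionedNonceSource:
    """Per-worker nonce generator; workers never share a prefix."""

    def __init__(self, worker_id: int):
        self._prefix = worker_id.to_bytes(WORKER_ID_BYTES, "big")
        self._counter = 0

    def next(self) -> bytes:
        counter = self._counter.to_bytes(NONCE_BYTES - WORKER_ID_BYTES, "big")
        self._counter += 1
        return self._prefix + counter


def run_workers(num_workers: int, per_worker: int) -> list:
    """Run each worker in its own thread and collect every nonce produced."""
    results = [[] for _ in range(num_workers)]

    def work(i: int) -> None:
        source = PartitionedNonceSource(i)  # private to this thread
        for _ in range(per_worker):
            results[i].append(source.next())

    threads = [threading.Thread(target=work, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [nonce for worker in results for nonce in worker]


nonces = run_workers(num_workers=4, per_worker=1000)
print(len(nonces), len(set(nonces)))
```

Uniqueness here holds by construction within one key and one run; it depends on worker ids never being reused under the same key.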

tcaputi commented 5 years ago

> Even in the existing algorithms, a counter (starting at a random value, if you prefer) would be safer than random, as it would guarantee the nonce is not repeated. However, that would presumably require locking around the counter, or for the nonce space to be partitioned across the threads.

That isn't really true. With a counter, the complications of writing data to permanent storage introduce attacks. Let's say we save the counter to disk. We have some data to encrypt, so we increment the counter and encrypt the data. Before the data and the new counter make it to disk, we hard reset the machine. When we resume, we will reuse the same nonce again.
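The scenario can be played out in a few lines of Python. This is a toy model under stated assumptions: a dict stands in for persistent storage, `sync()` stands in for the counter reaching disk, and the key is assumed to stay the same across the reset. None of these names correspond to real ZFS code.

```python
class CounterNonce:
    """Toy counter-based nonce source backed by a dict acting as the disk."""

    def __init__(self, disk: dict):
        self.disk = disk
        # On start-up, resume from the last value that actually reached disk.
        self.counter = disk.get("counter", 0)

    def next_nonce(self) -> int:
        self.counter += 1
        return self.counter

    def sync(self) -> None:
        self.disk["counter"] = self.counter


def simulate_crash():
    disk = {}
    gen = CounterNonce(disk)

    gen.next_nonce()
    gen.sync()                    # this counter value is safely on disk

    lost = gen.next_nonce()       # used for an encryption, never synced
    # Hard reset here: in-memory state is gone, the disk still has the old value.

    after_reset = CounterNonce(disk)
    reused = after_reset.next_nonce()
    return lost, reused


lost, reused = simulate_crash()
print(lost == reused)
```

The nonce handed out just before the reset and the first one handed out afterwards are identical, which is the reuse being described.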

The random nonce approach is used in many applications. For our purposes we ensure that there is never more than a 1 in 1 trillion chance that any two IVs match. If this extremely unlikely event does occur, it is not reproducible even if the attacker has basically free rein over the machine.

> The length requirement rules out XChaCha20-Poly1305.

I'd need to look into it more but I would bet that we don't need a full 192 bit nonce to be secure if we are rotating our keys often enough. For GCM and CCM you don't really need a nonce at all as long as you only encrypt one message per key.

rlaager commented 5 years ago

> We have some data to encrypt so we increment the counter and encrypt the data. Before the data and the new counter make it to disk, we hard reset the machine. When we resume we will reuse the same nonce again.

Reusing the nonce is totally fine, as long as the key has changed. It's the combination of them together that must be unique. Didn't you say the key was rotated on pool import?
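A small Python sketch of why it is the (key, nonce) pair that matters. It uses a toy stream cipher whose keystream is SHAKE-256 over key and nonce; that is a stand-in chosen so the example runs with the standard library alone, not ChaCha20 or AES-GCM, and the helper names are hypothetical.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: depends only on (key, nonce), like a real stream cipher."""
    return hashlib.shake_256(key + nonce).digest(length)


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    keystream = toy_keystream(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
nonce = bytes(12)  # the same nonce is used everywhere below
key_a = b"A" * 32
key_b = b"B" * 32

# Same key, same nonce: the keystream cancels and p1 XOR p2 leaks.
same_pair = xor(toy_encrypt(key_a, nonce, p1), toy_encrypt(key_a, nonce, p2))

# Different keys, same nonce: the keystreams differ, so nothing cancels.
new_key = xor(toy_encrypt(key_a, nonce, p1), toy_encrypt(key_b, nonce, p2))

print(same_pair == xor(p1, p2), new_key == xor(p1, p2))
```

With real AEADs the consequences of a repeated pair are worse than this leak (for GCM and Poly1305 the authentication key material is also exposed), but the uniqueness requirement is the same.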

> > The length requirement rules out XChaCha20-Poly1305.
>
> I'd need to look into it more but I would bet that we don't need a full 192 bit nonce

If you don't have the full 192 bit nonce, then just use ChaCha20-Poly1305 (no X).

cyphar commented 5 years ago

@tcaputi ChaCha20-Poly1305 matches all of those requirements except the 4th, but the 4th ought to be incredibly trivial since ChaCha20 and Poly1305 are very simple algorithms.

I was planning on copying the Zinc implementations -- Zinc is a new in-kernel crypto library for Linux which is in the process of being merged into mainline and is dual-licensed GPLv2/MIT. The hardest part of doing the port is going to be making the wrappers to fit ICP (ChaCha20-Poly1305 is so much simpler than AES-GCM/CCM that a lot of the complexity of ICP is not needed).

If I could get some pointers on how to submit a crypto-related patch to Illumos I would appreciate it. :D

@rlaager The keys are rotated every 400 million encryption operations in order to get a 1 in a trillion chance of nonces being reused. The math for the risk of nonce reuse is the same for AES-GCM/CCM and ChaCha20.
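Those two numbers can be sanity-checked with the usual birthday bound. The Python sketch below assumes a 96-bit (12-byte) uniformly random nonce, as in the criteria listed earlier in the thread, and uses the union bound n(n-1)/2 divided by 2^96; the function name is mine.

```python
from fractions import Fraction


def collision_bound(num_nonces: int, nonce_bits: int) -> Fraction:
    """Upper bound on the chance that any two of num_nonces random nonces match.

    There are n(n-1)/2 pairs, and each pair collides with probability 2**-nonce_bits.
    """
    pairs = num_nonces * (num_nonces - 1) // 2
    return Fraction(pairs, 2 ** nonce_bits)


p = collision_bound(400_000_000, 96)
print(float(p))  # on the order of 1e-12
```

So 400 million operations per key under a 96-bit random nonce lands at roughly one in a trillion, and the bound depends only on the nonce width, not on which cipher consumes the nonce.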

lin7sh commented 3 years ago

Zinc (with WireGuard) has been merged into the Linux kernel. Is there any plan to move this feature forward?

cyphar commented 3 years ago

I did take a look at doing this when I posted my last comment, but at the time (I'm not sure how different this is with the new OS-agnostic repo structure) the illumos-inherited crypto code was a bit above my pay-grade -- in particular, quite a few of the callbacks didn't make sense in the context of XChaCha20-Poly1305 (especially so for Zinc, which doesn't require allocations), so it wasn't clear how to do the porting. Unfortunately I'm busy with other things right now, so I can't really pick this up.

robszy commented 3 years ago

I also found an article on why GCM is not as good as ChaCha:

https://soatok.blog/2020/05/13/why-aes-gcm-sucks/