radumarias / rencfs

An encrypted file system written in Rust that is mounted with FUSE on Linux. It can be used to create encrypted directories
Apache License 2.0
118 stars 27 forks source link

Generate unique nonce #47

Open radumarias opened 5 months ago

radumarias commented 5 months ago

Using random nonces runs the risk of repeating them unless the nonce size is particularly large (e. g. 192-bit extended nonces used by the XChaCha20Poly1305 and XSalsa20Poly1305 constructions.

NIST SP 800-38D recommends the following: The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.

Following this guideline, only 4,294,967,296 messages with random nonces can be encrypted under a given key. While this bound is high, it's possible to encounter in practice, and systems which might reach it should consider alternatives to purely random nonces, like a counter or a combination of a random nonce + counter.

To ensure unique nonce between instances we will have this flow:

This reduces the number of invocations with unique key+nonce. Given our block size is 256 kB we can encrypt 7.923×10^16 PB (petabytes)

A possible problem:

Consider this example: one uses your program, creates a backup of the data, keeps using the program. If they restore the backup, then they will restart from an earlier counter, and reuse the same nonces.

To mitigate this we could keep the nonce_seq in keyring also and use max(keyring, app_data).

An alternative to counters is to have some logic to assign unique IDs to blocks, and use that ID to derive a nonce. You may get some inspiration on how to do that from https://en.wikipedia.org/wiki/Disk_encryption_theory, in particular the ESSIV section.

If you're concerned about nonce reuse, you might want to look into XChaCha20-Poly1305, which allows longer nonces.

Some popular constructs often used for disk encryption are AES-XTS and HBSH. You'll find it useful to research why they're used specifically for disk encryption.

radumarias commented 4 months ago

from: corbellini.andrea@gmail.com

Hi Radu,

From the information you've given in the email, I believe your calculations are correct. With an atomic counter you'd have 2^(96-8) unique nonces, hence 2^(96-8) blocks to encrypt, hence 2^(96-8) * 256 kB of data, which matches your calculations.

It is generally safe to use the same key to encrypt multiple files, as long as the nonce-key combination is never reused. In the encryption world, there is something called "key wear", which is the concern that if you encrypt large amounts of data with the same key, someone may be able to obtain some information from the encrypted data. There is no consensus on when "key wear" happens, and there are no known practical attacks based on key wear, this is mostly a concern about potential future attack.

I would like to add some information to the mix, hoping that you'll find them useful: Managing counters is hard. There are at least two major challenges: one is with distributed systems (multiple processing units incrementing and obtaining the counter), one with persistence of the counter. The second problem is perhaps the hardest in your case. Consider this example: one uses your program, creates a backup of the data, keeps using the program. If they restore the backup, then they will restart from an earlier counter, and reuse the same nonces. Another case to consider is: what happens if the computer suddenly loses power? Is there a chance that the encrypted blocks get persisted to disk, but the counter doesn't?

An alternative to counters is to have some logic to assign unique IDs to blocks, and use that ID to derive a nonce. You may get some inspiration on how to do that from https://en.wikipedia.org/wiki/Disk_encryption_theory, in particular the ESSIV section.

If you're concerned about nonce reuse, you might want to look into XChaCha20-Poly1305, which allows longer nonces.

Some popular constructs often used for disk encryption are AES-XTS and HBSH. You'll find it useful to research why they're used specifically for disk encryption. Hope this helps!

radumarias commented 2 weeks ago

questions

tob-scott-a commented 2 weeks ago

NIST SP 800-38D recommends the following: The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.

This is for 96-bit nonces. As a rule of thumb, you can cube-root the probability space (in the case of XChaCha20-Poly1305, you start with 2^192) and that is the reciprocal bound such that 2^(n/3) nonces, sampled from a uniformly random distribution, will have a 2^-(n/3) probability of collision.

With 96-bit nonces, 2^(n/3) is 2^32, for a collision risk of 2^(-32). Which basically means, you're at the bound where the probability of a collision is inversely proportional to the number of nonces already generated.

Some people have taken 2^-32 to be a magical safety boundary for NIST compliance. In which case, the 192-bit nonces of XChaPoly gives you 2^80 nonces before 2^-32 risk.

But there's nothing inherently magical or special about 2^-32. A better value to target is the cube-root point I mentioned above.

A nonce space of 2^192 (thus, n = 192) yields 2^64 nonces for a collision risk of 2^-64. That's much easier to reason about, and in line with the reasoning that underlies the NIST recommendation.

To get a sense for the scale, if you encrypted 2^64 1-byte files with XChaPoly, assuming 192-bit nonces and 128-bit authentication tags, you would exceed 750 exabytes of storage required.

If this is still insufficient for your security goals, I recommend adding a key-derivation step with a large (e.g., 256-bit) random value appended to the HKDF info parameter. Bonus: You can also append whatever other fields you want as the "info" parameter. This also provides a nice context-binding, especially if you decide to derive multiple keys from the same input keying material (IKM).

You can then zero out the nonce or make it an extended sample of randomness. Here's a quick sketch of the latter:

random := randomBytes(32)
tmp := hkdfSha512(ikm, 64, pae('encryptionKey', random, instance_id, ...), nil)
encKey := tmp[0:32]
nonce := tmp[32:56]

(Above, pae() refers to a function like PAE() from PASETO. See also: TupleHash.)

You need not limit yourself to 2^32 messages with XChaCha20-Poly1305.

Further reading:

radumarias commented 1 day ago

thank you very much, very informative and good reading, will take it into account.

On Tue, Nov 12, 2024 at 4:58 PM Scott Arciszewski @.***> wrote:

NIST SP 800-38D recommends the following: The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.

This is for 96-bit nonces. As a rule of thumb, you can cube-root the probability space (in the case of XChaCha20-Poly1305, you start with 2^192) and that is the reciprocal bound such that 2^(n/3) nonces, sampled from a uniformly random distribution, will have a 2^-(n/3) probability of collision.

With 96-bit nonces, 2^(n/3) is 2^32, for a collision risk of 2^(-32). Which basically means, you're at the bound where the probability of a collision is inversely proportional to the number of nonces already generated.

Some people have taken 2^-32 to be a magical safety boundary for NIST compliance. In which case, the 192-bit nonces of XChaPoly gives you 2^80 nonces before 2^-32 risk.

But there's nothing inherently magical or special about 2^-32. A better value to target is the cube-root point I mentioned above.

A nonce space of 2^192 (thus, n = 192) yields 2^64 nonces for a collision risk of 2^-64. That's much easier to reason about, and in line with the reasoning that underlies the NIST recommendation.

To get a sense for the scale, if you encrypted 2^64 1-byte files with XChaPoly, assuming 192-bit nonces and 128-bit authentication tags, you would exceed 750 exabytes of storage required https://www.google.com/search?q=%2824+%2B+16+%2B+1%29+*+2%5E64+bytes.

If this is still insufficient for your security goals, I recommend adding a key-derivation step with a large (e.g., 256-bit) random value appended to the HKDF info parameter. Bonus: You can also append whatever other fields you want as the "info" parameter. This also provides a nice context-binding, especially if you decide to derive multiple keys from the same input keying material (IKM).

You can then zero out the nonce or make it an extended sample of randomness. Here's a quick sketch of the latter:

random := randomBytes(32) tmp := hkdfSha512(ikm, 64, pae('encryptionKey', random, instance_id, ...), nil) encKey := tmp[0:32] nonce := tmp[32:56]

(Above, pae() refers to a function like PAE() from PASETO. See also: TupleHash.)

You need not limit yourself to 2^32 messages with XChaCha20-Poly1305.

Further reading:

— Reply to this email directly, view it on GitHub https://github.com/radumarias/rencfs/issues/47#issuecomment-2470760968, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB5TC4EM2U2UG2YLPVJANYL2AIJQHAVCNFSM6AAAAABJ6WUQTOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINZQG43DAOJWHA . You are receiving this because you were assigned.Message ID: @.***>

-- And in the end, it's not the years in your life that count. It's the life in your years.