Open radumarias opened 5 months ago
from: corbellini.andrea@gmail.com
Hi Radu,
From the information you've given in the email, I believe your calculations are correct. With an atomic counter you'd have 2^(96-8) unique nonces, hence 2^(96-8) blocks to encrypt, hence 2^(96-8) * 256 kB of data, which matches your calculations.
It is generally safe to use the same key to encrypt multiple files, as long as the nonce-key combination is never reused. In the encryption world, there is something called "key wear", which is the concern that if you encrypt large amounts of data with the same key, someone may be able to obtain some information from the encrypted data. There is no consensus on when "key wear" happens, and there are no known practical attacks based on key wear, this is mostly a concern about potential future attack.
I would like to add some information to the mix, hoping that you'll find them useful: Managing counters is hard. There are at least two major challenges: one is with distributed systems (multiple processing units incrementing and obtaining the counter), one with persistence of the counter. The second problem is perhaps the hardest in your case. Consider this example: one uses your program, creates a backup of the data, keeps using the program. If they restore the backup, then they will restart from an earlier counter, and reuse the same nonces. Another case to consider is: what happens if the computer suddenly loses power? Is there a chance that the encrypted blocks get persisted to disk, but the counter doesn't?
An alternative to counters is to have some logic to assign unique IDs to blocks, and use that ID to derive a nonce. You may get some inspiration on how to do that from https://en.wikipedia.org/wiki/Disk_encryption_theory, in particular the ESSIV section.
If you're concerned about nonce reuse, you might want to look into XChaCha20-Poly1305, which allows longer nonces.
Some popular constructs often used for disk encryption are AES-XTS and HBSH. You'll find it useful to research why they're used specifically for disk encryption. Hope this helps!
questions
instance_id
as u16
?instance_id
with rand_chacha
is secure enough?data-dir
is secure enough not to reuse it after restore?NIST SP 800-38D recommends the following: The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.
This is for 96-bit nonces. As a rule of thumb, you can cube-root the probability space (in the case of XChaCha20-Poly1305, you start with 2^192) and that is the reciprocal bound such that 2^(n/3) nonces, sampled from a uniformly random distribution, will have a 2^-(n/3) probability of collision.
With 96-bit nonces, 2^(n/3) is 2^32, for a collision risk of 2^(-32). Which basically means, you're at the bound where the probability of a collision is inversely proportional to the number of nonces already generated.
Some people have taken 2^-32 to be a magical safety boundary for NIST compliance. In which case, the 192-bit nonces of XChaPoly gives you 2^80 nonces before 2^-32 risk.
But there's nothing inherently magical or special about 2^-32. A better value to target is the cube-root point I mentioned above.
A nonce space of 2^192 (thus, n = 192) yields 2^64 nonces for a collision risk of 2^-64. That's much easier to reason about, and in line with the reasoning that underlies the NIST recommendation.
To get a sense for the scale, if you encrypted 2^64 1-byte files with XChaPoly, assuming 192-bit nonces and 128-bit authentication tags, you would exceed 750 exabytes of storage required.
If this is still insufficient for your security goals, I recommend adding a key-derivation step with a large (e.g., 256-bit) random value appended to the HKDF info parameter. Bonus: You can also append whatever other fields you want as the "info" parameter. This also provides a nice context-binding, especially if you decide to derive multiple keys from the same input keying material (IKM).
You can then zero out the nonce or make it an extended sample of randomness. Here's a quick sketch of the latter:
random := randomBytes(32)
tmp := hkdfSha512(ikm, 64, pae('encryptionKey', random, instance_id, ...), nil)
encKey := tmp[0:32]
nonce := tmp[32:56]
(Above, pae()
refers to a function like PAE()
from PASETO. See also: TupleHash.)
You need not limit yourself to 2^32 messages with XChaCha20-Poly1305.
Further reading:
thank you very much, very informative and good reading, will take it into account.
On Tue, Nov 12, 2024 at 4:58 PM Scott Arciszewski @.***> wrote:
NIST SP 800-38D recommends the following: The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.
This is for 96-bit nonces. As a rule of thumb, you can cube-root the probability space (in the case of XChaCha20-Poly1305, you start with 2^192) and that is the reciprocal bound such that 2^(n/3) nonces, sampled from a uniformly random distribution, will have a 2^-(n/3) probability of collision.
With 96-bit nonces, 2^(n/3) is 2^32, for a collision risk of 2^(-32). Which basically means, you're at the bound where the probability of a collision is inversely proportional to the number of nonces already generated.
Some people have taken 2^-32 to be a magical safety boundary for NIST compliance. In which case, the 192-bit nonces of XChaPoly gives you 2^80 nonces before 2^-32 risk.
But there's nothing inherently magical or special about 2^-32. A better value to target is the cube-root point I mentioned above.
A nonce space of 2^192 (thus, n = 192) yields 2^64 nonces for a collision risk of 2^-64. That's much easier to reason about, and in line with the reasoning that underlies the NIST recommendation.
To get a sense for the scale, if you encrypted 2^64 1-byte files with XChaPoly, assuming 192-bit nonces and 128-bit authentication tags, you would exceed 750 exabytes of storage required https://www.google.com/search?q=%2824+%2B+16+%2B+1%29+*+2%5E64+bytes.
If this is still insufficient for your security goals, I recommend adding a key-derivation step with a large (e.g., 256-bit) random value appended to the HKDF info parameter. Bonus: You can also append whatever other fields you want as the "info" parameter. This also provides a nice context-binding, especially if you decide to derive multiple keys from the same input keying material (IKM).
You can then zero out the nonce or make it an extended sample of randomness. Here's a quick sketch of the latter:
random := randomBytes(32) tmp := hkdfSha512(ikm, 64, pae('encryptionKey', random, instance_id, ...), nil) encKey := tmp[0:32] nonce := tmp[32:56]
(Above, pae() refers to a function like PAE() from PASETO. See also: TupleHash.)
You need not limit yourself to 2^32 messages with XChaCha20-Poly1305.
Further reading:
- YOLO is not a valid hash construciton https://blog.trailofbits.com/2024/08/21/yolo-is-not-a-valid-hash-construction/
- Blowing out the candles on the Birthday bound https://soatok.blog/2024/07/01/blowing-out-the-candles-on-the-birthday-bound/
— Reply to this email directly, view it on GitHub https://github.com/radumarias/rencfs/issues/47#issuecomment-2470760968, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB5TC4EM2U2UG2YLPVJANYL2AIJQHAVCNFSM6AAAAABJ6WUQTOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINZQG43DAOJWHA . You are receiving this because you were assigned.Message ID: @.***>
-- And in the end, it's not the years in your life that count. It's the life in your years.
Using random nonces runs the risk of repeating them unless the nonce size is particularly large (e. g. 192-bit extended nonces used by the XChaCha20Poly1305 and XSalsa20Poly1305 constructions.
NIST SP 800-38D recommends the following: The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.
Following this guideline, only 4,294,967,296 messages with random nonces can be encrypted under a given key. While this bound is high, it's possible to encounter in practice, and systems which might reach it should consider alternatives to purely random nonces, like a counter or a combination of a random nonce + counter.
To ensure unique nonce between instances we will have this flow:
u8
(unique indata_dir/seq
), use this asinstance_id
--instance-id
keyring
and use it next timeinstance_id | nonce_seq
nonce_seq
indata_dir/seq/instance_id/nonce
nonce_seq
and then use that in nonceThis reduces the number of invocations with unique
key+nonce
. Given our block size is256 kB
we can encrypt7.923×10^16 PB (petabytes)
A possible problem:
To mitigate this we could keep the
nonce_seq
inkeyring
also and usemax(keyring, app_data)
.