paulmillr opened 11 months ago
Hi, thank you for opening the discussion. Here is an answer to point 3 for hash functions: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-185.pdf, Appendix B. They advise using 128 bits with a similar formula.
Here are my ideas for the other points:
I do not know if there is a realistic attack or not. The point is that, according to the paper by Bier et al. mentioned in the IRTF draft (Appendix C.3), the condition for hashToPrivateScalar to behave like a random oracle, and thus output uniformly distributed values at your security level, is to choose k as explained earlier according to your security level. There is also an extension to the NIST curves in Appendix C.4, which explains the numbers in FIPS 186-5.
For a 256-bit group you have about a 128-bit security level. Then, according to the previous formula, $L = 256 + 128 = 384$ bits: that is the output length you need from the CSPRNG.
You can choose bigger values; the bias will only get smaller.
You may add a security parameter to the function, or does `Fp.BYTES + Math.ceil(Math.log2(CURVE_ORDER)/(2*8))` work?
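To see what the proposed expression evaluates to, here is a quick sketch for secp256k1. Note `Math.log2` does not accept BigInt, so the order's binary-string length stands in for `ceil(log2(CURVE_ORDER))`; the constants below are illustrative assumptions, not library values.

```javascript
// secp256k1 group order n (a 256-bit number) and Fp.BYTES for its 256-bit field.
const CURVE_ORDER = 0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141n;
const FP_BYTES = 32;

// bit length of n via its binary representation (stands in for ceil(log2(n)))
const orderBits = CURVE_ORDER.toString(2).length; // 256

// Fp.BYTES + ceil(log2(CURVE_ORDER) / 16): adds half the order's bit length, in bytes
const totalBytes = FP_BYTES + Math.ceil(orderBits / (2 * 8));
console.log(totalBytes); // 48, matching L = (256 + 128) / 8 from the formula above
```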
For 4, I don't see how $Fp$ is related to the security at all. I think the only thing we should base it on is CURVE_ORDER, aka $Fn$ aka nByteLength, which we already have for all curves.
Yes you are right it should be the order of the group.
Regarding 1.
A 2^-64 bias means that the statistical distance between the uniform distribution and the actual distribution (assuming a perfect underlying RNG) is 2^-64 (see the appendix of https://eprint.iacr.org/2023/1254.pdf for a derivation of this bound).
Here is a survey on the consequences of partial information leakage: https://eprint.iacr.org/2020/1506.pdf.
That being said, I believe a bias of 2^-64 should be acceptable for all practical purposes!? (Even statistical security of 2^-40 can be considered practically secure.)
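To make the notion of statistical distance concrete, here is a toy computation (deliberately tiny parameters, just to illustrate how the bias of mod-reduction shrinks as the input gets longer than the modulus):

```javascript
// Statistical distance between the uniform distribution on [0, n) and the
// distribution of (X mod n) for X uniform on [0, 2^L). Brute-force, small L only.
function modBias(L, n) {
  const total = 2 ** L;
  const counts = new Array(n).fill(0);
  for (let x = 0; x < total; x++) counts[x % n]++;
  // statistical distance = 1/2 * sum_i |p_i - 1/n|
  let sd = 0;
  for (const c of counts) sd += Math.abs(c / total - 1 / n);
  return sd / 2;
}

// The bias shrinks as L grows relative to log2(n):
console.log(modBias(8, 10));  // noticeably biased
console.log(modBias(16, 10)); // much closer to uniform
```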
Thanks Matthias, that's very helpful.
Is it solved by 1545230ee59b6e360f0e660cc2eb567d46ff3805 ?
We've decreased bias, yes, but the security story still needs to be understood.
What does bias actually mean? Is there any realistic attack? How many keys would an attacker need to walk through before gaining a meaningful advantage?
Bias refers to the statistical deviation from a perfectly uniform distribution. In the context of cryptographic key generation, bias can be problematic because it may introduce vulnerabilities. While a bias of 2^-64 might seem small, it could potentially be exploited by an attacker who has knowledge of the bias and can use it to their advantage.
To understand the implications, consider an attacker who can observe multiple key generation attempts. With a bias of 2^-64, an attacker might need to analyze around 2^64 keys before gaining a meaningful advantage. This is a massive number and generally considered secure for most practical purposes. However, it's essential to remember that security depends on multiple factors, including the cryptographic primitives used and the specific context of the application.
The security implications also depend on the nature of the attack. The papers you cited (e.g., https://eprint.iacr.org/2023/1254.pdf) delve into the consequences of partial information leakage. For practical security, a bias of 2^-40 is often considered secure. However, the level of security required depends on the specific use case.
We might also ensure entropy quality by collecting sufficient bits (B) from a high-entropy source, using NIST SP 800-90B recommendations to estimate the entropy. If the collected entropy is H bits, the min-entropy per bit is something like E_min = -log2(2^(-H/B)) = H/B.
How was the formula from hash-to-curve draft calculated? Why does it consider 384 bits from CSPRNG to match 128-bit security for a 256-bit curve?
The formula in the hash-to-curve draft calculates the number of bits needed from the CSPRNG to achieve a specific level of security. It takes into account the size of the prime (p) for the elliptic curve and the desired security level (k bits). The formula for L is:
L = ceil((ceil(log2(p)) + k) / 8)
For a 256-bit curve with 128-bit security (k = 128), the formula becomes:
L = ceil((ceil(log2(2^256)) + 128) / 8) = ceil((256 + 128) / 8) = ceil(384 / 8) = 48 bytes.
This calculation ensures that the randomness obtained from the CSPRNG is sufficient to provide a 128-bit security level for the 256-bit curve.
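The formula above is a one-liner; a small sketch makes the 48-byte result easy to check for other curves as well:

```javascript
// hash-to-curve draft formula: L = ceil((ceil(log2(p)) + k) / 8)
// pBits = bit length of the prime, k = target security level in bits
function requiredBytes(pBits, k) {
  return Math.ceil((pBits + k) / 8);
}

console.log(requiredBytes(256, 128)); // 48 bytes for a 256-bit prime at 128-bit security
console.log(requiredBytes(255, 128)); // 48 bytes for Curve25519's 255-bit prime
```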
What other standards are out there with regards to bias?
There are various standards and recommendations related to bias in cryptographic applications. For example, NIST's guidelines for cryptographic algorithms (e.g., FIPS 140-2) provide recommendations for generating random numbers with minimal bias. Additionally, standards such as RFC 6979 offer deterministic approaches for generating cryptographic nonces, reducing the risk of bias. You can also refer to academic papers and recommendations from organizations like the IETF for more specific guidance.
DJB used bias for eddsa nonce: r = mod(sha512(prefix, msg), ed25519.l). Why not?
Bernstein used bias in the context of EdDSA (Edwards-curve Digital Signature Algorithm) to address concerns about the security of nonces. In the case of EdDSA, it's important to avoid reusing nonces to prevent certain attacks. The modulo operation used in generating 'r' ensures that nonces stay within a specific range and avoids any bias that could be introduced by raw CSPRNG output.
How should a generic rule look like? We can't use Fp.BYTES + Math.ceil(Fp.BYTES/2) because it's field - not curve group.
A generic rule for obtaining random values for cryptographic operations, such as private key generation, should consider the following:
As demonstrated in the hash-to-curve draft, we could use a formula that calculates the number of bytes (or bits) needed from the CSPRNG based on the curve's size and the desired security level. It should, theoretically, ensure that the generated random values meet the required security level and avoid bias.
Matthias and Silvain provided some valuable insights, and you can refer to the provided papers for further details.
To avoid bias inside the randomness extraction function (e.g., HMAC-DRBG), I'd suggest analyzing its properties. We need to ensure that the function's output is statistically uniform, ideally using statistical test suites like NIST SP 800-22.
But there IS bias in eddsa. It's $2^{-259}$.
Oops! You're right.
But I don't think it's always the case, no? I might be wrong, but... in this case, the hash function and modulo operation in the 'r' nonce generation might be the source of the issue.
r = sha512(prefix, msg) mod ed25519.l
To my understanding, the 'mod' operation introduces this bias. In practice, the 2^(-259) bias in EdDSA is generally considered negligible and doesn't pose significant security risks, but reducing SHA-512 output mod l for nonce generation is what introduces it.
Again, I might be wrong and I'd be happy to be corrected.
SHA-512 has a fixed output length of 512 bits (64 bytes): regardless of the input size, the hash function always produces a 512-bit output. The nonce 'r' is calculated modulo 'ed25519.l' to ensure that it falls within a specific range. Because the SHA-512 output contains more bits than needed, this reduction can introduce bias.
When you take the modulo of a large number ('sha512(prefix, msg)') by a smaller number ('ed25519.l'), some values in the output range may have a higher probability of occurrence than others. This introduces bias, albeit very small, into the nonce generation process.
The bias can be quantified as 2^(-259), meaning that some nonces are slightly more likely to be selected than others, which could theoretically be exploited by an attacker.
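The 2^(-259) figure can be sanity-checked from the bit lengths alone: SHA-512 outputs 512 bits, and ed25519's group order l is a 253-bit number, so the bias of the mod-reduction is on the order of 2^-(512 - bitlen(l)):

```javascript
// Sanity check of the 2^-259 figure from bit lengths alone.
const ED25519_L = 2n ** 252n + 27742317777372353535851937790883648493n;

// bit length of l via its binary representation
const lBits = ED25519_L.toString(2).length;
console.log(lBits);       // 253
console.log(512 - lBits); // 259 -> bias on the order of 2^-259
```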
As always, just giving info. I forgot a lot about EdDSA, I need to stretch my brain again 😅
FIPS 186-5 Appendix A.2.1 recommends drawing ceil(log2(p)) + k bits from the RNG. Reducing these mod p yields a bias of at most 2^-k, which aligns with targeting k-bit security. For k = 128-bit security and a 255-bit prime p, L = ceil((255 + 128) / 8) = 48 bytes.

DJB's approach for EdDSA (e.g., Ed25519): r = mod(sha512(prefix, msg), ed25519.l). Reducing mod the large prime group order (ed25519.l) ensures uniform distribution: 253 + 127 = 380 bits, rounded to 384 bits (48 bytes).

```javascript
// Conceptual sketch: draw extra bits from the CSPRNG, then reduce mod the group order.
function generateSecureKey(curveProperties, securityLevel) {
  const requiredBits = Math.ceil(Math.log2(curveProperties.prime)) + securityLevel;
  const requiredBytes = Math.ceil(requiredBits / 8);
  const randomBytes = getRandomBytes(requiredBytes); // CSPRNG
  const key = reduceModOrder(randomBytes, curveProperties.order); // bias <= 2^-securityLevel
  return key;
}
```

Note: This is a high-level conceptual representation (getRandomBytes and reduceModOrder are placeholders) and needs to be adapted to specific curve properties and cryptographic libraries.
Your replies don't answer the questions. I highlight the important parts:

- DJB used bias for the eddsa nonce: r = mod(sha512(prefix, msg), ed25519.l). Why not $2^{-128}$?
- We can't use Fp.BYTES + Math.ceil(Fp.BYTES/2), because it's field - not curve group.
To summarize: I understand your questions better now.

- Formula: ceil((ceil(log2(p)) + k) / 8) ensures that the generated number has enough entropy to match the desired security level: ceil((255 + 128) / 8) = 48 bytes. This is to ensure that the randomness covers the entire range of the curve and has additional bits to meet the 128-bit security strength.
- EdDSA nonce: mod(sha512(prefix, msg), ed25519.l) is considered secure due to the high entropy provided by SHA-512 and the large prime group order.
- Field vs. Group Byte Length - Generic Formula: ceil((bitLengthOfGroup + k) / 8) bytes.
- Example Application: ceil((253 + 127) / 8) = ceil(380 / 8) = 48 bytes.

Do you copy this from chatgpt?
For a curve like secp256k1, which has 32-byte (256-bit) $G$, we take 32+8 bytes (256+64 bits) from CSPRNG and mod-divide it by curve order $n$. This follows FIPS guidelines and has $2^{-64}$ bias:
https://github.com/paulmillr/noble-curves/blob/8c48abe16acddaaabfaf75b9d669f515f129a67d/src/abstract/weierstrass.ts#L847-L855
I think it's fine to adjust the amount of bytes. Several questions still arise:

- DJB used bias for the eddsa nonce: r = mod(sha512(prefix, msg), ed25519.l). Why not $2^{-128}$?
- We can't use Fp.BYTES + Math.ceil(Fp.BYTES/2), because it's field - not curve group.