cfrg / draft-irtf-cfrg-hash-to-curve

Hashing to Elliptic Curves

Input limits #336

Open chris-wood opened 2 years ago

chris-wood commented 2 years ago

The document currently doesn't note input limits for hash_to_field, and I'm wondering if we should consider adding these limits. HPKE has similar limits for some of its functions.

The limit is ultimately determined by the underlying hash function. For hash_to_field built on expand_message_xmd, that is a Merkle-Damgård (MD-style) hash with an explicit limit; for hash_to_field built on expand_message_xof, it is a XOF with no explicit limit. (FIPS 202 says that the message is a "bit string of any length that is the input to a SHA-3 function.")

In practice, hitting these limits is unlikely, since SHA-256 accepts inputs of just under 2^61 bytes, and SHA-512 and SHA-384 both accept inputs of just under 2^125 bytes. However, I wonder if we should note this for expand_message_xmd.
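For reference, the byte figures follow from the bit-length bounds in FIPS 180-4 (messages shorter than 2^64 bits for SHA-256, and shorter than 2^128 bits for SHA-384 and SHA-512). A minimal sketch of that arithmetic; the table and helper names are illustrative and not part of the draft:

```python
# Largest message length in bits that each hash accepts (FIPS 180-4:
# l < 2^64 for SHA-256, l < 2^128 for SHA-384 and SHA-512).
MAX_INPUT_BITS = {
    "SHA-256": 2**64 - 1,
    "SHA-384": 2**128 - 1,
    "SHA-512": 2**128 - 1,
}


def max_input_bytes(name: str) -> int:
    """Largest whole-byte message length accepted by the named hash."""
    return MAX_INPUT_BITS[name] // 8


for name in MAX_INPUT_BITS:
    # bit_length() is 61 for SHA-256 and 125 for the other two,
    # i.e. the limits are 2^61 - 1 and 2^125 - 1 bytes.
    print(name, max_input_bytes(name).bit_length())
```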

Here's the definition of expand_message_xmd:

1.  ell = ceil(len_in_bytes / b_in_bytes)
2.  ABORT if ell > 255 or len_in_bytes > 65535 or len(DST) > 255
3.  DST_prime = DST || I2OSP(len(DST), 1)
4.  Z_pad = I2OSP(0, s_in_bytes)
5.  l_i_b_str = I2OSP(len_in_bytes, 2)
6.  msg_prime = Z_pad || msg || l_i_b_str || I2OSP(0, 1) || DST_prime
7.  b_0 = H(msg_prime)
... snip ....
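The quoted steps can be sketched in Python to make the layout of msg_prime concrete. This is only an illustration, assuming H is SHA-256 from the standard hashlib module; `i2osp`, `build_msg_prime`, and the example DST are hypothetical names chosen here, and the snipped steps after b_0 are not reproduced.

```python
import hashlib


def i2osp(value: int, length: int) -> bytes:
    """Big-endian encoding of a non-negative integer in exactly `length` bytes."""
    return value.to_bytes(length, "big")


def build_msg_prime(msg: bytes, dst: bytes, len_in_bytes: int) -> bytes:
    """Steps 1-6 of the quoted pseudocode, with H fixed to SHA-256 (an assumption)."""
    h = hashlib.sha256()
    b_in_bytes = h.digest_size  # 32: output size of H in bytes
    s_in_bytes = h.block_size   # 64: input block size of H in bytes
    ell = -(-len_in_bytes // b_in_bytes)  # integer form of ceil(len_in_bytes / b_in_bytes)
    if ell > 255 or len_in_bytes > 65535 or len(dst) > 255:
        raise ValueError("expand_message_xmd parameters out of range")
    dst_prime = dst + i2osp(len(dst), 1)
    z_pad = i2osp(0, s_in_bytes)
    l_i_b_str = i2osp(len_in_bytes, 2)
    return z_pad + msg + l_i_b_str + i2osp(0, 1) + dst_prime


msg = b"abc"
dst = b"EXAMPLE-DST"  # hypothetical domain separation tag (11 bytes)
msg_prime = build_msg_prime(msg, dst, 32)
b_0 = hashlib.sha256(msg_prime).digest()  # step 7
print(len(msg_prime))  # 82 = 64 (Z_pad) + 3 (msg) + 2 + 1 + 11 (DST) + 1
```

The only variable-length parts of msg_prime are msg and DST, and DST is capped at 255 bytes, so msg is the only component that could push the input to H near its limit.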

Based on this, the length of the input to H, denoted msg_prime, is computed as:

len(msg_prime) = s_in_bytes (* Z_pad *) + 2 (* l_i_b_str *) + 1 (* I2OSP(0, 1) *) + 1 (* I2OSP(len(DST), 1) *) + len(DST) + len(msg)

Based on the limit of H in bytes, denoted H_limit, we could say that the limit on len(msg) for suites that use expand_message_xmd is:

len(msg) <= H_limit - (s_in_bytes + 4 + len(DST))
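Read as a cap on len(msg), the bound is H_limit minus the fixed overhead that expand_message_xmd adds around the message: Z_pad (s_in_bytes in the pseudocode above), the four framing bytes, and the DST. A rough sketch of that calculation; `max_msg_len` is a hypothetical helper, and the SHA-256 figures (64-byte block, limit of 2^61 - 1 bytes) are assumptions taken from FIPS 180-4 rather than from the draft:

```python
def max_msg_len(h_limit: int, s_in_bytes: int, dst_len: int) -> int:
    """Largest len(msg), in bytes, that keeps len(msg_prime) within h_limit."""
    if not 0 <= dst_len <= 255:
        raise ValueError("len(DST) must be between 0 and 255")
    # Z_pad + l_i_b_str (2) + I2OSP(0, 1) (1) + I2OSP(len(DST), 1) (1) + DST
    overhead = s_in_bytes + 2 + 1 + 1 + dst_len
    return h_limit - overhead


SHA256_LIMIT_BYTES = 2**61 - 1  # assumed: largest whole-byte input to SHA-256
SHA256_BLOCK_BYTES = 64         # assumed: s_in_bytes for SHA-256

# With a 38-byte DST the overhead is 64 + 4 + 38 = 106 bytes.
limit = max_msg_len(SHA256_LIMIT_BYTES, SHA256_BLOCK_BYTES, 38)
print(limit == 2**61 - 107)  # True
```

The overhead is at most s_in_bytes + 259 bytes, so the cap on msg is within a few hundred bytes of H_limit itself.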

@kwantam, what do you think?

kwantam commented 2 years ago

Since the bounds are enormous, maybe it makes sense just to put in a quick reminder that one should respect the input size limit, but not go so far as to give an expression for it. Or maybe not. Not clear to me...

chris-wood commented 2 years ago

Yeah, at a minimum, noting the limit exists seems necessary. We can ask the list to see if folks think an expression quantifying it would be additionally helpful (or harmful).