cfrg / draft-irtf-cfrg-hash-to-curve

Hashing to Elliptic Curves

Q: expand_message output len limit #349

Closed BasileiosKal closed 1 year ago

BasileiosKal commented 1 year ago

Hello everyone! Sorry for the naive question. I'm trying to understand the reasons behind the upper limit on the expand_message output length. Are there security reasons for it, or is it only for efficiency?

For context, we are working on a bbs-signatures draft, and this limit has caused an issue that we are currently discussing.

Any hints will be greatly appreciated.

kwantam commented 1 year ago

Mechanically, the reason for the limit is that we encode len_in_bytes in two bytes as part of the input to the hash function, which limits the representable value to 2^16 - 1. (But I'm sure you noticed this already! 😄)

As to why only two bytes: that's already far beyond what we need, so we didn't allocate more space in the input for it.

Secondarily: if you're using expand_message_xmd, the counter in the expand loop (steps 9 and 10) is only 1 byte, which will also limit reachable output length. No trouble of this type if you're using expand_message_xof.
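To make the two limits concrete, here is a minimal Python sketch of expand_message_xmd written from the step list in the draft. It is not a reference implementation: the `i2osp` helper, the `hash_name` parameter, and the SHA-256 default are choices made for this illustration. The comments mark where the 2-byte length field and the 1-byte counter enter the hash input.

```python
import hashlib


def i2osp(value: int, length: int) -> bytes:
    """Big-endian, fixed-width integer encoding; fails if the value does not fit."""
    if value < 0 or value >= 1 << (8 * length):
        raise ValueError("integer does not fit in the requested width")
    return value.to_bytes(length, "big")


def expand_message_xmd(msg: bytes, dst: bytes, len_in_bytes: int,
                       hash_name: str = "sha256") -> bytes:
    """Sketch of expand_message_xmd; hash_name is an assumption of this example."""
    probe = hashlib.new(hash_name)
    b_in_bytes = probe.digest_size   # output size of H
    s_in_bytes = probe.block_size    # input block size of H

    def H(data: bytes) -> bytes:
        return hashlib.new(hash_name, data).digest()

    ell = -(-len_in_bytes // b_in_bytes)  # ceil(len_in_bytes / b_in_bytes)
    # ell > 255 is the 1-byte counter limit; 65535 is the 2-byte length limit.
    if ell > 255 or len_in_bytes > 65535 or len(dst) > 255:
        raise ValueError("requested output or DST is too long")

    dst_prime = dst + i2osp(len(dst), 1)
    z_pad = i2osp(0, s_in_bytes)
    l_i_b_str = i2osp(len_in_bytes, 2)    # len_in_bytes encoded in two bytes
    b_0 = H(z_pad + msg + l_i_b_str + i2osp(0, 1) + dst_prime)
    blocks = [H(b_0 + i2osp(1, 1) + dst_prime)]   # counter value 1, one byte
    for i in range(2, ell + 1):
        mixed = bytes(x ^ y for x, y in zip(b_0, blocks[-1]))
        blocks.append(H(mixed + i2osp(i, 1) + dst_prime))  # counter i, one byte
    return b"".join(blocks)[:len_in_bytes]


if __name__ == "__main__":
    # With SHA-256 the counter caps the output at 255 * 32 = 8160 bytes.
    print(len(expand_message_xmd(b"abc", b"example-DST", 8160)))  # prints 8160
```

With SHA-256 the counter is the tighter bound: 255 blocks of 32 bytes give 8160 bytes, well below 65535. With SHA-512 the same reasoning gives 255 * 64 = 16320 bytes.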

If you want to support longer outputs, one easy possibility is to allocate more bytes to the len_in_bytes encoding. For example, replacing step 3 of expand_message_xof with

3. msg_prime = msg || I2OSP(len_in_bytes, 8) || DST_prime

would support outputs of up to 2^64 - 1 bytes in length.
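A rough Python sketch of that modification, assuming SHAKE128 as the XOF. The function name `expand_message_xof_wide` and the `len_width` parameter are invented for this example and do not appear in the draft. Setting `len_width=2` gives the step list as specified, and any other width produces output that is not interoperable with it.

```python
import hashlib


def expand_message_xof_wide(msg: bytes, dst: bytes, len_in_bytes: int,
                            len_width: int = 8) -> bytes:
    """expand_message_xof with a configurable length-field width (hypothetical).

    len_width=2 matches the draft's step 3; len_width=8 is the variant
    suggested in this thread. SHAKE128 is an assumption of this sketch.
    """
    if len_in_bytes < 0 or len_in_bytes >= 1 << (8 * len_width):
        raise ValueError("len_in_bytes does not fit in the length field")
    if len(dst) > 255:
        raise ValueError("DST is too long")
    dst_prime = dst + len(dst).to_bytes(1, "big")
    # Step 3 with a wider encoding: msg || I2OSP(len_in_bytes, len_width) || DST_prime
    msg_prime = msg + len_in_bytes.to_bytes(len_width, "big") + dst_prime
    return hashlib.shake_128(msg_prime).digest(len_in_bytes)


if __name__ == "__main__":
    out = expand_message_xof_wide(b"abc", b"example-DST", 100_000)
    print(len(out))  # prints 100000, past the 65535-byte cap of the 2-byte field
```

Because the width of the length field changes msg_prime, the 2-byte and 8-byte variants give unrelated outputs even for the same msg, DST, and len_in_bytes.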

expand_message_xmd would take more surgery: at the least, increasing the bit width of the encoding of i on lines 8 and 10. I think that would suffice and would be secure, but I haven't thought about it enough to be sure.

kwantam commented 1 year ago

As an alternative that would let you treat expand_message (either variant) as a black-box, you could chain together a bunch of expand_message invocations. Something like this probably works (but do some more analysis to be sure!):

# assume total_output_length, msg, and DST are defined in the obvious way
numexp = ceil(total_output_length / 65284)
ABORT if numexp >= 2^32
DST_next = I2OSP(0, 4) || DST
for j in (1, ..., numexp):
    tmp = expand_message(msg, DST_next, 65535)
    out_j = substr(tmp, 0, 65284)
    DST_next = I2OSP(j, 4) || substr(tmp, 65284, 251)
result = out_1 || ... || out_numexp
return substr(result, 0, total_output_length)

This lets you produce nearly 2^48 bytes.
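The chaining idea can be sketched in Python roughly as follows. This is an illustration, not an analyzed construction. It assumes SHAKE128-based expand_message_xof as the black box, and the names `CHUNK`, `SEED`, and `expand_message_chained` are made up for this example. Two constraints follow from the pseudocode: DST must be at most 251 bytes so that the 4-byte prefix keeps DST_next within 255 bytes, and each inner call asks for 65535 bytes, which expand_message_xmd cannot deliver with SHA-256 or SHA-512 because of its 1-byte counter. An xmd-based version would need smaller constants.

```python
import hashlib

CHUNK = 65284             # bytes of each inner call that go to the output
SEED = 251                # trailing bytes recycled into the next DST
PER_CALL = CHUNK + SEED   # 65535, the most a single call can return


def expand_message_xof(msg: bytes, dst: bytes, len_in_bytes: int) -> bytes:
    """Black-box expand_message; SHAKE128 is an assumption of this sketch."""
    if len_in_bytes > 65535 or len(dst) > 255:
        raise ValueError("requested output or DST is too long")
    dst_prime = dst + len(dst).to_bytes(1, "big")
    msg_prime = msg + len_in_bytes.to_bytes(2, "big") + dst_prime
    return hashlib.shake_128(msg_prime).digest(len_in_bytes)


def expand_message_chained(msg: bytes, dst: bytes, total_output_length: int,
                           expand=expand_message_xof) -> bytes:
    """Chain inner calls, deriving each DST from the previous call's tail."""
    if len(dst) > SEED:
        raise ValueError("DST must leave room for the 4-byte counter prefix")
    numexp = -(-total_output_length // CHUNK)  # ceil division
    if numexp >= 1 << 32:
        raise ValueError("too many chained calls for a 4-byte counter")
    dst_next = (0).to_bytes(4, "big") + dst
    pieces = []
    for j in range(1, numexp + 1):
        tmp = expand(msg, dst_next, PER_CALL)
        pieces.append(tmp[:CHUNK])
        dst_next = j.to_bytes(4, "big") + tmp[CHUNK:CHUNK + SEED]
    return b"".join(pieces)[:total_output_length]


# Upper bound on the output: (2^32 - 1) calls of 65284 bytes each, just under 2^48.
MAX_TOTAL = ((1 << 32) - 1) * CHUNK

if __name__ == "__main__":
    print(len(expand_message_chained(b"abc", b"example-DST", 200_000)))  # prints 200000
```

One property to be aware of in this sketch: total_output_length is never fed into the hash input, so a shorter request returns a prefix of a longer one made with the same msg and DST. A single expand_message call does not behave that way, since len_in_bytes is part of its input. Whether that matters depends on the application.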

BasileiosKal commented 1 year ago

Great! Thank you! That’s a very interesting solution. Thanks for taking the time for this.

Closing since my question was thoroughly answered! Thanks again!