Closed BasileiosKal closed 1 year ago
Mechanically, the reason for the limit is that we encode len_in_bytes in two bytes as part of the input to the hash function, which limits the representable value to 2^16 - 1. (But I'm sure you noticed this already! 😄)
As to why only two bytes: that's already far beyond what we need, so we didn't allocate more space in the input for it.
Secondarily: if you're using expand_message_xmd, the counter in the expand loop (steps 9 and 10) is only 1 byte, which also limits the reachable output length to at most 255 hash-output blocks. There's no trouble of this type if you're using expand_message_xof.
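To make the two limits concrete, here is a small Python sketch. The helper names i2osp and max_xmd_output are made up for illustration, and the sketch assumes the 2-byte length field and the 1-byte loop counter are the only caps on expand_message_xmd output.

```python
import hashlib


def i2osp(value: int, length: int) -> bytes:
    """Big-endian, fixed-width integer encoding (I2OSP)."""
    # int.to_bytes raises OverflowError when value does not fit in `length` bytes.
    return value.to_bytes(length, "big")


def max_xmd_output(hash_name: str) -> int:
    """Largest len_in_bytes reachable with expand_message_xmd (hypothetical helper).

    Assumes a 1-byte block counter (at most 255 blocks) and a 2-byte
    len_in_bytes field (at most 2^16 - 1).
    """
    b_in_bytes = hashlib.new(hash_name).digest_size
    return min(255 * b_in_bytes, 2**16 - 1)


print(i2osp(2**16 - 1, 2).hex())   # largest value that fits in two bytes
print(max_xmd_output("sha256"))    # 255 * 32
print(max_xmd_output("sha512"))    # 255 * 64
```

Under these assumptions, the 1-byte counter is the binding limit for the xmd variant with SHA-256 or SHA-512, well before the 2-byte length field comes into play.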
If you want to support longer outputs, one easy possibility is to allocate more bytes to the len_in_bytes encoding. For example, replacing step 3 of expand_message_xof with
3. msg_prime = msg || I2OSP(len_in_bytes, 8) || DST_prime
would support outputs up to 2^64 - 1 bytes in length.
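A minimal Python sketch of that modification, assuming SHAKE128 as the XOF. The function name expand_message_xof_wide and the len_field_bytes parameter are hypothetical. The length check is relaxed to match the wider field, and the result is deliberately not interoperable with the standard 2-byte encoding.

```python
import hashlib


def expand_message_xof_wide(msg: bytes, dst: bytes, len_in_bytes: int,
                            len_field_bytes: int = 8) -> bytes:
    """expand_message_xof with a wider len_in_bytes encoding (illustrative only)."""
    if len(dst) > 255:
        raise ValueError("DST must be at most 255 bytes")
    # The length must fit in the chosen field width.
    if len_in_bytes >= 1 << (8 * len_field_bytes):
        raise ValueError("len_in_bytes does not fit in the length field")
    dst_prime = dst + len(dst).to_bytes(1, "big")
    # Modified step 3: encode len_in_bytes in len_field_bytes bytes instead of 2.
    msg_prime = msg + len_in_bytes.to_bytes(len_field_bytes, "big") + dst_prime
    return hashlib.shake_128(msg_prime).digest(len_in_bytes)


# 70000 bytes is out of reach with a 2-byte length field but fine with 8 bytes.
out = expand_message_xof_wide(b"msg", b"EXAMPLE-DST", 70000)
print(len(out))
```

Because the length encoding is part of the hashed input, outputs for the same len_in_bytes differ between field widths, so both sides of a protocol would need to agree on the width.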
expand_message_xmd would take more surgery: at a minimum, increasing the bit width of the encoding of i on lines 8 and 10. I think that would suffice and would be secure, but I haven't thought about it enough to be sure.
As an alternative that would let you treat expand_message (either variant) as a black box, you could chain together a bunch of expand_message invocations. Something like this probably works (but do some more analysis to be sure!):
# assume total_output_length, msg, and DST are defined in the obvious way
# (DST must be at most 251 bytes so that the 4-byte prefix fits)
numexp = ceil(total_output_length / 65284)
ABORT if numexp >= 2^32
DST_next = I2OSP(0, 4) || DST
for j in (1, ..., numexp):
    tmp = expand_message(msg, DST_next, 65535)
    out_j = substr(tmp, 0, 65284)
    DST_next = I2OSP(j, 4) || substr(tmp, 65284, 251)
result = out_1 || ... || out_numexp
return substr(result, 0, total_output_length)
This lets you produce nearly 2^48 bytes.
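The chaining idea can be sketched in Python roughly as follows. This is an illustration under assumptions, not an analyzed construction: it uses a SHAKE128-based expand_message_xof as the black box (the xmd variant's 1-byte counter would not allow 65535-byte requests), it requires DST to be at most 251 bytes, and the name expand_message_long is made up here.

```python
import hashlib

MAX_LEN = 65535                 # largest len_in_bytes one call accepts
CHAIN_LEN = 251                 # tail bytes of each call reserved for the next DST
OUT_LEN = MAX_LEN - CHAIN_LEN   # 65284 output bytes kept per call


def expand_message_xof(msg: bytes, dst: bytes, len_in_bytes: int) -> bytes:
    """Black-box expand_message, here the xof variant with SHAKE128."""
    if len_in_bytes > 65535 or len(dst) > 255:
        raise ValueError("len_in_bytes or DST too long")
    dst_prime = dst + len(dst).to_bytes(1, "big")
    msg_prime = msg + len_in_bytes.to_bytes(2, "big") + dst_prime
    return hashlib.shake_128(msg_prime).digest(len_in_bytes)


def expand_message_long(msg: bytes, dst: bytes, total_output_length: int) -> bytes:
    """Chain expand_message calls to exceed 65535 bytes (hypothetical helper)."""
    if len(dst) > 255 - 4:
        raise ValueError("DST must leave room for the 4-byte index prefix")
    numexp = -(-total_output_length // OUT_LEN)  # integer ceiling division
    if numexp >= 2**32:
        raise ValueError("requested output too long")
    dst_next = (0).to_bytes(4, "big") + dst
    chunks = []
    for j in range(1, numexp + 1):
        tmp = expand_message_xof(msg, dst_next, MAX_LEN)
        chunks.append(tmp[:OUT_LEN])
        # The next DST is the call index plus the unused tail of this output.
        dst_next = j.to_bytes(4, "big") + tmp[OUT_LEN:]
    return b"".join(chunks)[:total_output_length]


print(len(expand_message_long(b"msg", b"EXAMPLE-DST", 100_000)))
```

Each call's DST depends on the previous call's output, so the calls cannot be parallelized. Whether that matters depends on the application.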
Great! Thank you! That’s a very interesting solution. Thanks for taking the time for this.
Closing since my question was thoroughly answered! Thanks again!
Hello everyone! Sorry for the naive question. I'm trying to understand the reasons behind the upper limit on the expand_message output length. Are there security reasons for it, or is it only for efficiency? For context, we are working on a bbs-signatures draft, and this limit has caused an issue that we are currently discussing.
Any hints will be greatly appreciated.