cfrg / draft-irtf-cfrg-hash-to-curve

Hashing to Elliptic Curves

Padding in expand_message_xmd test vectors #265

Closed mratsim closed 4 years ago

mratsim commented 4 years ago

I may be missing something obvious, but in the test vectors from https://github.com/cfrg/draft-irtf-cfrg-hash-to-curve/pull/259 (great idea, by the way) I have an issue with the msg_prime padding:

Vector:

name    = expand_message_xmd
DST     = QUUX-V01-CS02-with-expander
hash    = SHA256
security_param = 128

msg     =
DST_prime = 515555582d5630312d435330322d776974682d657870616e646572
          1b
msg_prime = 000000000000000000000000000000000000000000000000000000
          000000000000000000000000000000000000000000000000000000
          000000000000000000000000000000000000000000000000000000
          000000000000000000000000000000000000000000000000000000
          000000000000000000000000000000000000000000200051555558
          2d5630312d435330322d776974682d657870616e6465721b
uniform_bytes = 2eaa1f7b5715f4736e6a5dbe288257abf1faa028680c1d938cd62a
          c699ead642

Spec:

~~~
expand_message_xmd(msg, DST, len_in_bytes)

Parameters:
- H, a hash function (see requirements above).
- b_in_bytes, ceil(b / 8) for b the output size of H in bits.
  For example, for b = 256, b_in_bytes = 32.
- r_in_bytes, the input block size of H, measured in bytes.
  For example, for SHA-256, r_in_bytes = 64.

Input:
- msg, a byte string.
- DST, a byte string of at most 255 bytes.
  See below for information on using longer DSTs.
- len_in_bytes, the length of the requested output in bytes.

Output:
- uniform_bytes, a byte string

Steps:
1.  ell = ceil(len_in_bytes / b_in_bytes)
2.  ABORT if ell > 255
3.  DST_prime = DST || I2OSP(len(DST), 1)
4.  Z_pad = I2OSP(0, r_in_bytes)
5.  l_i_b_str = I2OSP(len_in_bytes, 2)
6.  msg_prime = Z_pad || msg || l_i_b_str || I2OSP(0, 1) || DST_prime
7.  b_0 = H(msg_prime)
8.  b_1 = H(b_0 || I2OSP(1, 1) || DST_prime)
9.  for i in (2, ..., ell):
10.    b_i = H(strxor(b_0, b_(i - 1)) || I2OSP(i, 1) || DST_prime)
11. uniform_bytes = b_1 || ... || b_ell
12. return substr(uniform_bytes, 0, len_in_bytes)
~~~
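For reference, the steps above can be sketched in Python as follows. This is a minimal illustration built on the standard `hashlib` module, not the draft's reference code; the helper names `i2osp` and `strxor` mirror the spec, while the `hash_name` parameter is an assumption added here so the same function can be run with either SHA-256 or SHA-512.

```python
import hashlib


def i2osp(value: int, length: int) -> bytes:
    """Big-endian encoding of a non-negative integer into `length` bytes."""
    return value.to_bytes(length, "big")


def strxor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def expand_message_xmd(msg: bytes, dst: bytes, len_in_bytes: int,
                       hash_name: str = "sha256") -> bytes:
    """Sketch of the spec steps; `hash_name` is a hypothetical convenience parameter."""
    def h(data: bytes) -> bytes:
        return hashlib.new(hash_name, data).digest()

    b_in_bytes = hashlib.new(hash_name).digest_size  # 32 for SHA-256
    r_in_bytes = hashlib.new(hash_name).block_size   # 64 for SHA-256, 128 for SHA-512

    ell = -(-len_in_bytes // b_in_bytes)             # step 1: ceiling division
    if ell > 255:                                    # step 2
        raise ValueError("requested output is too long")
    if len(dst) > 255:
        raise ValueError("DST must be at most 255 bytes")

    dst_prime = dst + i2osp(len(dst), 1)             # step 3
    z_pad = i2osp(0, r_in_bytes)                     # step 4: one input block of zeros
    l_i_b_str = i2osp(len_in_bytes, 2)               # step 5
    msg_prime = z_pad + msg + l_i_b_str + i2osp(0, 1) + dst_prime  # step 6

    b_0 = h(msg_prime)                               # step 7
    blocks = [h(b_0 + i2osp(1, 1) + dst_prime)]      # step 8
    for i in range(2, ell + 1):                      # steps 9-10
        blocks.append(h(strxor(b_0, blocks[-1]) + i2osp(i, 1) + dst_prime))

    return b"".join(blocks)[:len_in_bytes]           # steps 11-12


if __name__ == "__main__":
    dst = b"QUUX-V01-CS02-with-expander"
    for name in ("sha256", "sha512"):
        print(name, expand_message_xmd(b"", dst, 0x20, name).hex())
```

Because Z_pad is sized from the hash's input block, switching `hash_name` changes both the padding length and the output, which makes a mix-up between hashes visible in msg_prime.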
  1. We can clearly see that msg_prime ends with l_i_b_str || I2OSP(0, 1) || DST_prime, i.e. 002000515555582d5630312d435330322d776974682d657870616e6465721b
  2. However, preceding this 0020... there are 4*54 + 40 = 256 hex zeros.
  3. As the msg is empty, those come from Z_pad.
  4. Z_pad is as long as the hash's input block size; for SHA256 (the hash in this test vector) that is 64 bytes.
  5. In hex, 64 zero bytes would be 128 zeros.
  6. We have 256 hex zeros, i.e. 128 bytes. Was Z_pad mistakenly duplicated?
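The count above can be double-checked with a short script. This is only a sketch: the msg_prime hex is rebuilt from the vector as listed (four full lines of 54 zero digits, then the two remaining lines), and the variable names are illustrative.

```python
import hashlib

DST = b"QUUX-V01-CS02-with-expander"
LEN_IN_BYTES = 0x20

# msg_prime as listed in the vector: four lines of 54 zero digits,
# then a line of 42 zero digits followed by "200051555558", then the last line.
MSG_PRIME_HEX = (
    "0" * 54 * 4
    + "0" * 42 + "200051555558"
    + "2d5630312d435330322d776974682d657870616e6465721b"
)
msg_prime = bytes.fromhex(MSG_PRIME_HEX)

# Known tail of msg_prime: l_i_b_str || I2OSP(0, 1) || DST_prime.
dst_prime = DST + len(DST).to_bytes(1, "big")
suffix = LEN_IN_BYTES.to_bytes(2, "big") + b"\x00" + dst_prime
assert msg_prime.endswith(suffix)

# msg is empty, so everything before the tail is Z_pad.
z_pad = msg_prime[: len(msg_prime) - len(suffix)]
assert z_pad == bytes(len(z_pad))  # all zero bytes

print("Z_pad length in the vector:", len(z_pad), "bytes")
print("SHA-256 block size:", hashlib.sha256().block_size, "bytes")
print("SHA-512 block size:", hashlib.sha512().block_size, "bytes")
```

The padding in the vector is 128 bytes, twice the SHA-256 block size. That length equals the SHA-512 block size, so a doubled Z_pad is not the only possible explanation.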

kwantam commented 4 years ago

Hmm. I just checked that the code does the right thing, so I am guessing the disconnect is in the way things are getting translated into the test vectors (which is good: it would be much worse if the code were broken!).

I may not have time to look into this until tomorrow afternoon or Thursday, but I'll get back to you asap.

Thanks for the report!

kwantam commented 4 years ago

Ah, well, here's an obvious issue: the SHA-256 and SHA-512 test vectors are identical. I'm guessing this means that what we're listing as SHA-256 test vectors are not actually SHA-256 vectors.

kwantam commented 4 years ago

Aha! found it. I'll open a PR now.