This thread will be an ongoing collection of issues with the BLAKE3 design (not the implementation) that have been reported since release. The BLAKE3 paper itself isn't indended to be a living document, so folks working on future hash function designs might want to consider this thread an appendix to the paper. To be clear, we aren't aware of any major issues leading to practical attacks.
Block-Cipher-Based Tree Hashing by Aldo Gunsing shows that an attacker who knows the input and the key (if any) can reverse the extended-output compression function to discover its other inputs. This includes the final block length (which this attacker already knows), the mode flags (which this attacker presumably knows), and the XOF output offset. The exposure of that offset isn't ideal, though for comparison, AES-CTR has a similar property when the attacker knows the AES key. We could've forced an attacker to brute force these parameters by feeding them forward into the final output, the same way the input state words are fed forward. This would be a very niche scenario to design for, and the brute force attack here would be practical in any case. But still it might've been nice to have, and the cost would've been low, just a handful of XORs at the end of the compression function. This issue was documented as a security note in https://github.com/BLAKE3-team/BLAKE3/commit/ea3bc782d8128d7f52008d459ecd4df8b51979cf.
Antonio Marcedone and Balachandar Kesavan found a mistake in the Bao security argument related to seeking. In short, the chunk counter is important for secure seeks, but we weren't aware of that back when added it. The current design might still be the cleanest way to achieve what we need to achieve, but it's also possible that we'll spot some other issues as we try to formalize it. Here's the detailed writeup: https://github.com/oconnor663/bao/issues/41
Here's a part of the paper that could probably be corrected/clarified:
In section 5.3, I described the difference in performance between the row-based SIMD approaches (the first and third approaches in that section) and the word-based SIMD approach (the second) in terms of the overhead cost of diagonalization. That cost isn't zero, but actually most of the difference probably comes from instruction-level parallelism in the SIMD execution units.
This thread will be an ongoing collection of issues with the BLAKE3 design (not the implementation) that have been reported since release. The BLAKE3 paper itself isn't indended to be a living document, so folks working on future hash function designs might want to consider this thread an appendix to the paper. To be clear, we aren't aware of any major issues leading to practical attacks.