Attacking unbalanced binary merkle trees (CVE-2012-2459)

OR13 commented 2 years ago

Does salting members before construction prevent forgery attacks on unbalanced binary Merkle trees?

Would be nice to cite a source regarding this.

RFC 9162 seems unconcerned, but its tree-hashing algorithm differs from both Bitcoin's and our proposal here.

https://datatracker.ietf.org/doc/html/rfc9162#section-2.1.1

joelkek commented 2 years ago

I looked into CVE-2012-2459 (best description I found was https://bitcointalk.org/?topic=102395).

The vulnerability happens because of flaws in Bitcoin's Merkle hash implementation:

The Merkle hash implementation that Bitcoin uses to calculate the Merkle root in a block header is flawed in that one can easily construct multiple lists of hashes that map to the same Merkle root. For example, merkle_hash([a, b, c]) and merkle_hash([a, b, c, c]) yield the same result. This is because, at every iteration, the Merkle hash function pads its intermediate list of hashes with the last hash if the list is of odd length, in order to make it of even length.
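The collision described in the quote can be reproduced with a minimal sketch. It assumes a simplified Bitcoin-style tree (double SHA-256 over concatenated pairs, last hash duplicated on odd-length levels); `sha256d` and `padded_merkle_root` are illustrative names, not Bitcoin's actual functions.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the node hash used in this simplified sketch."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def padded_merkle_root(hashes: list[bytes]) -> bytes:
    """Merkle root that duplicates the last hash on odd-length levels."""
    if not hashes:
        raise ValueError("need at least one hash")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # The padding step that makes [a, b, c] and [a, b, c, c] collide.
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


a, b, c = (sha256d(x) for x in (b"a", b"b", b"c"))

# Two different lists, one root.
assert padded_merkle_root([a, b, c]) == padded_merkle_root([a, b, c, c])
```

The three-element list is padded to [a, b, c, c] internally, so both inputs pass through identical intermediate levels.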

Their band-aid fix was to reject blocks that contain duplicate transaction IDs (commit https://github.com/bitcoin/bitcoin/commit/be8651dde7b59e50e8c443da71c706667803d06d).
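In spirit, that kind of check amounts to refusing any list that contains a repeated ID before the root is computed. A rough sketch, not a translation of the linked commit; `has_duplicate_ids` is a hypothetical helper name:

```python
def has_duplicate_ids(ids: list[bytes]) -> bool:
    """Return True if any ID appears more than once in the list."""
    return len(set(ids)) != len(ids)


# The mutated list from the attack repeats its last entry, so it is rejected.
assert has_duplicate_ids([b"a", b"b", b"c", b"c"])
assert not has_duplicate_ids([b"a", b"b", b"c"])
```

This treats the symptom only: the tree construction itself still maps both lists to the same root.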

To prevent this from happening in the first place, we can define a Merkle hash implementation where instead of duplicating unbalanced elements, we simply promote them to the next level of the tree.

This is the approach used by other Merkle hash implementations, such as THEX (source: https://adc.sourceforge.io/draft-jchapweske-thex-02.html):

For trees that are unbalanced -- that is, they have a number of leaves which is not a power of 2 -- interim hash values which do not have a sibling value to which they may be concatenated are promoted, unchanged, up the tree until a sibling is found.
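The promotion rule can be sketched as follows. This is a simplified illustration, assuming plain SHA-256 over concatenated pairs; it omits the leaf and internal-node prefix bytes that THEX and RFC 9162 use, and `promoting_merkle_root` is an illustrative name.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def promoting_merkle_root(hashes: list[bytes]) -> bytes:
    """Merkle root where an unpaired node is promoted unchanged to the next level."""
    if not hashes:
        raise ValueError("need at least one hash")
    level = list(hashes)
    while len(level) > 1:
        # Hash complete pairs only.
        next_level = [h(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            # No sibling: carry the last node up as-is instead of duplicating it.
            next_level.append(level[-1])
        level = next_level
    return level[0]


a, b, c = (h(x) for x in (b"a", b"b", b"c"))

# Appending a duplicate of the last hash now changes the root.
assert promoting_merkle_root([a, b, c]) != promoting_merkle_root([a, b, c, c])
```

With three leaves the root is h(h(a + b) + c); with the duplicate appended it is h(h(a + b) + h(c + c)), so the two lists no longer collide.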

OR13 commented 2 years ago

The implementation we have here

https://github.com/transmute-industries/verifiable-data/tree/main/packages/merkle-proof

It promotes unpaired children, and in addition every leaf is computed using a deterministic per-message nonce, which is derived from a single random nonce shared by all messages. The nonce is necessary to prevent brute-forcing of the neighbors when a proof of inclusion is disclosed.
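To illustrate why the nonce matters: a proof of inclusion reveals sibling hashes, and if a neighboring member comes from a small set of plausible values, an unsalted sibling hash can simply be guessed. A minimal sketch under assumed names (`guess_member` and the candidate values are hypothetical, and the salting here is a plain prefix rather than the scheme used in the linked package):

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def guess_member(sibling_hash: bytes, candidates: list[bytes]):
    """Try each plausible value and return the one whose hash matches, if any."""
    for candidate in candidates:
        if h(candidate) == sibling_hash:
            return candidate
    return None


candidates = [b"age=17", b"age=18", b"age=19"]

# Unsalted leaf: the disclosed sibling hash leaks the neighbor's value.
unsalted_leaf = h(b"age=18")
assert guess_member(unsalted_leaf, candidates) == b"age=18"

# Salted leaf: without the secret nonce, the same guesses no longer match.
secret_nonce = secrets.token_bytes(32)
salted_leaf = h(secret_nonce + b"age=18")
assert guess_member(salted_leaf, candidates) is None
```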

Since the original implementation did not have this behavior, I am not sure if CVE-2012-2459 applies.

lawwman commented 2 years ago

I have reviewed the implementations above that are known to be vulnerable when given an unbalanced binary Merkle tree. They all share the flaw that @joelkek identified: the Merkle hash function pads its intermediate list of hashes with the last hash if the list is of odd length, in order to make it of even length.

The vulnerability stems from attackers being able to create a second pre-image by "duplicating members" (more specifically, duplicating the last member from a list of odd length) to form an artificially balanced tree.

Given that your implementation does not pad the list of hashes in this way, I don't think CVE-2012-2459 applies.

I would like to also highlight the added benefit of the messageNonce in your implementation (calculateMessageNonce(m, i, rootNonce, hash)). Note that the messageNonce also takes into account each member's index position i.

If we were to extend this messageNonce behaviour to the vulnerable implementations above, the "duplicating members" attack would not work: the duplicated member sits at a different index position i, so it would get a different nonce and its saltedMember would have a different value. Hence the resulting Merkle root should be different.
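This argument can be checked with a small sketch that keeps the vulnerable padding rule but salts each member with an index-dependent nonce. Everything here is an assumption for illustration: `message_nonce` is a hypothetical HMAC-based derivation and not the actual calculateMessageNonce, the all-zero root nonce is fixed only for reproducibility, and the verifier is assumed to recompute the salted leaves from the members themselves.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def message_nonce(root_nonce: bytes, index: int) -> bytes:
    """Hypothetical per-message nonce: HMAC of the index under the root nonce."""
    return hmac.new(root_nonce, index.to_bytes(8, "big"), hashlib.sha256).digest()


def salted_leaves(members: list[bytes], root_nonce: bytes) -> list[bytes]:
    """Leaf i is the hash of (nonce for index i) followed by member i."""
    return [h(message_nonce(root_nonce, i) + m) for i, m in enumerate(members)]


def padded_merkle_root(hashes: list[bytes]) -> bytes:
    """Vulnerable construction: duplicate the last hash on odd-length levels."""
    if not hashes:
        raise ValueError("need at least one hash")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root_nonce = bytes(32)  # assumption: fixed for the example, random in practice
members = [b"a", b"b", b"c"]

root_original = padded_merkle_root(salted_leaves(members, root_nonce))
root_with_duplicate = padded_merkle_root(salted_leaves(members + [b"c"], root_nonce))

# The duplicate at index 3 is salted differently from the member at index 2.
assert root_original != root_with_duplicate
```

The padding step still duplicates the leaf hash at index 2, but a duplicated member at index 3 hashes to a different leaf, so the attacker's four-member list no longer reproduces the padded level.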

w3c-ccg / Merkle-Disclosure-2021

Attacking unbalanced binary merkle trees (CVE-2012-2459) #3