OpenZeppelin / merkle-tree

A JavaScript library to generate merkle trees and merkle proofs.
MIT License
453 stars 109 forks source link

Customize leaf hash function #13

Closed artemstrpcv closed 7 months ago

artemstrpcv commented 1 year ago

https://github.com/OpenZeppelin/merkle-tree/blob/b3d988984debfd6c951b8f1799982f191ff5b792/src/standard.ts#L9

In the provided example, the desired algorithm for getting leaves is:

bytes32 leaf = keccak256(bytes.concat(keccak256(abi.encode(addr, amount))));

And this works perfectly fine when there is more than 1 value inside the leaf.

However, when the value inside the leaf is singular, then there is no need for extra concat and keccak256, it only makes gas overhead. And the preferred algorithm is:

bytes32 leaf = keccak256(abi.encodePacked(addr));

This makes make a big difference because there is no possibility to change the leaf algorithm that this library uses, hence all trees and proofs for them can't be synced with the contract's behavior.

frangio commented 1 year ago

However, when the value inside the leaf is singular, then there is no need for extra concat and keccak256

This is not necessarily true. The reason the leaf is double-hashed is to mitigate second preimage attacks. If the leaf is a single bytes32 value the tree is likely vulnerable to those attacks, so I would say those are in fact the cases when it is particularly necessary to do double hashing.

If the singular value is an address and you use encodePacked then the tree is not vulnerable and you could omit the double hash.

The issue is not that standardLeafHash works incorrectly, it's that you want a different leaf hashing algorithm.

Amxx commented 9 months ago

Want to bring that up.

I think we should support cases where:

In that case, we check that the leaves are all 32 bytes, and we use them directly, without any hashing.

This will add support for leaves that are produced using a "non standard hash"

frangio commented 9 months ago

I think customizing the leaf hash (including unhashed leaves) conflicts with the concept of "Standard" Merkle Tree that this class intends to implement. It makes me think that it should remain standard, and at most give a few preset options for hashing. However, using the core API directly is a lot more cumbersome so I see the point in a nicer wrapper with an interface like StandardMerkleTree. There could be a base MerkleTree class with the nice API for that.

Something like this?

type Hasher = <T>(leaf: T, encoding: E) => Bytes;

export class MerkleTree<T, E> {
  static of<T, E>(values: T[], leafEncoding: E, hasher: Hasher<T, E>) {
    ...
  }

  ...
}

leafEncoding and hasher could be optional arguments to support passing pre-hashed leaves but I wouldn't do that. It can always be done by just passing in the identity function as the "hasher".

Amxx commented 9 months ago

I did make a local branch with

type LeafLike = any[] | BytesLike;
type Encoding<T extends LeafLike> = T extends any[] ? string[] : undefined;

export class MerkleTree<T extends LeafLike> {
    static of<T, E>(values: T[], leafEncoding: Encoding<T>) {
        ...
    }
}

And a hashing function that does:

Amxx commented 9 months ago

My issue with the hasher, is that I'm not sure there is a way to dump it / load it.

We would rely on the loading part getting the correct hasher as an argument (we could rehash all the leafs to check its correct ... or disable the hashLeaf function after construction)

frangio commented 9 months ago

My issue with the hasher, is that I'm not sure there is a way to dump it / load it.

Ah, that's true. This would be a point in favor of supporting a predefined set of options. 'double' | 'single' | 'passthrough'

Amxx commented 7 months ago

36 provides a SimpleMerkleTree that support can be used for that. Users of the SimpleMerkleTree are responsible of doing the hasing themselves.