txaty / go-merkletree

Go Merkle Tree. High performance, Supporting parallel run, OpenZeppelin sorting pairs.
https://pkg.go.dev/github.com/txaty/go-merkletree
MIT License
110 stars 17 forks source link

Does this library support Openzeppelin MerkleProof? #39

Open JeremyX2022 opened 3 months ago

JeremyX2022 commented 3 months ago

I find this library sort the hashed leafs two pairs only:

func concatSortHash(b1 []byte, b2 []byte) []byte {
    if bytes.Compare(b1, b2) < 0 {
        return concatHash(b1, b2)
    }
    return concatHash(b2, b1)
}

While this library(github.com/FantasyJony/openzeppelin-merkle-tree-go) sort all the leaves and generate the same root as this js library (https://github.com/OpenZeppelin/merkle-tree)

func (st *StandardTree) MakeTree() ([]byte, error) {

    if len(st.leaves) == 0 {
        return nil, errors.New("Expected non-zero number of leaves")
    }

    // sort hash
    leafValues := make([]*LeafValue, len(st.leaves))
    for k, v := range st.leaves {
        leafValues[k] = v
    }
    sort.Slice(leafValues, func(i, j int) bool {
        return compareBytes(leafValues[i].Hash, leafValues[j].Hash)
    })
       ......

}

I tried to use customer hash function as FantasyJony one by using this config:

func myHashFunc(data []byte) ([]byte, error) {
    k1, err := smt.Keccak256(data)
    if err != nil {
        return nil, err
    }
    k2, err := smt.Keccak256(k1)
    return k2, err
}

I think it's the sorting difference makes the different root value, am I right? (But I tried only two leaves and no sorting, still get different root value.)

If I'm wrong, could you please give me an example code to generate the same result. Thank you.

BTW: It seems your library is fast, but API is not that convenient. If there is an AddLeaf or something like to make it possible to use it in 'iteratoring or streaming style' rather than collect all the data to an array in the first place and pass to this library which allocate arrays inside to waste redundant memory.

JeremyX2022 commented 3 months ago

I see, Openzeppelin merkle tree (https://github.com/OpenZeppelin/merkle-tree) uses an opinionated double leaf hash algorithm.

txaty commented 3 months ago

Hi, thanks for using this library and providing feedbacks.

Yes. This library supports OpenZeppelin Merkle proof. The support was added after https://github.com/txaty/go-merkletree/pull/21. Since in OpenZeppelin's implementation, the hashes are sorted pair-wise during the Merkle tree construction process, you need to set SortSiblingPairs to true in your Config. Also from OpenZeppelin/merkle-tree README, the leaf array is sorted by default, so you may have to sort the data blocks before using it to construct the Merkle Tree. To work around the double leaf hashing, you may use the hashed data as the leaves. Hope this answers your question.

Supporting iteration and streaming is a good proposal and I will consider adding such feature to the library in the future. Thanks.

txaty commented 3 months ago

This will generate the identical root result as https://github.com/FantasyJony/openzeppelin-merkle-tree-go/blob/main/main.go:

package main

import (
    "bytes"
    "fmt"
    "sort"

    smt "github.com/FantasyJony/openzeppelin-merkle-tree-go/standard_merkle_tree"
    "github.com/ethereum/go-ethereum/common/hexutil"
    mt "github.com/txaty/go-merkletree"
)

type dataBlock struct {
    value []interface{}
}

var leafEncodings = []string{
    smt.SOL_ADDRESS,
    smt.SOL_UINT256,
}

func (d *dataBlock) Serialize() ([]byte, error) {
    data, err := smt.AbiPack(leafEncodings, d.value...)
    if err != nil {
        return nil, err
    }
    return smt.Keccak256(data)
}

func main() {
    dataBlocks := []mt.DataBlock{
        &dataBlock{
            value: []interface{}{
                smt.SolAddress("0x1111111111111111111111111111111111111111"),
                smt.SolNumber("5000000000000000000"),
            },
        },
        &dataBlock{
            value: []interface{}{
                smt.SolAddress("0x2222222222222222222222222222222222222222"),
                smt.SolNumber("2500000000000000000"),
            },
        },
        &dataBlock{
            value: []interface{}{
                smt.SolAddress("0x3333333333333333333333333333333333333333"),
                smt.SolNumber("1500000000000000000"),
            },
        },
        &dataBlock{
            value: []interface{}{
                smt.SolAddress("0x4444444444444444444444444444444444444444"),
                smt.SolNumber("1000000000000000000"),
            },
        },
    }

    sort.Slice(dataBlocks, func(i, j int) bool {
        inputBytesI, err := dataBlocks[i].Serialize()
        handleError(err)
        inputBytesJ, err := dataBlocks[j].Serialize()
        handleError(err)
        return bytes.Compare(inputBytesI, inputBytesJ) == 1
    })

    tree, err := mt.New(&mt.Config{
        HashFunc:         smt.Keccak256,
        SortSiblingPairs: true,
    }, dataBlocks)
    handleError(err)

    fmt.Println("MakeTree Root: ", hexutil.Encode(tree.Root))
}

func handleError(err error) {
    if err != nil {
        panic(err)
    }
}
JeremyX2022 commented 3 months ago

I set to 3 leaves, the root value is different since the different handle of odd leaf: copied or left to the next level.

and I changed to 6 leaves:

        {
            smt.SolAddress("0x1111111111111111111111111111111111111111"),
            smt.SolNumber("5000000000000000000"),
        },
        {
            smt.SolAddress("0x2222222222222222222222222222222222222222"),
            smt.SolNumber("2500000000000000000")},
        {
            smt.SolAddress("0x3333333333333333333333333333333333333333"),
            smt.SolNumber("1500000000000000000"),
        },
        {
            smt.SolAddress("0x4444444444444444444444444444444444444444"),
            smt.SolNumber("1000000000000000000"),
        },
        {
            smt.SolAddress("0x5555555555555555555555555555555555555555"),
            smt.SolNumber("10000000000000000"),
        },
        {
            smt.SolAddress("0x6666666666666666666666666666666666666666"),
            smt.SolNumber("100000000000"),
        },

why the root is different?

txaty commented 3 months ago

When calculating the second layer, there will be 3 nodes in total again. Thus the different handles of odd leaf lead to different results. In fact, any leaf numbers that are not $2^n$ will have this issue.

Odd leaf handling in this library is the same as Bitcoin's: https://bitcoin.stackexchange.com/questions/79364/are-number-of-transactions-in-merkle-tree-always-even. The last node in an odd node array is duplicated and hashed with itself.

JeremyX2022 commented 3 months ago

When calculating the second layer, there will be 3 nodes in total again. Thus the different handles of odd leaf lead to different results. In fact, any leaf numbers that are not 2n will have this issue.

Odd leaf handling in this library is the same as Bitcoin's: https://bitcoin.stackexchange.com/questions/79364/are-number-of-transactions-in-merkle-tree-always-even. The last node in an odd node array is duplicated and hashed with itself.

you are right. thanks!