celestiaorg / smt

A Go library that implements a Sparse Merkle tree for a key-value map.
https://godoc.org/github.com/celestiaorg/smt
MIT License
138 stars 53 forks source link

Add string version for SimpleMap methods #43

Closed cuonglm closed 3 years ago

cuonglm commented 3 years ago

Currently, all methods of SimpleMap get in a []byte then convert it to string as key. This is ok for GET/DELETE operation, since when the compiler can optimize map lookup to not allocate new string in string([]byte) conversion.

For Set operation, which involve map assign, the string([]byte) conversion will allocate new string. If we instead do the conversion in the caller side, and pass in the string, the we can save allocation.

I did a quick version:

func (sm *SimpleMap) SetString(key string, value []byte) error {
    sm.m[key] = value
    return nil
}

For following benchmarks:

package smt

import (
    "testing"
)

func BenchmarkSimpleMap_Set(b *testing.B) {
    sm := NewSimpleMap()
    s := "foobar"
    bs := []byte(s)

    for i := 0; i < b.N; i++ {
        sm.Set(bs, bs)
    }
}

func BenchmarkSimpleMap_SetString(b *testing.B) {
    sm := NewSimpleMap()
    s := "foobar"
    bs := []byte(s)

    for i := 0; i < b.N; i++ {
        sm.SetString(s, bs)
    }
}

The result:

name                   time/op
SimpleMap_Set-8        18.0ns ± 0%
SimpleMap_SetString-8  8.98ns ± 0%

name                   alloc/op
SimpleMap_Set-8         8.00B ± 0%
SimpleMap_SetString-8   0.00B

name                   allocs/op
SimpleMap_Set-8          1.00 ± 0%
SimpleMap_SetString-8    0.00
cuonglm commented 3 years ago

cc @odeke-em

cuonglm commented 3 years ago

Note that only the SetString matter, the DeleteString and GetString are just the same as Delete and Get, but for completeness of APIs, better to introduce the string version for all methods.

If we don't care about backward-compatible, then we can just change the current methods signature, and also un-export SimpleMap, too, as it's highly coupled to internal usage only.

adlerjohn commented 3 years ago

The provided SimpleMap only exists so users can use the library with no external dependencies, e.g. for testing or prototyping purposes. It's absolutely not meant to be used in production; in production, a k-v store like RocksDB will be used that implements the MapStore interface, and for that it's important that the key used is a byte slice []byte since the key will be a hash or something.

https://github.com/celestiaorg/smt/blob/85c78b7e80b8c70342a662f0b2fd9f883fd59057/mapstore.go#L8-L12

TL;DR: no, the MapStore interface should not be changed to use a string as the key. And SimpleMap should not be removed if we want to allow users to start using the library with no external dependencies to at least prototype.

Also, your new benchmark is highly synthetic as it uses the same key over and over. In practice, that won't be the case. Instead, every operation will use a different key. And each key being different means allocating a new slice each time, something that your benchmark omits. If you allocate a new (different) string each iteration, you'll probably see the benchmarks being much closer together.

cuonglm commented 3 years ago

Hmm, my bad, as SimpleMap is only used for testing.