celestiaorg / smt

A Go library that implements a Sparse Merkle tree for a key-value map.
https://godoc.org/github.com/celestiaorg/smt
MIT License
138 stars 53 forks source link

Expose leaf node with smt.GetLeaf() #54

Closed roysc closed 3 years ago

roysc commented 3 years ago

This will allow access to the contents of a leaf node, including the node path and hashed value.

This will support an update to Cosmos SDK ADR-040 by allowing access to the hashed value and path without any redundant hashing.

Note: should we optimize this to allow fetching the leaf node directly rather than descending the tree, as has been done for value lookup?

codecov-commenter commented 3 years ago

Codecov Report

Merging #54 (56045a8) into master (2c90766) will decrease coverage by 1.43%. The diff coverage is 65.38%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #54      +/-   ##
==========================================
- Coverage   86.05%   84.61%   -1.44%     
==========================================
  Files           6        6              
  Lines         466      494      +28     
==========================================
+ Hits          401      418      +17     
- Misses         37       43       +6     
- Partials       28       33       +5     
Impacted Files Coverage Δ
smt.go 79.21% <65.38%> (-1.58%) :arrow_down:
deepsubtree.go 61.11% <0.00%> (-2.36%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2c90766...56045a8. Read the comment docs.

roysc commented 3 years ago

The linked PR is just to change the SMT stored value from the h(k, v) to h(v) (where k, v are the state store KV record). Since the decoupled state store will also store a reverse index mapping the h(k) path back to the original k, we want a way to get the path without performing a redundant hash on k. (Though with a change is planned in https://github.com/celestiaorg/smt/issues/40 to map keys directly in the SMT without hashing, in which case that should be unnecessary.)

We also want to use h(v) to index values in IPLD (not my domain, @i-norden could shed more light on that), so likewise, we want access to the leaf to avoid hashing v again.

adlerjohn commented 3 years ago

we want a way to get the path without performing a redundant hash on k we want access to the leaf to avoid hashing v again.

I 100% guarantee you it's cheaper to perform single hashes than to execute this function (which will perform log(n) hashes and disk reads). By several orders of magnitude.

https://github.com/celestiaorg/smt/blob/56045a88b450b04bedd4a8cebabefa73063dce46/smt.go#L103-L148

roysc commented 3 years ago

Good point. Even a single disk read might be slower, so optimizing it may also be pointless. I'll confer with Ian and cosmos and see what the actual access pattern will look like.

liamsi commented 3 years ago

We also want to use h(v) to index values in IPLD

Can you point me to how you use IPLD? I assume you also store the inner nodes in IPFS, too?

roysc commented 3 years ago

@liamsi I believe so but I'm not certain - @i-norden would have more info. To my knowledge what we need is to iterate over the SMT's nodes (leaf and probably internal) and retrieve their contents - in which case, this GetLeaf accessor won't be needed anyway. I will close this PR.