celestiaorg / smt

A Go library that implements a Sparse Merkle tree for a key-value map.

https://godoc.org/github.com/celestiaorg/smt

MIT License

138 stars 53 forks source link

adds support for raw key instead of hashed key #64

Open rach-id opened 2 years ago

rach-id commented 2 years ago

Fixes https://github.com/celestiaorg/smt/issues/40. It adds a keySize field to all SMT related structs and uses it to access values in smt.values map.

Maybe we should add more checks on the keySize so that the treeHasher, SparseMerkleProof, SparseMerkleTree etc, have all the same key size.

Or, we can keep the keySize only inside the treeHasher and no checks will be required: Pros Simpler codebase

Cons keySize is more related to the trees themselves than the hasher...

Also, other checks would be needed when doing operations like Get(key) checking if the key has the correct size.

Update: The keySize is now part of the MapStore and it is being passed around as parameter wherever needed.

rach-id commented 2 years ago

@adlerjohn Please take a look and let me know if this is what is needed so I continue fixing the other tests etc. Currently, only the TestSparseMerkleTreeUpdateBasic is working

codecov-commenter commented 2 years ago

Codecov Report

Merging #64 (a0cc481) into master (5f13c0f) will increase coverage by 3.54%. The diff coverage is 95.29%.

@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
+ Coverage   85.62%   89.16%   +3.54%     
==========================================
  Files           6        6              
  Lines         466      480      +14     
==========================================
+ Hits          399      428      +29     
+ Misses         39       26      -13     
+ Partials       28       26       -2

Impacted Files	Coverage Δ
deepsubtree.go	`71.92% <90.90%> (+10.81%)`	:arrow_up:
proofs.go	`98.01% <92.59%> (-1.99%)`	:arrow_down:
smt.go	`85.90% <94.73%> (+5.11%)`	:arrow_up:
mapstore.go	`94.11% <100.00%> (+5.22%)`	:arrow_up:
treehasher.go	`100.00% <100.00%> (ø)`
utils.go	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 6e634fe...a0cc481. Read the comment docs.

rach-id commented 2 years ago

Questions: 1) Should the ErrWrongKeySize be also a struct ? So the error message also contains the wrong key. Or no need for that ? 2) Should we update the fuzz test or we no longer need it ? since it will always be a constant size key... What about generating tons of constant sized strings ?

rach-id commented 2 years ago

I left some comments in the code.

In general, I think that default behavior of the library should be preserved (enable users to use arbitrary keys), and raw keys support should be actually added (instead of replacing current way of operation).

This can be added as an Option, that changes the behavior of treeHasher.path method to no-op.

Thanks for the comments, will address them tomorrow. For the functionality, I think the issue states replacing that behaviour, it's a breaking change, I believe. Additionally, I think that the tree hasher should no longer be passed to these structures, as it is only hashing everything internally... What can be done is using a default hasher that can be changed using options, for example, but no longer be necessary. What do you think ?

tzdybal commented 2 years ago

Just for clarification: I totally agree, that raw keys can be very useful, I understand the logic behind this change. But IMHO it's a low hanging fruit, to be able to use SMT with both raw (advanced use, turned on by option) and hashed keys (by default). This way, library can be more general-purpose, more developer/user friendly.

rach-id commented 2 years ago

@tzdybal sounds good to have both, yes. I can start digging into it as soon as I get the green light :+1:

adlerjohn commented 2 years ago

Having the option for both modes isn't bad IMO. Having the option for raw keys is a must, beyond that extra features are gravy on top.

rach-id commented 2 years ago

@adlerjohn should I get going with the implementation ? in // with the explorer stuff ? Which one should be the default behaviour ? raw keys or hashed keys ?

Also, what do you think about these ?

Questions:

Should the ErrWrongKeySize be also a struct ? So the error message also contains the wrong key. Or no need for that ?

Should we update the fuzz test or we no longer need it ? since it will always be a constant size key... What about generating tons of constant sized strings ?

I think that the tree hasher should no longer be passed to these structures, as it is only hashing everything internally... What can be done is using a default hasher that can be changed using options, for example, but no longer be necessary.

rach-id commented 2 years ago

Stopping work on this for now. Will pick it up later.