cosmos / iavl

Merkleized IAVL+ Tree implementation in Go
Apache License 2.0
417 stars 263 forks source link

Add BatchWithFlusher to iavl tree #771

Closed catShaark closed 1 year ago

catShaark commented 1 year ago

Closes #619

catShaark commented 1 year ago

still needs profiling on live osmosis node to compare the results

tac0turtle commented 1 year ago

will this be able to be part of the next release or should we wait for the next release (unclear when that will be)

catShaark commented 1 year ago

This won't be state breaking so if it's ready we can add this to the next release. But I get errors backporting this to v0.19.x

tac0turtle commented 1 year ago

gentle ping

catShaark commented 1 year ago

I figure out the deadlock issue is due to the mem db iterator being channel based, the iterator will lock the db during iteration which inturn cause the deadlock. There're already issue on cosmos-db related to this https://github.com/cosmos/cosmos-db/issues/56, https://github.com/cosmos/cosmos-db/issues/47

tac0turtle commented 1 year ago

The sdk doesn't use memdb anymore. Maybe we have our own memdb here

catShaark commented 1 year ago

So I have run benchmark on a live osmosis node. Comparing normal iavl batch vs batch with flusher (flush to state at 100KB), the result is almost the same for the case of normal blocks ( blocks that don't cause heavy commit to state ) but for epoch blocks that cause heavy commit, the commit time is reduced by 25%.

catShaark commented 1 year ago
Screen Shot 2023-07-14 at 14 08 14

running 10 blocks including epoch block with 100KB flush

Screen Shot 2023-07-14 at 14 17 39

running 10 blocks including epoch block with normal iavl batch

catShaark commented 1 year ago

the benchmark also proves that iavl with batch is not state breaking. I've run 10000 blocks with no app hash error so far

tac0turtle commented 1 year ago

amazing this is super useful!!

tac0turtle commented 1 year ago

gentle ping on this pr, if you are running this on mainnets then it should be good to merge?

catShaark commented 1 year ago

yes, we can have this on mainnets. Do we want the flush thresh hold to be configurable. If so then we have to change iavl.NewMutableTreeWithOpts which is used in sdk root multi logic, that means we also need to make changes to the sdk

catShaark commented 1 year ago
here is the commit time of a osmosis epoch block : Flush threshold Commit time
25 27.17s
50 26.72s
100 26.8s
250 26.19s
500 27.5s
750 26.76s
1000 41.83s
no-flush 49.61s

Seems like we got the same performance in the range from 25 KB to 750KB. I think we should choose 100KB to be the flush threshold.

tac0turtle commented 1 year ago

gentle ping, need help completing this?

catShaark commented 1 year ago

thanks for pinging, I will finish this by tomorow

catShaark commented 1 year ago

I've made a new PR for this https://github.com/cosmos/iavl/pull/807