cosmos / iavl

Merkleized IAVL+ Tree implementation in Go
Apache License 2.0

feat: Support concurrency for IAVL and fix Racing conditions #805

Closed mattverse closed 1 year ago

mattverse commented 1 year ago

cref: https://github.com/cosmos/iavl/issues/696

This PR fixes the problem stated above by replacing the maps that cause racing conditions with sync.Map, making them thread safe.
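For context, the change boils down to swapping a plain map for a sync.Map, whose Load/Store methods are safe to call from multiple goroutines without external locking. A minimal sketch of the pattern (type and field names here are illustrative, not the actual IAVL code):

```go
package main

import (
	"fmt"
	"sync"
)

// safeCache replaces a plain map[string][]byte. With a plain map,
// concurrent writers trigger Go's fatal "concurrent map writes"
// runtime error; sync.Map synchronizes access internally.
type safeCache struct {
	m sync.Map
}

func (c *safeCache) Set(key string, val []byte) { c.m.Store(key, val) }

func (c *safeCache) Get(key string) ([]byte, bool) {
	v, ok := c.m.Load(key)
	if !ok {
		return nil, false
	}
	return v.([]byte), true
}

func main() {
	c := &safeCache{}
	var wg sync.WaitGroup
	// Eight concurrent writers: fatal with a plain map, fine here.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.Set(fmt.Sprintf("node-%d", i), []byte{byte(i)})
		}(i)
	}
	wg.Wait()
	v, ok := c.Get("node-3")
	fmt.Println(ok, v)
}
```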

Tested this on various chains' nodes that were hitting the concurrency issue by applying the patch manually, and confirmed that after applying this patch the concurrency issue no longer occurs. (Special shout-out to @keplr-wallet for helping out with testing!)

Also confirmed and tested on a testnet node with the following procedure:

mattverse commented 1 year ago

Oh sweet! I'll get back tomorrow with the benchstats @p0mvn suggested

yihuang commented 1 year ago

https://medium.com/@nikolay.dimitrov/comparing-go-performance-of-mutex-sync-map-and-channels-for-syncing-a-map-of-counters-61810c201e17

I just ran this benchmark script locally and got results like this:

Start measuring Mutex ...
Finished measuring Mutex, time taken: 928.267951ms

Start measuring sync.Map ...
Finished measuring sync.Map, time taken: 3.284460863s

Start measuring Channels ...
Finished measuring Channels, time taken: 3.524194982s

so it seems sync.Map is slower than a mutex-protected normal map, so how about we simply add a good old mutex to the mutable tree?
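A rough local reproduction of that comparison (this is a sketch in the spirit of the linked article, not its exact script; goroutine count and key space are arbitrary choices):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// benchMutex does a write-heavy loop on a plain map guarded by a Mutex.
func benchMutex(n int) time.Duration {
	m := map[int]int{}
	var mu sync.Mutex
	start := time.Now()
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < n; i++ {
				mu.Lock()
				m[i%64]++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return time.Since(start)
}

// benchSyncMap does the same write-heavy loop on a sync.Map.
// Note: the load-then-store below is not an atomic increment, so
// counts can be lost under contention; this is purely a timing
// sketch, not a correct concurrent counter.
func benchSyncMap(n int) time.Duration {
	var m sync.Map
	start := time.Now()
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < n; i++ {
				v, _ := m.LoadOrStore(i%64, 0)
				m.Store(i%64, v.(int)+1)
			}
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	fmt.Println("mutex:   ", benchMutex(100000))
	fmt.Println("sync.Map:", benchSyncMap(100000))
}
```

On a write-heavy loop like this, the mutex version typically wins, matching the numbers above.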

mattverse commented 1 year ago

@yihuang Previously, I attempted to use good old mutexes instead of sync.Map (prev attempt here: https://github.com/mattverse/iavl/commit/34d4565e34a56c663a88ae08521f58628b954c36), but I changed to sync.Map for a couple of reasons.

  1. Using mutexes gives us the overhead of having to manage where we lock and unlock manually. The code becomes very dirty, and since we are manually locking & unlocking we might miss parts that actually cause concurrency issues. In my previous attempt to use mutexes, although I did add locks and unlocks in the places where I thought the concurrency issue was happening, it did not seem to solve the problem (although it did relieve some of it). I suspect this is because I probably missed locking and unlocking in some parts of the code.

  2. The given bench test does not seem fair for comparing sync.Map and mutexes. sync.Map performs especially well under low writes and high reads. However, the given bench test writes iteratively, which would indeed produce better bench stats for mutexes.
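Point 2 matches what the sync.Map documentation says: it is optimized for caches where a key is written once and then read many times, because stable keys are served from a lock-free read path. A minimal sketch of that access pattern (names are illustrative, not the actual IAVL fastnode code):

```go
package main

import (
	"fmt"
	"sync"
)

// fastNodeCache sketches the write-once, read-many pattern that
// sync.Map is optimized for: LoadOrStore publishes a value the
// first time a key is seen, and subsequent reads of that key hit
// sync.Map's lock-free read path.
type fastNodeCache struct {
	m sync.Map
}

// GetOrCompute returns the cached value for key, computing and
// storing it only when the key has not been seen before.
func (c *fastNodeCache) GetOrCompute(key string, compute func() []byte) []byte {
	if v, ok := c.m.Load(key); ok {
		return v.([]byte) // common case: lock-free read
	}
	v, _ := c.m.LoadOrStore(key, compute())
	return v.([]byte)
}

func main() {
	c := &fastNodeCache{}
	calls := 0
	for i := 0; i < 3; i++ {
		c.GetOrCompute("node", func() []byte {
			calls++
			return []byte{1}
		})
	}
	fmt.Println(calls) // sequentially, compute runs only once
}
```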

lmk your thoughts on this!

cool-develope commented 1 year ago

I am still confused about the parallel use case of IAVL: parallel writing will break the tree structure, so the root hash will not be deterministic. Actually, IAVL doesn't provide concurrency now. @yihuang, do you think it is a real issue?

yihuang commented 1 year ago

> I am still confused about the parallel use case of IAVL: parallel writing will break the tree structure, so the root hash will not be deterministic. Actually, IAVL doesn't provide concurrency now.
>
> @yihuang, do you think it is a real issue?

I guess it's concurrent reading and writing, still single writer but multiple readers?

tac0turtle commented 1 year ago

> I am still confused about the parallel use case of IAVL: parallel writing will break the tree structure, so the root hash will not be deterministic. Actually, IAVL doesn't provide concurrency now. @yihuang, do you think it is a real issue?

this is due to the fastnode system; there is an issue with reading and writing that causes a concurrent hashmap error. This aims to fix that issue. It's less about using the iavl tree in parallel.

yihuang commented 1 year ago

my concern is consistency: nodedb uses a mutex to protect some map operations, but here we use sync.Map. Shouldn't we use a consistent approach towards concurrency?

cool-develope commented 1 year ago

> my concern is consistency: nodedb uses a mutex to protect some map operations, but here we use sync.Map. Shouldn't we use a consistent approach towards concurrency?

yeah, I think RWMutex is more reasonable in this case
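The RWMutex approach being suggested looks roughly like this (a sketch of the general pattern nodedb-style code uses; names are illustrative, not the actual IAVL fields):

```go
package main

import (
	"fmt"
	"sync"
)

// nodeCache guards a plain map with an RWMutex: many readers may
// hold RLock concurrently, while writers take the exclusive lock.
type nodeCache struct {
	mtx   sync.RWMutex
	nodes map[string][]byte
}

func newNodeCache() *nodeCache {
	return &nodeCache{nodes: make(map[string][]byte)}
}

func (c *nodeCache) Get(key string) ([]byte, bool) {
	c.mtx.RLock() // shared lock: readers do not block each other
	defer c.mtx.RUnlock()
	v, ok := c.nodes[key]
	return v, ok
}

func (c *nodeCache) Set(key string, val []byte) {
	c.mtx.Lock() // exclusive lock: blocks readers and writers
	defer c.mtx.Unlock()
	c.nodes[key] = val
}

func main() {
	c := newNodeCache()
	c.Set("root", []byte{0xAB})
	v, ok := c.Get("root")
	fmt.Println(ok, v)
}
```

This fits the single-writer, multiple-readers case discussed above, at the cost of manual lock placement.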

mattverse commented 1 year ago

ok, if that's the consensus, I'm down to change it to use sync.RWMutex and re-open the PR!

tac0turtle commented 1 year ago

@mattverse any chance you can make the new pr? otherwise we can do it

mattverse commented 1 year ago

@tac0turtle Code is ready, currently testing, give me 1~2 more days!

mattverse commented 1 year ago

I'm half skeptical it would work; applying the RW mutex was painful, a lot of places caused deadlocks
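A typical source of those deadlocks is that Go mutexes are not reentrant: a method that holds the lock cannot call another method that tries to take it again. A hypothetical sketch of the pitfall and the usual fix (not the actual IAVL code):

```go
package main

import (
	"fmt"
	"sync"
)

// tree sketches why retrofitting a lock onto existing code
// deadlocks easily: Increment holds the write lock, and naively
// calling the exported Get from inside it would try to acquire
// the lock again, blocking forever.
type tree struct {
	mtx  sync.RWMutex
	data map[string]int
}

func (t *tree) Get(key string) int {
	t.mtx.RLock()
	defer t.mtx.RUnlock()
	return t.data[key]
}

// getLocked is the usual fix: an unexported variant that assumes
// the caller already holds the lock.
func (t *tree) getLocked(key string) int { return t.data[key] }

func (t *tree) Increment(key string) int {
	t.mtx.Lock()
	defer t.mtx.Unlock()
	// Calling t.Get(key) here would deadlock, since sync.RWMutex
	// is not reentrant; use the locked-caller variant instead.
	v := t.getLocked(key) + 1
	t.data[key] = v
	return v
}

func main() {
	t := &tree{data: map[string]int{"h": 1}}
	fmt.Println(t.Increment("h")) // 2
}
```

Auditing every such call path by hand is exactly the overhead sync.Map avoids.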

mattverse commented 1 year ago

yep, getting an error:

{"level":"info","module":"consensus","commit_round":0,"height":2277484,"commit":"CC3E91B29F5D0820397C70A0EA459E5DA2AEC9F83B006597BAB2D7FA79B00A29","proposal":{},"time":"2023-08-21T11:08:01Z","message":"commit is for a block we do not know about; set ProposalBlock=nil"}
{"level":"info","module":"consensus","hash":"CC3E91B29F5D0820397C70A0EA459E5DA2AEC9F83B006597BAB2D7FA79B00A29","height":2277484,"time":"2023-08-21T11:08:01Z","message":"received complete proposal block"}

state.go:1563 +0x2ff\ngithub.com/tendermint/tendermint/consensus.(*State).handleCompleteProposal(0xc000f1bc00, 0xc0013cca00?)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1942 +0x399\ngithub.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc000f1bc00, {{0x2a9ea40, 0xc0099e38a8}, {0xc00316f650, 0x28}})\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:834 +0x1b7\ngithub.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc000f1bc00, 0x0)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:760 +0x3f0\ncreated by github.com/tendermint/tendermint/consensus.(*State).OnStart\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:379 +0x12d\n","time":"2023-08-21T11:08:01Z","message":"CONSENSUS FAILURE!!!"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{}},"msg":{},"time":"2023-08-21T11:08:01Z","message":"service stop"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{},"ID":"group:U/panic.go:884 +0x213\ngithub.com/tendermint/tendermint/consensus.(*State).finalizeCommit(0xc000f1bc00, 0x22c06c)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1594 +0x1105\ngithub.com/tendermint/tendermint/consensus.(*State).tryFinalizeCommit(0xc000f1bc00, 0x22c06c)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1563 +0x2ff\ngithub.com/tendermint/tendermint/consensus.(*State).handleCompleteProposal(0xc000f1bc00, 0xc0013cca00?)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1942 +0x399\ngithub.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc000f1bc00, {{0x2a9ea40, 0xc0099e38a8}, {0xc00316f650, 0x28}})\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:834 +0x1b7\ngithub.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc000f1bc00, 0x0)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:760 +0x3f0\ncreated by github.com/tendermint/tendermint/consensus.(*State).OnStart\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:379 +0x12d\n","time":"2023-08-21T11:08:01Z","message":"CONSENSUS FAILURE!!!"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{}},"msg":{},"time":"2023-08-21T11:08:01Z","message":"service stop"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{},"ID":"group:UL0yDN8TXC0W:/root/.osmosisd/data/cs.wal/wal","Head":{"ID":"UL0yDN8TXC0W:/root/.osmosisd/data/cs.wal/wal","Path":"/root/.osmosisd/data/cs.wal/wal"},"Dir":"/root/.osmosisd/data/cs.wal"},"msg":{},"time":"2023-08-21T11:08:01Z","message":"service stop"}
{"level":"info","module":"p2p","peer":{"id":"0e6f924a3afe15996bc3f2612a9dd3bcf1c64212","ip":"222.106.187.14","port":53400},"impl":"Peer{MConn{222.106.187.14:53400} 0e6f924a3afe15996bc3f2612a9dd3bcf1c64212 out}","msg":{},"time":"2023-08-21T11:08:01Z","message":"service start"}
{"level":"info","module":"p2p","peer":{"id":"0e6f924a3afe15996bc3f2612a9dd3bcf1c64212","ip":"222.106.187.14","port":53400},"impl":"MConn{222.106.187.14:53400}","msg":{},"time"

https://github.com/mattverse/iavl/tree/mattverse/rwmutex-iavl This is the attempt I made; I would be more than happy if you guys want to take it over from this point! lmk on the updates

tac0turtle commented 1 year ago

I'd say for the current iavl this PR is fine as is. In iavl 2.0 we will remove the fastnode system, so it's only present for a little while. @cool-develope @yihuang thoughts?

cool-develope commented 1 year ago

I don't think the above error is related to RWMutex or sync.Map; it looks like parallel writing is being attempted and breaking determinism. But yeah, I agree with the temporary solution if this PR resolves them.

yihuang commented 1 year ago

> I'd say for the current iavl this PR is fine as is. In iavl 2.0 we will remove the fastnode system, so it's only present for a little while. @cool-develope @yihuang thoughts?

no problem, since it's simpler.

tac0turtle commented 1 year ago

@mergifyio backport release/v1.x.x

mergify[bot] commented 1 year ago

> backport release/v1.x.x

✅ Backports have been created

* [#825 feat: Support concurrency for IAVL and fix Racing conditions (backport #805)](https://github.com/cosmos/iavl/pull/825) has been created for branch `release/v1.x.x`

mattverse commented 1 year ago

@tac0turtle Can we get this in for sdk v0.50?

tac0turtle commented 1 year ago

yup, already backported. We will test it on existing chains for a bit, then cut the final release alongside the sdk eden release

mattverse commented 1 year ago

Sweet, should I also create subsequent PRs for old major versions so that it's compatible with old sdk versions as well? It's not state-breaking, so it should be patchable

tac0turtle commented 1 year ago

we would need to backport it to v0.20 for older versions. We can do this and bump the old ones.

mattverse commented 1 year ago

ok, so the steps would be to first backport to v0.20, and then we'd be able to bump the older ones. Do you want me to handle the merge conflicts for backporting it to v0.20?

mattverse commented 1 year ago

@tac0turtle wdym by "original"? The one without changes? If so, which version of sdk are you testing with? My node seemed fine

tac0turtle commented 1 year ago

sorry, different issue. git bisect led to this first, but it was wrong