0chain / 0chain

Züs (formerly 0Chain) is a decentralized blockchain-based storage platform with no vendor lock-in and a 3-layer security - fragmentation, proxy re-encryption, and immutability. It has close to wire speed data performance, free reads, and is ideal for apps as well as backups, AI data, disaster recovery.
https://zus.network

Remove or optimize `chain.stateMutex` #1002

Open peterlimg opened 2 years ago

peterlimg commented 2 years ago

https://0chain.slack.com/archives/C01V27E2Z9N/p1643292284000600

@dabasov :

The problem under discussion: a lot of "kick sharder" messages were sent (I think this was caused by a restart), and the sharder started to push round finalization tasks up to 302441. After several manipulations, block 302437 was finally sent for finalizing. Finalization went well until it got stuck in `rebaseState`, blocked on the `chain.stateMutex` lock.

It is not obvious to me what this lock is used for. It is pretty heavy: since it is held while the state is computed, it can stay locked for a long time during a long state computation and halt finalization completely. The only reason I can see for using it is to guard the MPT's `SetNodeDB`, but that is already guarded by `mpt.mutex`, which ensures data visibility of `c.stateDB`. I don't think we need the lock even in `UpdateState`, since that is guarded by the worker and other mutexes, so the lock might be excessive.

Since this is not a deadlock and hightps-2 somehow recovered after this bug, I decided to discuss it here first with @Peter. Removing the locking can be painful and should be done carefully.

Finalization was started several times for this block, but each attempt got stuck too. I think we do not need to start it so aggressively (for `forceFinalizeRound`). We could use a technique similar to the one used in finalization and check whether the block is already being finalized.
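To make the last suggestion concrete, here is a minimal sketch of an "already being finalized" check. It is not taken from the 0chain code base: `finalizeGuard`, `tryStart`, and `done` are hypothetical names, and the sketch assumes that a round number is enough to identify a finalization in flight. The idea is that a caller such as `forceFinalizeRound` would skip a round whose finalization is still running instead of queueing another attempt behind the same lock.

```go
package main

import (
	"fmt"
	"sync"
)

// finalizeGuard records which rounds have a finalization in flight.
// Hypothetical helper for illustration only; not part of 0chain.
type finalizeGuard struct {
	mu       sync.Mutex
	inFlight map[int64]struct{}
}

func newFinalizeGuard() *finalizeGuard {
	return &finalizeGuard{inFlight: make(map[int64]struct{})}
}

// tryStart reports whether the caller may start finalizing the round.
// It returns false if another finalization of that round is still running.
func (g *finalizeGuard) tryStart(round int64) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if _, busy := g.inFlight[round]; busy {
		return false
	}
	g.inFlight[round] = struct{}{}
	return true
}

// done marks the round's finalization as finished (successfully or not),
// so a later attempt is allowed again.
func (g *finalizeGuard) done(round int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.inFlight, round)
}

func main() {
	g := newFinalizeGuard()
	fmt.Println(g.tryStart(302437)) // first attempt may proceed
	fmt.Println(g.tryStart(302437)) // repeated attempt is rejected
	g.done(302437)
	fmt.Println(g.tryStart(302437)) // allowed again after completion
}
```

The guard's own mutex is held only for a map lookup, so it does not reintroduce the long critical section described above. Whether a stuck finalization should eventually time out and release its entry is a separate design question that this sketch leaves open.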

ma2b0043 commented 2 years ago

@peterlimg when can we expect to close this issue?