hashgraph / hedera-services

Crypto, token, consensus, file, and smart contract services for the Hedera public ledger
Apache License 2.0

Restarting node hangs on PCES replay #10841

Closed. litt3 closed this issue 8 months ago

litt3 commented 8 months ago

I've confirmed that the root cause is the concurrent scheduler getting overloaded. Replaying a large number of events (more than 10k) fails in the way observed.

I believe that this root cause will be solved by https://github.com/hashgraph/hedera-services/issues/10872

As a stopgap measure, I am going to make the hasher use a DIRECT scheduler. Since the hasher is currently only used for PCES replay, this is acceptable. It will be turned back into a CONCURRENT scheduler when the more permanent fix is ready.
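
For readers unfamiliar with the two modes, here is a rough sketch of the difference. It is not the platform's wiring code: `SchedulerSketch`, `TaskScheduler`, `direct`, `concurrent`, and `replay` are hypothetical names made up for illustration, the "hasher" is a stand-in counter, and the sketch does not reproduce the hang. It only shows that a direct scheduler runs the handler on the caller's thread, so nothing is queued, while a concurrent one hands every item to a thread pool, where tasks can accumulate if the producer outpaces the workers.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class SchedulerSketch {

    /** Hypothetical scheduler interface, for illustration only. */
    interface TaskScheduler<T> {
        void put(T item);
    }

    /** DIRECT: the handler runs immediately on the caller's thread; nothing is queued. */
    static <T> TaskScheduler<T> direct(Consumer<T> handler) {
        return handler::accept;
    }

    /** CONCURRENT: each item becomes a task on a shared pool; tasks queue up if the pool falls behind. */
    static <T> TaskScheduler<T> concurrent(ExecutorService pool, Consumer<T> handler) {
        return item -> pool.execute(() -> handler.accept(item));
    }

    /** Pushes eventCount dummy events through a stand-in hasher and returns how many were handled. */
    static int replay(int eventCount, boolean useDirect) {
        AtomicInteger hashed = new AtomicInteger();
        Consumer<Integer> hasher = event -> hashed.incrementAndGet(); // placeholder for real hashing
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            TaskScheduler<Integer> scheduler = useDirect ? direct(hasher) : concurrent(pool, hasher);
            for (int i = 0; i < eventCount; i++) {
                scheduler.put(i);
            }
        } finally {
            // Stop accepting work and wait for any queued tasks to drain.
            pool.shutdown();
            try {
                pool.awaitTermination(30, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return hashed.get();
    }

    public static void main(String[] args) {
        System.out.println("direct handled: " + replay(20_000, true));
        System.out.println("concurrent handled: " + replay(20_000, false));
    }
}
```

In the direct case, `put` does not return until the event has been handled, so the caller is naturally throttled to the hasher's speed. That is the property the stopgap relies on, at the cost of losing parallel hashing.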

litt3 commented 8 months ago

Temporary fix merged. The final fix will be https://github.com/hashgraph/hedera-services/issues/10872