Closed relyt29 closed 1 month ago
I would like to know more about the the potential non-determinism here.
In this specific case, the parallelization was added because we are dealing with data structures of different topics, so there's no risk of race conditions afaik.
non determinism because the order of operations is now non-deterministic as the different threads will interrupt and be ordered differently in the scheduler on different computers
according to my understanding the order of writes into the IAVL tree on disk does produce different trees for different orders of insertions, even if the data is the same
we have not seen evidence of this being a real issue on our testnets, but I advocate for removing the (extremely convoluted, hard to read) parallelization code anyways
What is the purpose of the change
This change simplifies a hard-to-read and hard-to-reason about piece of code that parallelized the picking of churn ready topics from the active set. After this PR the code is easier to read and understand, and has less likelihood of causing non-determinism issues. The tradeoff is that the code may run slightly slower in serial.
Testing and Verifying
This change is not trivial, however it is a small diff / refactor PR and our existing test cases should be able to cover the change.
Documentation and Release Note
This PR does not have any implications for documentation.