parallelchain-io / hotstuff_rs

Rust implementation of the HotStuff consensus algorithm.

Integration test sometimes never completes because Number App transactions are dropped #45

Open lyulka opened 3 months ago

lyulka commented 3 months ago

Affected version

HotStuff-rs v0.4, particularly PR #36

Observations

@ZUOYANGDING observed that the progress_and_validator_set_updates test case sometimes never completes. When this happens, the replicas continue to commit empty blocks at ever-higher views, but their number never reaches 4, so the test's final while loop never exits.

The probability that progress_and_validator_set_updates gets into this "stuck" state seems to depend on the particulars of the system the tests are run on. For example, it matters whether the test cases are run on macOS or on Ubuntu. Zuoyang notes that only progress_and_validator_set_updates fails to complete; the other test cases always complete.

Analysis

We suspect that this issue occurs because Number App Increment transactions are sometimes dropped, so number never reaches 4. We also suspect that executing a validator set update may briefly increase the probability that blocks are pruned, which in turn increases the probability that a transaction is dropped.
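
To make the suspected failure mode concrete, here is a toy model. It is not hotstuff_rs code: the function name and the "one Increment per block, submitted exactly once" setup are assumptions made purely for illustration. It shows that once a single block carrying an Increment is pruned, the committed number stays below 4 no matter how many empty blocks are committed afterwards.

```rust
/// Toy model (hypothetical, not part of hotstuff_rs): each Increment
/// transaction is submitted exactly once and is carried by exactly one block.
/// `pruned[i]` says whether the block carrying the i-th Increment was pruned.
/// A pruned block's Increment is lost, because nothing resubmits it.
fn final_number(pruned: &[bool]) -> u32 {
    pruned
        .iter()
        .filter(|&&was_pruned| !was_pruned) // only surviving blocks count
        .count() as u32
}
```

With four Increments and one pruned block, final_number returns 3, so a loop that waits for number to equal 4 would spin forever.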

Proposed solution

Redesign the test suite to have test cases that do not fail just because transactions are dropped (the bottom line is that HotStuff SMR does not guarantee that transactions are not dropped, and in fact, transactions are not a native concept).
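
One possible shape for such a test is sketched below, using only the Rust standard library. The helper poll_until and the simulated lossy submission are hypothetical and are not part of the hotstuff_rs test suite. The idea is to bound the wait with a timeout and to resubmit work between polls, so that a dropped transaction delays the test instead of hanging it.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `done` until it returns true or `timeout`
/// elapses. Between polls it calls `retry`, which a test could use to
/// resubmit transactions that may have been dropped.
fn poll_until<D, R>(mut done: D, mut retry: R, timeout: Duration, interval: Duration) -> bool
where
    D: FnMut() -> bool,
    R: FnMut(),
{
    let deadline = Instant::now() + timeout;
    loop {
        if done() {
            return true;
        }
        if Instant::now() >= deadline {
            return false; // give up instead of looping forever
        }
        retry();
        std::thread::sleep(interval);
    }
}

/// Usage sketch: `number` stands in for the replicated Number App state, and
/// every other submission is "dropped" to imitate a lossy pipeline.
/// Returns whether the target was reached and how many submissions it took.
fn demo_lossy_submissions() -> (bool, u32) {
    let number = Cell::new(0u32);
    let mut submissions = 0u32;
    let reached = poll_until(
        || number.get() >= 4,
        || {
            submissions += 1;
            if submissions % 2 == 0 {
                number.set(number.get() + 1); // this submission survived
            }
        },
        Duration::from_secs(5),
        Duration::from_millis(1),
    );
    (reached, submissions)
}
```

In a real test, the done closure would query each replica's committed state and the retry closure would resubmit an Increment. Note that resubmitting can overshoot if a transaction that was assumed dropped is committed late, so the condition checks that number is at least 4 rather than exactly 4.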