### Tasks
- [ ] try to reuse existing safekeeper parts more
- [ ] try to make `walproposer_sim` simpler
- [ ] run perf to find bottlenecks and speed up testing
- [x] reproduce and fix voting bug (https://gist.github.com/petuhovskiy/2230a8ec749cbee26d15640f1233c3e4)

Post #5804 review, I think we need to do the following (roughly in order of priority):
1. Fix the commit_lsn assertion. An easy fix would be to track it per safekeeper and ensure it doesn't go down. But ideally, we should reconstruct the committed WAL at any node from logs (or just assert at every step, like in TLA+) and ensure that records in it don't disappear or change.
2. We definitely need 2 walproposers (maybe 3); we can also run a bit with 5 sks for fun.
3. The options need some tweaking. It is important to have everything derived from consts and the seed, but a reasonable fraction of schedules should be able to commit at least something, and this should be easily observable (currently one can grep the debug log for `commit_lsn` entries and compare them). It should also be easy to see all params in a single place, because the relationship between them is very important (e.g. network delay vs. injected events interval).
4. Things from your list, especially repro/fix voting bug :)
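
To make the commit_lsn item above a bit more concrete, here is a rough sketch of what the two checks could look like. It is not taken from #5804: `CommitChecker`, `NodeId`, `Lsn`, and both method names are hypothetical, and the real simulator types will differ. The first method is the "easy fix" (per-safekeeper commit_lsn must not go down); the second is a simplified version of the stronger check (a record once seen as committed at some LSN must never change).

```rust
use std::collections::{BTreeMap, HashMap};

// Hypothetical aliases for this sketch; not the simulator's real types.
type NodeId = u64;
type Lsn = u64;

/// Hypothetical checker fed from simulation events or parsed logs.
#[derive(Default)]
struct CommitChecker {
    /// Highest commit_lsn reported so far by each safekeeper.
    last_commit_lsn: HashMap<NodeId, Lsn>,
    /// Payload of every record observed as committed, keyed by start LSN.
    committed: BTreeMap<Lsn, Vec<u8>>,
}

impl CommitChecker {
    /// Easy fix: commit_lsn on a given safekeeper must never go down.
    fn observe_commit_lsn(&mut self, node: NodeId, lsn: Lsn) -> Result<(), String> {
        let prev = self.last_commit_lsn.entry(node).or_insert(0);
        let old = *prev;
        if lsn < old {
            return Err(format!("commit_lsn went down on sk {node}: {old} -> {lsn}"));
        }
        *prev = lsn;
        Ok(())
    }

    /// Stronger check: a committed record must keep the same payload
    /// no matter which node reports it later.
    fn observe_committed_record(&mut self, start: Lsn, payload: &[u8]) -> Result<(), String> {
        match self.committed.get(&start) {
            Some(old) if old.as_slice() != payload => {
                Err(format!("committed record at LSN {start} changed"))
            }
            Some(_) => Ok(()),
            None => {
                self.committed.insert(start, payload.to_vec());
                Ok(())
            }
        }
    }
}
```

Calling both methods at every simulation step would be close to the "assert at every step like in TLA+" option. Detecting records that disappear would additionally need a pass over each node's WAL up to its commit_lsn, which this sketch leaves out.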

### Motivation
https://github.com/neondatabase/neon/pull/5804 should get merged soon. It has the first revision of simulation testing. This epic will contain possible ideas and follow-ups for it.

### Implementation ideas

### Other related tasks and Epics