etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

TLA+ spec for raft consensus algorithm in etcd implementation #17004

Open joshuazh-x opened 1 year ago

joshuazh-x commented 1 year ago

What would you like to be added?

I'd like to add a TLA+ spec for the Raft consensus algorithm as implemented in etcd. The PR and issue are posted in etcd-io/raft. I'm linking them here to get more feedback from the etcd side, since this is relevant to my follow-up work on model-based trace validation, which I hope can contribute to etcd validation.

Why is this needed?

There have been multiple formal specifications of the Raft consensus algorithm in TLA+, following Diego Ongaro's Ph.D. dissertation. However, I have not seen a version that aligns with the raft library implemented in etcd-io/raft, which is known to differ from the original Raft algorithm in some behaviors, e.g. reconfiguration.

etcd and many other applications based on this raft library have been running well for a long time, but I feel it would still be worthwhile to write a TLA+ spec for this specific implementation. It would serve not just to verify the correctness of the model, but also as a foundation for follow-up work on model-based trace validation.

Currently only a limited set of state transitions is included (log replication, leader election, reconfiguration). In the future, we may expand the spec to cover more features such as leases and linearizability.
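
To make the idea of model-based trace validation concrete, here is a minimal sketch in Go. It is not the TLA+ spec from the PR and does not use the etcd-io/raft API: the `Event` type, the `Role` constants, `stepAllowed`, and `ValidateTrace` are all hypothetical names, and the transition rules are a simplified textbook Raft election model (no pre-vote, no log replication, no reconfiguration). It replays a recorded trace and rejects it if any step is not permitted by the model, or if two nodes claim leadership in the same term.

```go
package main

import "fmt"

// Role is a simplified node role (hypothetical; not the etcd-io/raft type).
type Role int

const (
	Follower Role = iota
	Candidate
	Leader
)

// Event is one hypothetical trace record: the state a node reports
// after taking a step.
type Event struct {
	Node uint64
	Term uint64
	Role Role
}

// stepAllowed reports whether the toy model permits a node to move
// from prev to next. Assumed rules: terms never decrease, starting an
// election bumps the term, and only a candidate can become leader,
// within the same term.
func stepAllowed(prev, next Event) bool {
	if next.Term < prev.Term {
		return false
	}
	switch next.Role {
	case Follower:
		return true // any node may step down or stay a follower
	case Candidate:
		return prev.Role != Leader && next.Term > prev.Term
	case Leader:
		return (prev.Role == Candidate || prev.Role == Leader) &&
			next.Term == prev.Term
	}
	return false
}

// ValidateTrace replays the trace against the toy model. Nodes are
// assumed to start as followers in term 0.
func ValidateTrace(trace []Event) error {
	last := map[uint64]Event{}      // last observed state per node
	leaderOf := map[uint64]uint64{} // term -> node that led it
	for i, ev := range trace {
		prev, seen := last[ev.Node]
		if !seen {
			prev = Event{Node: ev.Node, Term: 0, Role: Follower}
		}
		if !stepAllowed(prev, ev) {
			return fmt.Errorf("event %d: node %d: role %d term %d -> role %d term %d not allowed",
				i, ev.Node, prev.Role, prev.Term, ev.Role, ev.Term)
		}
		if ev.Role == Leader {
			if n, ok := leaderOf[ev.Term]; ok && n != ev.Node {
				return fmt.Errorf("event %d: term %d has two leaders: %d and %d",
					i, ev.Term, n, ev.Node)
			}
			leaderOf[ev.Term] = ev.Node
		}
		last[ev.Node] = ev
	}
	return nil
}

func main() {
	trace := []Event{
		{Node: 1, Term: 1, Role: Candidate},
		{Node: 2, Term: 1, Role: Follower},
		{Node: 1, Term: 1, Role: Leader},
	}
	if err := ValidateTrace(trace); err != nil {
		fmt.Println("trace rejected:", err)
		return
	}
	fmt.Println("trace accepted")
}
```

In a real setup the hand-written rules above would be replaced by the transitions of the TLA+ spec, and the trace would come from instrumented nodes rather than a literal slice.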

joshuazh-x commented 1 year ago

PR: https://github.com/etcd-io/raft/pull/112
Issue: https://github.com/etcd-io/raft/issues/111

stale[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.