rabbitmq / ra

A Raft implementation for Erlang and Elixir that strives to be efficient and make it easier to use multiple Raft clusters in a single system.

Automatically recover from checksum failures in WAL #299

Open kjnilsson opened 2 years ago

kjnilsson commented 2 years ago

We could automatically recover from checksum failures encountered during WAL recovery by filtering out the failed entry and any subsequent entries for any WriterId that has failed a check.

This will of course threaten consensus: the resulting log loss may allow a member to be elected that then causes committed entries to be thrown away. So perhaps this should be a configuration setting that is only enabled after encountering an error.
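
A rough sketch of the idea, in Python rather than Erlang and not based on the actual Ra WAL code. The `WalRecord` shape, the `recover` function, and the `auto_recover` flag are all hypothetical names made up for illustration, and `zlib.adler32` is only a stand-in for whatever checksum the WAL really uses. By default a checksum failure aborts recovery; with the opt-in flag set, the failing writer's entry and everything after it for that writer is dropped, while other writers are unaffected.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical WAL record; not the real Ra on-disk format."""
    writer_id: str
    index: int
    payload: bytes
    checksum: int


def make_record(writer_id, index, payload):
    # Stand-in checksum; the real WAL checksum may differ.
    return WalRecord(writer_id, index, payload, zlib.adler32(payload))


def recover(records, auto_recover=False):
    """Replay records in WAL order.

    Returns (recovered_records, failed_writer_ids).
    With auto_recover=False (the assumed default) the first checksum
    failure raises, so an operator has to opt in explicitly.
    """
    failed = set()
    recovered = []
    for rec in records:
        if rec.writer_id in failed:
            # Everything after a failed check for this writer is dropped,
            # so the writer's recovered log stays contiguous.
            continue
        if zlib.adler32(rec.payload) != rec.checksum:
            if not auto_recover:
                raise ValueError(
                    f"checksum failure for writer {rec.writer_id!r} "
                    f"at index {rec.index}"
                )
            failed.add(rec.writer_id)
            continue
        recovered.append(rec)
    return recovered, failed
```

The set of failed writer ids is returned so the caller can log or otherwise surface which members lost entries, since those are the ones that could put committed entries at risk.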