Closed kjnilsson closed 1 year ago
To do this well, the ra_snapshot behaviour API could do with some modifications. ra_snapshot isn't formally part of the public API of Ra, so we should be able to change it without breaking anyone's code. The only use I found on GitHub is in mnevis, which is an abandoned project under our own control.
Option 3 seems promising, but is it future-proof in case term_to_binary() changes again?
I think it is: a snapshot file is generated once and for all, as it contains the result of commands which were or will be dropped from the WAL in the given term. That said, I'm not confident enough in my understanding of Ra to be sure.
This is because the snapshot meta data map isn't replicated as binary data but is instead re-serialised on the receiver, which means the receiver may not calculate the same checksum over that data (the term_to_binary representation of a map isn't deterministic between OTP versions). As snapshot replication includes binary data from files, we really ought to perform checksum validation if at all possible.
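To make the failure mode concrete, here is a minimal sketch of the same effect outside of Erlang. It is an analogy only, not Ra code: the two serialisers, the `meta` map and its keys are hypothetical stand-ins for term_to_binary on two OTP versions, and JSON plus CRC32 are used purely for illustration. It shows that a checksum computed over re-serialised meta data can differ from the sender's even though the decoded map is identical, whereas a checksum over the forwarded bytes always matches.

```python
import json
import zlib


def serialise_old(meta: dict) -> bytes:
    # Hypothetical "old" encoder: keys in insertion order, default spacing.
    return json.dumps(meta).encode("utf-8")


def serialise_new(meta: dict) -> bytes:
    # Hypothetical "new" encoder: same data, different byte layout
    # (sorted keys, compact separators).
    return json.dumps(meta, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Illustrative snapshot meta data; the keys are assumptions, not Ra's schema.
meta = {"index": 10, "term": 2, "cluster": ["a", "b"]}

# Sender: serialise once and checksum the bytes it sends.
sender_bytes = serialise_old(meta)
sender_crc = zlib.crc32(sender_bytes)

# Receiver A: decodes the map and re-serialises it with a different encoder.
receiver_meta = json.loads(sender_bytes)
reserialised_crc = zlib.crc32(serialise_new(receiver_meta))

# Receiver B: keeps the sender's bytes untouched and checksums those.
forwarded_crc = zlib.crc32(sender_bytes)

print("decoded map equal:     ", receiver_meta == meta)
print("re-serialised crc match:", reserialised_crc == sender_crc)
print("forwarded crc match:    ", forwarded_crc == sender_crc)
```

The point of the sketch is only that validation has to run over the exact bytes the sender checksummed; any step that decodes and re-encodes the map in between makes the comparison unreliable.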
Options:
ra_log_snapshot
(this won't work when sending a snapshot to an old member that still performs the validation).
<<"RASN", Version:32/integer>>). If so, the sender also has the new code and all is well. If not included, the sender has the old logic and the receiver falls back to the old behaviour of serialising the meta data map, but without performing checksum validation. Instead we have to assume the data is fine, then calculate and write a new checksum value into the snapshot file.