Closed martinsumner closed 4 years ago
@aarongibbon
Also, should the file contents be CRC checked? Should we be able to rebuild a manifest from the state of the Journal files?
This is the standard write/rename dance used by Riak:
Just adding the `sync` flag to `file:write_file/3` would help, but that isn't available in R16.
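Where the flag is not available, a similar effect could be had by opening the file explicitly and calling `file:sync/1` before closing it. A minimal sketch, assuming a hypothetical helper module `synced_write_sketch` (not part of leveled or Riak):

```erlang
-module(synced_write_sketch).
-export([write_synced/2]).

%% Hypothetical helper: write Bin to Path and ask the OS to flush it
%% to disk before returning. Not taken from leveled.
-spec write_synced(file:filename(), iodata()) -> ok | {error, term()}.
write_synced(Path, Bin) ->
    case file:open(Path, [write, raw, binary]) of
        {ok, Fd} ->
            Result =
                case file:write(Fd, Bin) of
                    ok -> file:sync(Fd);
                    WriteError -> WriteError
                end,
            %% Always close the handle; report the first error seen.
            CloseResult = file:close(Fd),
            case Result of
                ok -> CloseResult;
                _ -> Result
            end;
        {error, _} = OpenError ->
            OpenError
    end.
```

The caller would then only rename the pending file into place once `write_synced/2` has returned `ok`.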
The leveled_imanifest does not read the file back after writing it, before switching it to be the active file. This means that a poorly timed crash can leave an empty active manifest file:
https://github.com/martinsumner/leveled/blob/f907fb5c97c7f9bee4af5d2bd975b1176992f842/src/leveled_imanifest.erl#L166-L177
The rename accepts the `ok` as proof that the file is written. It should check by reading the file. This can then lead to this on restart:
This can be resolved by removing the empty file.
However, should we have greater protection against this (e.g. read before rename)? Should the handling of corruption be better automated?
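As one possible shape for the read-before-rename and CRC ideas, here is a minimal sketch. The module name `manifest_write_sketch`, the `.pending` suffix, and the `<<CRC:32, Payload/binary>>` layout are all assumptions made for illustration, not the format leveled uses. Note that reading the file back may be served from the page cache, so this would catch an empty or truncated write but is not by itself proof of durability.

```erlang
-module(manifest_write_sketch).
-export([save/2, load/1]).

%% Hypothetical on-disk layout: <<CRC:32, Payload/binary>>, where
%% Payload is term_to_binary(Manifest). ActivePath is assumed to be
%% a string (list), so that ++ can be used to build the pending name.
save(ActivePath, Manifest) ->
    PendingPath = ActivePath ++ ".pending",
    Payload = term_to_binary(Manifest),
    CRC = erlang:crc32(Payload),
    ok = file:write_file(PendingPath, <<CRC:32/integer, Payload/binary>>),
    %% Read the pending file back and verify it before it can
    %% become the active manifest.
    case load(PendingPath) of
        {ok, Manifest} ->
            file:rename(PendingPath, ActivePath);
        _ ->
            _ = file:delete(PendingPath),
            {error, pending_manifest_unreadable}
    end.

load(Path) ->
    case file:read_file(Path) of
        {ok, <<CRC:32/integer, Payload/binary>>} ->
            case erlang:crc32(Payload) of
                CRC ->
                    try binary_to_term(Payload) of
                        Term -> {ok, Term}
                    catch
                        error:badarg -> {error, bad_term}
                    end;
                _ ->
                    {error, crc_mismatch}
            end;
        {ok, _TooShort} ->
            %% Covers the empty-file case described in this issue.
            {error, truncated};
        {error, _} = Error ->
            Error
    end.
```

With a check like `load/1` on startup, an empty or corrupt active manifest could be detected and reported as an error tuple rather than crashing, which leaves room for falling back to an older manifest or rebuilding one.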