Closed hayley-leblanc closed 2 years ago
The issue is that `nova_rename()` will append a link change entry to the log. `nova_link()`, after that, will modify the existing link change entry instead of appending a new one, because there is no snapshot taken. So a crash after `nova_append_link_change_entry()` leaves the existing link change entry's link count already updated to 2.
Fixed.
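The sequence described in this comment can be illustrated with a toy model. This is only a sketch and not NOVA's code: `toy_inode`, `toy_rename`, `toy_link`, and `crash_during_link_is_consistent` are hypothetical names, and the model assumes that an in-place update to an existing log entry is durable immediately, while a newly appended entry only becomes visible to recovery once the log tail is committed at the end of the operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: every type and function here is hypothetical, not NOVA code. */
#define LOG_CAP 8

struct toy_inode {
    int link_entries[LOG_CAP]; /* link counts recorded by link change entries */
    int written;               /* entries written into the log */
    int committed;             /* entries visible after recovery (log tail) */
    int names;                 /* committed directory entries naming this inode */
};

/* Link count that recovery would rebuild from the committed part of the log. */
static int recovered_links(const struct toy_inode *ino)
{
    return ino->committed > 0 ? ino->link_entries[ino->committed - 1] : 1;
}

/* Rename: append a link change entry (count unchanged) and commit it. */
static void toy_rename(struct toy_inode *ino)
{
    assert(ino->written < LOG_CAP);
    ino->link_entries[ino->written] = recovered_links(ino);
    ino->written++;
    ino->committed = ino->written;
}

/* Link: bump the link count, then add the new name; may "crash" in between. */
static void toy_link(struct toy_inode *ino, bool crash_midway)
{
    if (ino->committed > 0) {
        /* No snapshot: reuse the last committed entry, updated in place. */
        ino->link_entries[ino->committed - 1] += 1;
    } else {
        /* No entry to reuse: append one; invisible until the tail moves. */
        assert(ino->written < LOG_CAP);
        ino->link_entries[ino->written++] = recovered_links(ino) + 1;
    }
    if (crash_midway)
        return;            /* stands in for the injected `goto out;` */
    ino->names += 1;       /* the new directory entry */
    ino->committed = ino->written;
}

/* Returns true if the recovered link count matches the number of names. */
bool crash_during_link_is_consistent(bool rename_first)
{
    struct toy_inode ino = { .written = 0, .committed = 0, .names = 1 };
    if (rename_first)
        toy_rename(&ino);
    toy_link(&ino, true);
    printf("rename_first=%d links=%d names=%d\n",
           rename_first, recovered_links(&ino), ino.names);
    return recovered_links(&ino) == ino.names;
}
```

In this model, `crash_during_link_is_consistent(true)` returns false (recovered link count 2, one name), while `crash_during_link_is_consistent(false)` returns true, which mirrors the observation in the report that the problem only survives recovery when a rename precedes the link.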
Hi,
I believe NOVA may have a crash consistency bug that arises in a specific scenario involving rename and link operations. Suppose we perform the following set of operations on an empty NOVA file system in the default configuration:
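A workload of this shape can be sketched as a small C helper. This is a sketch under assumptions, not the exact commands from this report: the helper name `run_workload` and the source file name `foo` are hypothetical, and the idea that `A/bar` is produced by renaming a top-level file into `A` is inferred from the paths `A/bar` and `bar` mentioned in the text.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Under `root`: mkdir A; create foo; rename foo -> A/bar; link A/bar -> bar.
 * "foo" is an assumed name. Returns the link count of A/bar, or -1 on error.
 */
long run_workload(const char *root)
{
    char dir[4096], src[4096], target[4096], linkpath[4096];
    struct stat st;
    int fd;

    snprintf(dir, sizeof dir, "%s/A", root);
    snprintf(src, sizeof src, "%s/foo", root);
    snprintf(target, sizeof target, "%s/A/bar", root);
    snprintf(linkpath, sizeof linkpath, "%s/bar", root);

    if (mkdir(dir, 0755) != 0) { perror("mkdir"); return -1; }

    fd = open(src, O_CREAT | O_WRONLY | O_EXCL, 0644);
    if (fd < 0) { perror("open"); return -1; }
    close(fd);

    /* The rename is what appends the link change entry discussed here. */
    if (rename(src, target) != 0) { perror("rename"); return -1; }

    /* The hard link is the operation during which the crash is assumed. */
    if (link(target, linkpath) != 0) { perror("link"); return -1; }

    if (stat(target, &st) != 0) { perror("stat"); return -1; }
    return (long)st.st_nlink;
}
```

On a file system that completes all four steps, `run_workload("/mnt/pmem")` should return 2 with both `A/bar` and `bar` present; the report concerns what remains when the last step is interrupted.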
If we crash during the `link` operation, it's possible for the link path `bar` to not be present, but for the target `A/bar` to have a link count of 2. I believe this should be considered a crash consistency bug, since it reveals intermediate state to the user after the crash, and NOVA is meant to update metadata atomically.

It should be possible to reproduce the bug by adding the line `goto out;` after the call to `nova_append_link_change_entry()` on line 393 in namei.c, mounting a fresh NOVA instance at /mnt/pmem, and running the following commands:

If you then unmount and remount the file system, /mnt/pmem/bar is not present, but `stat`-ing /mnt/pmem/A/bar gives the following output:

It seems that this issue may also have implications for other processes/threads trying to read the link concurrently with the process creating the link; with the simulated crash using `goto out`, if you stat /mnt/pmem/A/bar after the link, the link count has been increased but /mnt/pmem/bar is not present. This part of the issue also appears to be present even if the `rename` operation does not take place, but it seems that without the `rename` operation in the workload, the problem is resolved during recovery and the link count is set correctly.

Unfortunately I don't know exactly what the cause of this issue is at the moment and I don't have a fix, but I'll submit a PR if I figure it out.