Open hayley-leblanc opened 3 years ago
I spent some time digging into this issue and have a bit more information. First, I think the specific error output I reported above might stem from the same issues described in #126. However, I think there may be a separate issue here impacting crash consistency in truncate.
To see this with the example program described above, I added print statements to nova_update_stripe_csum() so that the checksum calculated during the write() is printed. On my machine, that checksum is 0xbd6f81f8. When I inject the crash and remount the file system, stat /mnt/pmem/file0 gives
File: /mnt/pmem/file0
Size: 1 Blocks: 0 IO Block: 4096 regular file
Device: 10300h/66304d Inode: 33 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-12-03 22:36:05.000000000 +0000
Modify: 2021-12-03 22:36:05.000000000 +0000
Change: 2021-12-03 22:36:05.000000000 +0000
Birth: -
That is, the truncate has at least partially gone through, because the size of the file has been updated. As shown above, cat /mnt/pmem/file0 shows that file0's calculated checksum is now 0x17615e49, but the stored checksum is still 0xbd6f81f8, which is why the checksum verification fails. It appears that the truncation operation is not atomic with the checksum updates, which causes the error here and makes the truncated file unreadable. I also tried to figure out where this bug stops occurring (i.e., whether there is a point in truncate() after which crashes do not cause this issue), and it looks like the issue is resolved once nova_update_truncated_block_csum() gets called (by nova_clear_last_page_tail(), which is called by nova_setsize()), i.e., when file0's data checksums are brought into line with its other modifications.
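The mismatch described above can be illustrated with a small toy model. This is only a sketch under loud assumptions: ToyFile, its methods, and the crash_before_csum flag are hypothetical names, zlib.crc32 stands in for NOVA's checksum function, and the checksum is assumed to cover exactly the file's valid bytes, which is not NOVA's actual per-stripe layout. It shows only that a size update which becomes durable without the matching checksum update leaves the file unreadable.

```python
import errno
import zlib


class ToyFile:
    """Toy model: file contents plus a separately stored checksum."""

    def __init__(self) -> None:
        self.data = b""
        self.stored_csum = zlib.crc32(self.data)

    def write(self, payload: bytes) -> None:
        # Data and checksum are updated together on write.
        self.data = payload
        self.stored_csum = zlib.crc32(self.data)

    def truncate(self, new_size: int, crash_before_csum: bool = False) -> None:
        # Step 1: the new size becomes durable.
        self.data = self.data[:new_size]
        if crash_before_csum:
            # Simulated crash: the checksum update below never runs.
            return
        # Step 2: the stored checksum is brought in line with the new size.
        self.stored_csum = zlib.crc32(self.data)

    def read(self) -> bytes:
        # Reads verify the stored checksum against a freshly computed one.
        if zlib.crc32(self.data) != self.stored_csum:
            raise OSError(errno.EIO, "checksum verification failed")
        return self.data


if __name__ == "__main__":
    f = ToyFile()
    f.write(b"abcd")
    f.truncate(1, crash_before_csum=True)
    try:
        f.read()
    except OSError as exc:
        print("read failed:", exc)
```

In this model, a truncate that completes both steps leaves the file readable; only the window between the two steps produces the input/output error.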
Unfortunately, I don't have a fix or workaround for this issue; I think it could be pretty tricky to fix. The operations that would need to become atomic span multiple function calls (they start in nova_handle_setattr_operation() and finish in nova_setsize()), and there doesn't appear to be an existing truncate transaction that they could be added to.
Thanks for the report. Is there an easy way to reproduce it (the program you use, etc.)? Is the dd step in steps 4 and 5 required?
I went back and tested it out, and you don't need dd (although dd is a standard copying utility that should be installed by default; sorry for the lack of clarity on that part). The bug should occur if you just add the goto from step 1 to emulate a crash, mount NOVA in data_csum mode, run the program described in step 3, and then try to read the file.
Here is the program I am using in step 3 to make this bug manifest: test4.zip. It creates a file called file0 on NOVA, and trying to read it after following steps 1-3 should give the checksum verification error.
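For readers without the attachment, the following is a rough sketch of the kind of write-then-truncate sequence such a test might perform. It is not the contents of test4.zip: the function name, the payload, and the fsync calls are assumptions. The only details taken from this thread are that the file ends up with size 1, mode 0755, and contents "a".

```python
import os


def write_then_truncate(path: str, payload: bytes = b"aaaa", new_size: int = 1) -> None:
    """Create a file, write a small payload, then shrink it with ftruncate()."""
    # Mode 0o755 mirrors the permissions in the stat output (subject to umask).
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o755)
    try:
        os.write(fd, payload)       # the write() whose checksum gets stored
        os.fsync(fd)                # assumption: flush before truncating
        os.ftruncate(fd, new_size)  # the truncate that the injected crash interrupts
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Path used in this thread; requires NOVA mounted at /mnt/pmem.
    write_then_truncate("/mnt/pmem/file0")
```

On a file system without the injected crash, the file should simply contain the first byte of the payload afterwards.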
I cannot reproduce the error with test4.cpp. After umount and mount, running cat on the file shows "a", with no errors in dmesg. Is this related to the VM setup? Can you reproduce the issue on a bare-metal machine?
I looked into this a bit more (although it still needs some more investigation). As in #126, I was not able to reproduce the issue on bare metal, and I was able to resolve it on a VM by using QEMU's -cpu host flag. I haven't had a chance to look at exactly what differs between the original VM setup and bare metal or -cpu host, but I assume it's something similar to what I observed for the issue in #126. I can spend some more time looking into it if you consider this a real bug.
Thanks. I have seen people encounter issues with NOVA on VMs, and some flags help to work around them. I am not sure what the issue really is, but if you find out, I am happy to apply a fix.
Hi Andiry,
I believe I've found a crash consistency issue with the truncate() system call in NOVA's data_csum mode. It can be replicated using the following steps:
1. Add goto out; around line 1330 of inode.c (right after nova_handle_setattr_operation() in nova_notify_change()). This emulates a crash that prevents nova_setsize() from running.
2. Load NOVA with data_csum=1 and mount it at /mnt/pmem
3. [...]
4. Use dd to copy the contents of the PM device to a separate file
5. Use dd to load the contents of the separate file back onto the PM device
6. Run cat /mnt/pmem/foo
The attempt to read foo gives an input/output error and NOVA outputs the following error logs:
As far as I have been able to tell, this issue seems to occur if we crash at any point after updating the tail pointer in nova_update_inode() (called by nova_handle_setattr_operation()) and before handling checksums in nova_update_truncated_block_csum(). I don't know the exact root cause or have a fix for this, but I'll take another look when I get a chance.
Thanks!