Open hayley-leblanc opened 2 years ago
Adding a higher-level overview to explain why I think this behavior is incorrect: my understanding of WineFS's strict mode is that if the system crashes before the transaction in pmfs_xip_file_write() is committed, the entire data write should be rolled back during recovery. In the provided steps, we are emulating a crash before this transaction's commit block is written during the second write(), so I expect the contents of the 1024-byte write to not be present in the crash state.
I initially discovered this bug using our crash consistency testing tool, which constructs some crash states in which only a portion of the written data is persisted before a crash. The consistency checks on these tests fail because the partial write is present after the crash. However, that case is harder to reproduce without the tool than simply injecting a crash between the full data write and the transaction commit :)
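The consistency check described above can be sketched roughly as follows. This is only an illustration, not the testing tool's actual code: the function names, the 'a'/'b' fill bytes as defaults, and the assumption that the overwrite starts at offset 0 are all hypothetical choices made for this example.

```python
def classify_crash_state(data: bytes, write_len: int = 1024,
                         old: bytes = b"a", new: bytes = b"b") -> str:
    """Classify file contents after recovery, relative to one overwrite at offset 0.

    Hypothetical helper: assumes the file held only `old` bytes before the
    overwrite and that the overwrite consisted only of `new` bytes.
    """
    head, tail = data[:write_len], data[write_len:]
    if tail != old * len(tail):
        return "corrupt"  # bytes outside the overwritten range changed
    if head == old * len(head):
        return "old"      # overwrite fully rolled back
    if head == new * len(head):
        return "new"      # overwrite fully present
    return "torn"         # only part of the overwrite is present


def violates_strict_atomicity(data: bytes, committed: bool, **kwargs) -> bool:
    """Return True if the contents do not match the expected post-crash state.

    Expectation taken from the report: an uncommitted write should be rolled
    back ("old"); a committed write should be fully present ("new").
    """
    expected = "new" if committed else "old"
    return classify_crash_state(data, **kwargs) != expected
```

With this check, a crash state holding all 1024 'b' bytes is flagged when the transaction was not committed, and a partially persisted write is flagged regardless of commit status.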
Thanks! This is indeed a crash consistency bug that breaks the atomicity of writes in WineFS's strict mode. When the size of the file is <= 4KB, the root block of the file is not copied on write during a file update, so the update is not atomic. The fix for the issue is being handled in #6.
Hi Rohan,
I think I've found a situation where writes may not occur atomically with respect to crashes even in strict mode. Here are the steps to reproduce the issue:
This will emulate a crash occurring after calling __pmfs_xip_file_write() but before committing the transaction when performing a write of 1024 bytes. The bug requires two writes to manifest, so making the crash conditional on the write size ensures that we don't emulate it too early.
Mount WineFS with mount -t winefs -o init,strict /dev/pmem0 /mnt/pmem.
Use dd to copy out the contents of /dev/pmem0 to a separate file, unmount WineFS, copy the contents of the file back, and remount. This ensures that we go through recovery code.
After doing these steps, when I do cat /mnt/pmem/file0, I see that the first 1024 bytes have been overwritten with 'b'. This seems like incorrect behavior, since WineFS is being used in strict mode and the transaction for the write was not committed before the crash. I would expect the file to still be all 'a's.
Let me know what you think. Thanks!
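For reference, the two-write workload that the report implies can be sketched as below. This is an assumption-laden illustration rather than the program used originally: the function name run_workload is hypothetical, and the initial file size of 4096 bytes is a guess (the report only says the file is all 'a's and that the second write is 1024 bytes of 'b' at the start of the file).

```python
import os


def run_workload(path: str, file_size: int = 4096, write_len: int = 1024) -> None:
    """Fill a file with 'a', then overwrite its first `write_len` bytes with 'b'.

    file_size is an assumed value; the original report does not state it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)
    try:
        # First write: establish the old contents and make them durable.
        os.write(fd, b"a" * file_size)
        os.fsync(fd)
        # Second write: the 1024-byte overwrite during which the crash is emulated.
        os.pwrite(fd, b"b" * write_len, 0)
        os.fsync(fd)
    finally:
        os.close(fd)
```

On the mounted file system this would be called as run_workload("/mnt/pmem/file0"). With the emulated crash in place, the second write would not be expected to return normally, so the final fsync only matters when the sketch is run on an unmodified file system.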