Hi Rohan,

I've found a potential crash-consistency bug in WineFS that occurs during a `write()` call. We found the same issue in PMFS (see https://github.com/NVSL/PMFS-new/issues/8), and it appears to be present in WineFS as well.
The issue is essentially the same as in the linked PMFS report, but I'll describe it again here. I found it in WineFS's `strict` mode. When WineFS uses `pmfs_xip_cow_file_write()` to write to a page that has already been allocated to a file, it calls `pmfs_file_write_fast()` to perform the data write and update the file's size and last-modified time. There is a call to `pmfs_flush_buffer()` at the end of `pmfs_file_write_fast()` to flush the size and time updates, but it is not followed by a store fence. As a result, the `write()` call can return, and another system call can begin and make new durable writes, before the updated file size is durable. Adding a fence after this flush fixes the problem.
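To make the ordering problem concrete, here is a minimal toy model in plain C. It is only a sketch and is not WineFS code: `toy_inode`, `toy_flush_buffer()`, `toy_fence()`, and `toy_write()` are hypothetical names, and the model assumes the worst case, in which a flushed value reaches persistent memory only once a fence has been issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the persistent inode size field. */
struct toy_inode {
    uint64_t size_cached;   /* value visible to the CPU after a store */
    uint64_t size_durable;  /* value that would survive a crash */
    bool     flush_pending; /* flush issued but not yet ordered by a fence */
};

/* Models a cache-line flush: the write-back is requested, not guaranteed. */
static void toy_flush_buffer(struct toy_inode *in)
{
    in->flush_pending = true;
}

/* Models a store fence: every earlier flush is complete once it returns. */
static void toy_fence(struct toy_inode *in)
{
    if (in->flush_pending) {
        in->size_durable = in->size_cached;
        in->flush_pending = false;
    }
}

/*
 * Models the tail of the fast write path: update the size, flush it, and
 * optionally fence. Returns the worst-case size a crash would expose
 * immediately after the call returns.
 */
static uint64_t toy_write(struct toy_inode *in, uint64_t new_size,
                          bool fence_after_flush)
{
    assert(in != NULL);
    in->size_cached = new_size;
    toy_flush_buffer(in);
    if (fence_after_flush)
        toy_fence(in);
    return in->size_durable;
}
```

In this model, starting from a size of 4096, `toy_write(&inode, 8192, false)` returns 4096 (the old size can still be what a crash exposes), while `toy_write(&inode, 8192, true)` returns 8192. On real hardware the flush may well complete without a fence; the point is only that nothing guarantees it has completed by the time `write()` returns.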
Would this be considered a bug in WineFS? It seems to violate the synchronous system call guarantee made by `strict` mode.

Thanks!