NVSL / linux-nova

NOVA is a log-structured file system designed for byte-addressable non-volatile memories, developed at the University of California, San Diego.
http://nvsl.ucsd.edu/index.php?path=projects/nova
Other
421 stars 118 forks source link

Possible mkdir crash consistency bugs #91

Closed hayley-leblanc closed 3 years ago

hayley-leblanc commented 3 years ago

Hi NOVA folks,

I believe I have found a few crash consistency bugs in NOVA. They are all related to mkdir(). These were found by recording CLWB/CLFLUSHs, SFENCEs, and non-temporal stores, then re-creating the state of the file system immediately after a crash.

Let's say we create a new directory named A/, fsync A/, and then crash.

  1. The log_head field for the in-NVM inode for the A/ is never explicitly persisted, which can leave the inode with a null log head pointer. Attempts to read the parent directory or the new directory in the recreated crash state get nova error: Dir 34 log is NULL! messages. Also, NOVA allows the creation of new files within A/ with no error, but those files are then inaccessible. I believe this bug can be fixed by adding a call to nova_flush_buffer(&(pi->log_head), CACHELINE_SIZE, 0) to nova_append_dir_entries() in dir.c.

  2. This crash can also potentially cause subsequent attempts to create files to fail. It seems as though NOVA attempts to assign new files the same inode number as was given to A/, which causes the file creation to fail. Attempting to create a new file in A/ after the crash gives the following error:

    [   80.994643] nova error: 
    [   80.994644] nova_new_inode failed ino 22
    [   80.994972] nova error: 
    [   80.994973] nova_create return 0

    My testing tool doesn't find this bug anymore if I add nova_flush_buffer(pi, sizeof(struct nova_inode), 0) in nova_free_inode_log() at the end of the if (pi) if statement.

  3. A/ can also be inaccessible and undeletable if the write to its inode's valid field in nova_lite_transaction_for_new_inode() doesn't make it to NVM by the time the crash occurs. Running ls -l in A's parent directory in the recreated crash state gives the following output:

    [   61.442550] nova error: 
    ls: cannot access 'A': Input/output error
    total 0
    d????????? ? ? ? ?            ? A

    and dmesg gives the following errors:

    [   61.442526] nova: nova_rebuild_inode: inode 34 has been deleted.
    [   61.442536] nova: nova_iget: failed to rebuild inode 34
    [   61.442550] nova error: 
    [   61.442553] nova_lookup: get inode failed: 34

    Attempting to delete A also fails. I think this bug can be fixed by adding nova_flush_buffer(&pi->valid, 1, 0) in nova_lite_transaction_for_new_inode(), immediately after setting pi->valid = 1.

I'm happy to make a PR with my suggested additions, or a subset of them, if y'all would like. They fix these three issues in the specific mkdir; fsync; crash test case outlined above, but that's the only scenario in which I've tested them.

Thanks!

Andiry commented 3 years ago

Hi Hayley,

Thanks for the findings!

1), I assume you are referring to nova_append_dir_init_entries()? There used to be a nova_flush_buffer(). However, when implementing NOVA-fortis, I guess it returns early when metadata_csum==0. So a nova_flush_buffer() is still needed.

2) Yes you are right.

3) I think the logic is:

nova_create_inode_transaction(); // journal pi->valid pi->valid = 1; nova_commit_lite_transaction();

So if crash occurs before the transaction commits, it should be restored to 0. However, if the transaction succeeds, pi->valid is not flushed. Does it comply with your analysis? Also, for nova_flush_buffer() 1 byte should work as expected, but in other places we usually flushes CACHELINE_SIZE.

I really appreciate your help and expect your PR.

hayley-leblanc commented 3 years ago

Thanks! For 1), yes, the bug is in nova_append_dir_init_entries(). On 3) - yes, that matches with my findings.

I'll submit a PR shortly.