utsaslab / WineFS

WineFS (SOSP 21): a huge-page aware file system for persistent memory
Data loss because pmfs_evict_inode is not atomic #26

Open iaoing opened 2 months ago

Bug

This is a concurrency and crash-consistency bug.

If VFS calls pmfs_evict_inode while another process creates a file or a directory, the newly created file (or directory) can be allocated an inode number that is still in the truncate list. If a crash occurs after the creation but before pmfs_truncate_del, recovery traverses the truncate list, finds that inode number, and alters the size of the newly created file (or directory).

A similar bug could occur in PMFS, since PMFS and WineFS share the same truncate list mechanism.
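
The interleaving described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not WineFS code: `struct pm_image` and every `model_*` function are hypothetical names, the lowest-free-number allocator and the directory size of 4096 are assumptions, and recovery is simplified to "set the size of every inode on the truncate list to 0", following the behaviour observed in this issue. Where exactly the inode is added to the truncate list is not modelled; the sketch only assumes it is already there when the inode number is freed.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INODES 64

/* Simplified stand-in for the persistent state (hypothetical layout). */
struct pm_image {
    bool in_use[MAX_INODES];           /* inode number is allocated */
    bool on_truncate_list[MAX_INODES]; /* inode is on the truncate list */
    unsigned long size[MAX_INODES];    /* inode size */
};

/* Allocate the lowest free inode number (assumed policy), or -1 if full. */
static int model_alloc_inode(struct pm_image *pm, unsigned long size)
{
    for (int ino = 0; ino < MAX_INODES; ino++) {
        if (!pm->in_use[ino]) {
            pm->in_use[ino] = true;
            pm->size[ino] = size;
            return ino;
        }
    }
    return -1;
}

/* Eviction up to the point where the inode number becomes reusable.
 * The inode is assumed to be on the truncate list by now. */
static void model_evict_free(struct pm_image *pm, int ino)
{
    assert(ino >= 0 && ino < MAX_INODES);
    pm->on_truncate_list[ino] = true;
    pm->in_use[ino] = false;
}

/* Last step of eviction; stands in for pmfs_truncate_del. */
static void model_truncate_del(struct pm_image *pm, int ino)
{
    assert(ino >= 0 && ino < MAX_INODES);
    pm->on_truncate_list[ino] = false;
}

/* Simplified recovery: every inode on the truncate list gets size 0. */
static void model_recover(struct pm_image *pm)
{
    for (int ino = 0; ino < MAX_INODES; ino++) {
        if (pm->on_truncate_list[ino]) {
            pm->size[ino] = 0;
            pm->on_truncate_list[ino] = false;
        }
    }
}

/* Crash between inode reuse and truncate-list removal.
 * Returns the size of the new directory after recovery. */
static unsigned long model_crash_in_window(void)
{
    struct pm_image pm = {0};
    int foo = model_alloc_inode(&pm, 0);    /* touch foo */
    model_evict_free(&pm, foo);             /* rm foo, stalled before truncate_del */
    int dir = model_alloc_inode(&pm, 4096); /* mkdir dir reuses the number */
    assert(dir == foo);
    struct pm_image crashed = pm;           /* snapshot taken inside the window */
    model_truncate_del(&pm, foo);           /* too late for the snapshot */
    model_recover(&crashed);
    return crashed.size[dir];
}

/* Same sequence, but the snapshot is taken after truncate-list removal. */
static unsigned long model_crash_after_truncate_del(void)
{
    struct pm_image pm = {0};
    int foo = model_alloc_inode(&pm, 0);
    model_evict_free(&pm, foo);
    int dir = model_alloc_inode(&pm, 4096);
    model_truncate_del(&pm, foo);
    struct pm_image crashed = pm;
    model_recover(&crashed);
    return crashed.size[dir];
}
```

In this model, `model_crash_in_window()` returns 0 (the new directory's size is wiped by recovery), while `model_crash_after_truncate_del()` returns 4096. The only difference between the two is whether the crash image is captured before or after the truncate list entry is removed.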

Reproduce

First, modify the source code at https://github.com/utsaslab/WineFS/blob/b4017d0fa5fd2b526e870b0338c311829e5f4464/Linux-5.1/fs/winefs/inode.c#L1672-L1679 as shown below.

if (destroy == 0) {
    pmfs_dbg_verbose("%s: destroying %lu\n", __func__, inode->i_ino);
    pmfs_free_dram_resource(sb, sih);
}
/* added for reproduction: widen the window between freeing the inode and pmfs_truncate_del */
pmfs_dbg("start sleep 50 seconds\n");
msleep(50000);
pmfs_dbg("end sleep 50 seconds\n");
/* now it is safe to remove the inode from the truncate list */
pmfs_truncate_del(inode);

Run the commands below, following the comments.

# mount fs
insmod winefs.ko
mount -t winefs -o init,dbgmask=255 /dev/pmem0 /mnt/pmem0

touch /mnt/pmem0/foo # terminal 1
rm /mnt/pmem0/foo # terminal 1, this command will take 50 seconds due to the modification of the source code.

mkdir /mnt/pmem0/dir # execute in terminal 2 during rm
cat /dev/pmem0 > img.1 # terminal 2, save the PM image to simulate a crash (should be done before rm foo is done)

# wait until all commands are done

# syslog shows that foo (touch foo) got inode number 33.
# During pmfs_evict_inode (rm foo), inode number 33 is freed.
# Then dir (mkdir dir) gets inode number 33.
# However, when 33 is allocated to dir, `pmfs_truncate_del` has not been invoked yet, so 33 is still in the truncate list.
# Therefore, in the saved image: (a) inode 33 is still in the truncate list (in PM); (b) inode 33 is allocated to 'dir'.
dmesg 

# umount fs 
umount /mnt/pmem0
rmmod winefs
insmod winefs.ko

# restore the saved image
dd if=img.1 of=/dev/pmem0 bs=1048576 count=128 # size according to the dev size
mount -t winefs -o dbgmask=255 /dev/pmem0 /mnt/pmem0

# The command below shows nothing in the dir directory.
# This is because the size of dir was reset to 0 during recovery of the truncate list, which still contains inode 33.
ls -a /mnt/pmem0/dir

Fix

I do not have a good fix in mind so far.