NVSL / linux-nova

NOVA is a log-structured file system designed for byte-addressable non-volatile memories, developed at the University of California, San Diego.
http://nvsl.ucsd.edu/index.php?path=projects/nova

Inconsistent i_blocks before and after unmount after ftruncate #151

Open iaoing opened 6 months ago

iaoing commented 6 months ago

Issue

When a file is fallocated in some situations (e.g., fallocating to a larger size, or fallocating after creating a snapshot), NOVA allocates new data blocks and increments i_blocks, so i_blocks becomes larger than the block count implied by the file size. After a umount and remount, a stat of the file makes NOVA scan the log and rebuild the inode. Once the rebuild finishes, i_blocks is the count implied by the file size, which differs from the value before the umount.

The truncate function does not have this issue. The reason is given in the Reason section.

Reproduce

The case with snapshots.

insmod nova.ko metadata_csum=1 data_csum=1 data_parity=1 dram_struct_csum=1
mount -t NOVA -o init,dbgmask=255 /dev/pmem0 /mnt/pmem0
touch /mnt/pmem0/foo
echo 1 > /mnt/pmem0/foo
# the stat shows `i_blocks` is 8
stat /mnt/pmem0/foo
# create a snapshot
echo 1 > /proc/fs/NOVA/pmem0/create_snapshot
# fallocate the file with the keep-size option
fallocate -n -o 0 -l 1024 /mnt/pmem0/foo
# the stat shows `i_blocks` is 16
stat /mnt/pmem0/foo
# umount and remount
umount /mnt/pmem0
mount -t NOVA -o dbgmask=255 /dev/pmem0 /mnt/pmem0
# the stat shows `i_blocks` is 8
stat /mnt/pmem0/foo

The case without snapshots.

insmod nova.ko metadata_csum=1 data_csum=1 data_parity=1 dram_struct_csum=1
mount -t NOVA -o init,dbgmask=255 /dev/pmem0 /mnt/pmem0
touch /mnt/pmem0/foo
dd if=/dev/random of=/mnt/pmem0/foo bs=4096 count=1
# the stat shows `i_blocks` is 8
stat /mnt/pmem0/foo
# fallocate the file to 8192 with the keep-size option
fallocate -n -o 4096 -l 4096 /mnt/pmem0/foo
# the stat shows `i_blocks` is 16
stat /mnt/pmem0/foo
# umount and remount
umount /mnt/pmem0
mount -t NOVA -o dbgmask=255 /dev/pmem0 /mnt/pmem0
# the stat shows `i_blocks` is 8
stat /mnt/pmem0/foo
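
The before/after comparison in both cases can also be scripted instead of being read off the `stat` output by hand. The following is a minimal Python sketch, not part of NOVA: the names `BlockReport`, `read_counts`, and `changed_across_remount` are hypothetical, and the umount/remount step itself still has to be done separately as shown above.

```python
import os
from typing import NamedTuple


class BlockReport(NamedTuple):
    size: int    # st_size in bytes
    blocks: int  # st_blocks, the value `stat` prints as "Blocks"


def read_counts(path: str) -> BlockReport:
    """Read the file size and block count that stat(2) reports for `path`."""
    st = os.stat(path)
    return BlockReport(st.st_size, st.st_blocks)


def changed_across_remount(before: BlockReport, after: BlockReport) -> bool:
    """True if the block count changed although the file size did not."""
    return before.size == after.size and before.blocks != after.blocks
```

Calling `read_counts("/mnt/pmem0/foo")` once before the umount and once after the remount, then passing both results to `changed_across_remount`, would flag the behavior reported here. With the numbers from the second case, `BlockReport(4096, 16)` before and `BlockReport(4096, 8)` after, the check returns `True`.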

Reason

After an inode is rebuilt, i_blocks is set to the value corresponding to the file size, as the code snippet below shows: https://github.com/NVSL/linux-nova/blob/976a4d1f3d5282863b23aa834e02012167be6ee2/fs/nova/rebuild.c#L498-L501
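
As a worked example of a size-derived count, the sketch below rounds a file size up to whole data blocks and converts the result to the 512-byte units that `stat` reports on Linux. It only illustrates the arithmetic and is not the code in rebuild.c; the 4096-byte block size and the function name `blocks_from_size` are assumptions chosen to match the values 8 and 16 seen in the reproduction steps.

```python
BLOCK_SIZE = 4096  # assumed data block size in bytes
STAT_UNIT = 512    # `stat` reports blocks in 512-byte units on Linux


def blocks_from_size(size_bytes: int) -> int:
    """Block count (in 512-byte units) implied by the file size alone."""
    data_blocks = -(-size_bytes // BLOCK_SIZE)  # ceiling division
    return data_blocks * (BLOCK_SIZE // STAT_UNIT)


print(blocks_from_size(2))     # 2-byte file written by `echo 1`: prints 8
print(blocks_from_size(4096))  # one full block: prints 8
print(blocks_from_size(8192))  # two full blocks: prints 16
```

Under these assumptions, a file whose size stays at 4096 bytes after a keep-size fallocate has a size-derived count of 8, even though two data blocks (16 units) are physically allocated.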

As the code snippet below shows, when a file is fallocated in some situations (e.g., the size increases, or the epoch differs), NOVA allocates a new data block (Line 266), increments the total block count (Line 300), and sets the new i_blocks of sih and inode (Line 307). Next, NOVA updates the file tree (Line 322). https://github.com/NVSL/linux-nova/blob/976a4d1f3d5282863b23aa834e02012167be6ee2/fs/nova/file.c#L266-L326

If the fallocate operation is executed after a snapshot, the execution path for updating the file tree is: nova_reassign_file_tree -> nova_assign_write_entry -> nova_free_old_entry. In nova_free_old_entry, as the code snippet below shows, NOVA first invokes nova_append_data_to_snapshot (Line 156) and then nova_invalidate_write_entry (Line 159). Since the data block is snapshotted, NOVA will not free it. At Line 171, NOVA decrements sih->i_blocks. However, inode->i_blocks and sih->i_blocks have already been updated before the file tree update. Either this update is incorrect or the previous update is incorrect. https://github.com/NVSL/linux-nova/blob/976a4d1f3d5282863b23aa834e02012167be6ee2/fs/nova/log.c#L155-L171

When truncating a file, as the code below shows, NOVA first updates the file tree, then updates inode->i_blocks and sih->i_blocks. Thus, the number consistently corresponds to the file size, regardless of snapshots. https://github.com/NVSL/linux-nova/blob/976a4d1f3d5282863b23aa834e02012167be6ee2/fs/nova/inode.c#L420-L426
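
The two orderings described above can be contrasted in a toy model. This is only a sketch of my reading of the code paths in this issue, not NOVA's implementation: `ToyInode` and all function names are hypothetical, and both counters are kept in `stat` units (8 per 4096-byte block) purely for simplicity, since the real fields may use different units.

```python
from dataclasses import dataclass

UNITS = 8  # assumed: one 4096-byte block equals 8 units of 512 bytes


@dataclass
class ToyInode:
    size_units: int    # block count implied by the file size
    sih_blocks: int    # models sih->i_blocks
    inode_blocks: int  # models inode->i_blocks


def fallocate_after_snapshot(ino: ToyInode) -> None:
    """Counters are set first; the file tree update comes afterwards."""
    ino.sih_blocks += UNITS            # new data block for the new epoch
    ino.inode_blocks = ino.sih_blocks  # inode count set before the tree update
    ino.sih_blocks -= UNITS            # tree update invalidates the old entry


def truncate_one_block(ino: ToyInode) -> None:
    """The file tree is updated first; the counters are set afterwards."""
    ino.size_units -= UNITS
    ino.sih_blocks -= UNITS            # tree update drops the truncated block
    ino.inode_blocks = ino.sih_blocks  # inode count set after the tree update


def rebuild(ino: ToyInode) -> None:
    """After a remount, both counters are derived from the file size."""
    ino.sih_blocks = ino.size_units
    ino.inode_blocks = ino.size_units


f = ToyInode(size_units=8, sih_blocks=8, inode_blocks=8)
fallocate_after_snapshot(f)
print(f.sih_blocks, f.inode_blocks)  # 8 16: the two counters disagree
rebuild(f)
print(f.sih_blocks, f.inode_blocks)  # 8 8: the value changes across a remount

t = ToyInode(size_units=16, sih_blocks=16, inode_blocks=16)
truncate_one_block(t)
print(t.sih_blocks, t.inode_blocks)  # 8 8: already matches the file size
```

In this model the fallocate ordering reproduces the 16-then-8 sequence from the snapshot case, while the truncate ordering leaves nothing for the rebuild to change.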

Fix

I do not know which i_blocks value is expected after fallocate: the number of physically allocated blocks, or the number implied by the file size.

To match the behavior of the truncate function, updating inode->i_blocks and sih->i_blocks after the file tree update should be correct.

However, if a file is fallocated to a larger size, NOVA does not free blocks in the nova_reassign_file_tree function, so i_blocks then matches the number of physically allocated blocks. In recovery, i_blocks is set to the file-size-derived number again. Thus, NOVA should settle on a consistent definition of i_blocks before patching.