Open pcworld opened 2 years ago
Can you provide the repro steps? Thanks.
Use the following shell script to reproduce. It creates a symlink, dumps the pmem image and then zeroes a single byte of the symlink target path.
set -ex
mount -tNOVA -oinit /dev/pmem0 /mnt
echo -n test > /mnt/myfile
ln -s /mnt/myfile /mnt/symlink
ls -l /mnt
sync
cat /dev/pmem0 > bak
umount /mnt
cp bak corrupted
index=`strings -t d bak | grep /mnt/myfile | cut -f1,1 -d\ ` && echo "$index"
dd if=/dev/zero of=corrupted bs=1 seek=$((index+10)) count=1 conv=notrunc
cat corrupted > /dev/pmem0
mount -tNOVA -oro /dev/pmem0 /mnt
ls -l /mnt
Output:
+ mount -tNOVA -oinit /dev/pmem0 /mnt
nova: 1 cpus online
nova: nova_get_nvmm_info: dev pmem0, phys_addr 0x8000000, virt_addr 0xffff888008000000, size 10485760
nova: measure timing 0, metadata checksum 1, wprotect 0, data checksum 1, data parity 1, DRAM checksum 0
nova: Start NOVA snapshot cleaner thread.
nova: creating an empty nova of size 10485760
nova: NOVA initialization finish
nova: Current epoch id: 0
nova: Running snapshot cleaner thread
+ echo -n test
+ ln -s /mnt/myfile /mnt/symlink
+ ls -l /mnt
total 4
-rw-r--r-- 1 0 0 4 Dec 28 17:56 myfile
lrwxrwxrwx 1 0 0 11 Dec 28 17:56 symlink -> /mnt/myfile
+ sync
+ cat /dev/pmem0
+ umount /mnt
nova: Current epoch id: 0
nova: nova_save_inode_list_to_log: 1 inode nodes, pi head 0x376000, tail 0x376010
nova: nova_save_blocknode_mappings_to_log: 1 blocknodes, 1 log pages, pi head 0x377000, tail 0x377010
+ cp bak corrupted
+ strings -t d bak
+ grep /mnt/myfile
+ cut -f1,1 -d
+ index=8286208
+ echo 8286208
8286208
+ dd if=/dev/zero of=corrupted bs=1 seek=8286218 count=1 conv=notrunc
1+0 records in
1+0 records out
1 bytes (1B) copied, 0.000846 seconds, 1.2KB/s
+ cat corrupted
+ mount -tNOVA -oro /dev/pmem0 /mnt
nova: 1 cpus online
nova: nova_get_nvmm_info: dev pmem0, phys_addr 0x8000000, virt_addr 0xffff888008000000, size 10485760
nova: measure timing 0, metadata checksum 1, wprotect 0, data checksum 1, data parity 1, DRAM checksum 0
nova: Start NOVA snapshot cleaner thread.
nova: NOVA: Failure recovery
nova: Recovered 0 snapshots, latest epoch ID 0
nova: Running snapshot cleaner thread
nova: Failure recovery total recovered 3
nova: Current epoch id: 0
+ ls -l /mnt
total 9
-rw-r--r-- 1 0 0 4 Dec 28 17:56 myfile
lrwxrwxrwx 1 0 0 12 Dec 28 17:56 symlink -> /mnt/myfil
At the end, "symlink" points to /mnt/myfil (rather than /mnt/myfile), but NOVA does not detect any checksum errors (and does not try to repair anything).
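The corruption step can be sanity-checked without a pmem device. The following is a minimal sketch, assuming GNU grep (for -abo), dd and cmp are available; the scratch file layout and the variable names (work, index, diff_lines, target) are made up for illustration and are not part of NOVA. It zeroes the same byte (offset index+10) in a fake image and confirms that exactly one byte changed and that a NUL-terminated read of the path now stops at /mnt/myfil.

```shell
set -e
# Scratch directory standing in for the pmem image; nothing here touches /dev/pmem0.
work=$(mktemp -d)
# Fake image (hypothetical layout): 4096 zero bytes, the target path, 4096 zero bytes.
{ head -c 4096 /dev/zero; printf '/mnt/myfile'; head -c 4096 /dev/zero; } > "$work/bak"
cp "$work/bak" "$work/corrupted"
# Byte offset of the path inside the image (GNU grep prints "offset:match").
index=$(grep -abo '/mnt/myfile' "$work/bak" | head -n 1 | cut -d: -f1)
# Zero the 11th byte of the path (the trailing "e"), as in the repro script.
dd if=/dev/zero of="$work/corrupted" bs=1 seek=$((index + 10)) count=1 conv=notrunc 2>/dev/null
# cmp -l lists every differing byte: 1-based offset, then old and new value in octal.
# cmp exits non-zero when the files differ, hence the "|| true" under set -e.
diff_lines=$(cmp -l "$work/bak" "$work/corrupted" || true)
diff_count=$(printf '%s\n' "$diff_lines" | grep -c .)
# Read the path back as a NUL-terminated string, the way a C string would be read.
target=$(dd if="$work/corrupted" bs=1 skip="$index" count=11 2>/dev/null | tr '\0' '\n' | head -n 1)
echo "differing bytes: $diff_count"
echo "target now reads: $target"
rm -rf "$work"
```

On the real image, the same cmp -l call on bak and corrupted should likewise report a single differing byte.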
Thanks for the repro steps. It looks like a symlink has protection for its write entry, but no protection for its data (the symlink target name). So data csum/parity does not work for symlinks.
When creating a symlink to target /mnt/myfile, multiple crash states can be observed after the symlink operation has completed: the targets /mnt/myfile, /mnt/myfil, /mnt/myfi and /mnt/myf are possible. This originates from the same bug as #105 (as nova_block_symlink uses memcpy_to_pmem_nocache).
What is interesting, however, is that these failure states are observed even with data parity turned on (nova.metadata_csum=1 nova.data_csum=1 nova.data_parity=1). The crash states in #105 (where a normal file rather than a symlink is written) do not occur with these protection features, as NOVA is able to detect and recover the unpersisted bytes. However, it appears that symlink targets are not protected by the checksum or parity features, so when recovering from such a crash, NOVA neither detects nor repairs the corrupted symlink targets.
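To illustrate why a checksum over the target string would be enough to flag these crash states, here is a small sketch. It uses the POSIX cksum utility purely as a stand-in; this is an assumption for illustration and is not the checksum NOVA computes, nor how NOVA stores symlink targets. The variable names (expected, mismatches) are hypothetical.

```shell
# Checksum of the intended target, computed when the symlink is created.
expected=$(printf '/mnt/myfile' | cksum)
mismatches=0
# The four crash states reported above: the full target and three truncated ones.
for target in /mnt/myfile /mnt/myfil /mnt/myfi /mnt/myf; do
    actual=$(printf '%s' "$target" | cksum)
    if [ "$actual" = "$expected" ]; then
        echo "$target: checksum ok"
    else
        echo "$target: checksum mismatch"
        mismatches=$((mismatches + 1))
    fi
done
echo "mismatching crash states: $mismatches"
```

Only the fully persisted target matches; each truncated target would be flagged, which is the detection step that appears to be missing for symlink data.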