Open pcworld opened 2 years ago
Not sure about the second case, but for the first case, if entry->updating
is set, during recovery in nova_rebuild_handle_write_entry()
we will reset the csum parity. Is there a way to reproduce it?
The following patch on top of 593f927a78a6900d7cfec58199fb0a4a4fd1d646 can be used to reproduce the first case (should also work on master):
diff --git a/fs/nova/log.c b/fs/nova/log.c
index 750ff47d9124..8c5dd218b715 100644
--- a/fs/nova/log.c
+++ b/fs/nova/log.c
@@ -869,7 +869,6 @@ int nova_set_write_entry_updating(struct super_block *sb,
unsigned long irq_flags = 0;
nova_memunlock_range(sb, entry, sizeof(*entry), &irq_flags);
entry->updating = set ? 1 : 0;
- nova_update_entry_csum(entry);
nova_update_alter_entry(sb, entry);
nova_memlock_range(sb, entry, sizeof(*entry), &irq_flags);
So it can be reproduced a bit simpler than I originally wrote.
With this patch applied, the following shell commands reproduce the error:
/ # set -ex
/ # mount -tNOVA -oinit /dev/pmem0 /mnt && echo -n test > /mnt/myfile && sync &&
echo -n testtesttesttesttesttest >> /mnt/myfile && echo success
+ mount -tNOVA -oinit /dev/pmem0 /mnt
nova: 1 cpus online
nova: nova_get_nvmm_info: dev pmem0, phys_addr 0x8000000, virt_addr 0xffff888008000000, size 10485760
nova: measure timing 0, metadata checksum 1, wprotect 0, data checksum 1, data parity 1, DRAM checksum 0
nova: Start NOVA snapshot cleaner thread.
nova: creating an empty nova of size 10485760
nova: NOVA initialization finish
nova: Current epoch id: 0
nova: Running snapshot cleaner thread
+ echo -n test
+ sync
+ echo -n testtesttesttesttesttest
nova error:
nova_verify_entry_csum: both entry and its replica fail checksum verification
nova error:
nova_verify_entry_csum: unable to repair entry errors
nova error:
nova_verify_entry_csum: both entry and its replica fail checksum verification
nova error:
nova_verify_entry_csum: unable to repair entry errors
nova: do_nova_inplace_file_write alloc blocks failed!, -22
/ # cat /mnt/myfile
+ cat /mnt/myfile
nova error:
nova_verify_entry_csum: both entry and its replica fail checksum verification
nova error:
nova_verify_entry_csum: unable to repair entry errors
cat: read error: Input/output error
This bug is probably fixed by PR #129. But maybe it still helps you find another issue if the recovery code isn't working as intended as you suggest (re "if entry->updating is set, during recovery in nova_rebuild_handle_write_entry() we will reset the csum parity").
PS, I intend to respond on the other issues I have reported where you have asked for clarification, might still take a while though.
The following further shows how the file can still be written to, and its contents can then be read, but after remounting the file system, the file becomes unreadable again (commands should be run after the shell commands from my previous comment):
/ # echo ho > /mnt/myfile
nova error:
nova_verify_entry_csum: both entry and its replica fail checksum verification
nova error:
nova_verify_entry_csum: unable to repair entry errors
/ # cat /mnt/myfile
ho
/ # umount /mnt
nova: Current epoch id: 0
nova: nova_save_inode_list_to_log: 1 inode nodes, pi head 0x375000, tail 0x375010
nova: nova_save_blocknode_mappings_to_log: 1 blocknodes, 1 log pages, pi head 0x376000, tail 0x376010
/ # mount -tNOVA /dev/pmem0 /mnt
nova: 1 cpus online
nova: nova_get_nvmm_info: dev pmem0, phys_addr 0x8000000, virt_addr 0xffff888008000000, size 10485760
nova: measure timing 0, metadata checksum 1, wprotect 0, data checksum 1, data parity 1, DRAM checksum 0
nova: Start NOVA snapshot cleaner thread.
nova: nova_init_inode_list_from_inode: 1 inode nodes
nova: Recovered 0 snapshots, latest epoch ID 0
nova: NOVA: Normal shutdown
nova: Current epoch id: 0
nova: Running snapshot cleaner thread
/ # cat /mnt/myfile
nova error:
nova_verify_entry_csum: both entry and its replica fail checksum verification
nova error:
nova_verify_entry_csum: unable to repair entry errors
I think this can be considered a bug in any case, even if the checksum persistence is fixed. If the file system is so broken that files cannot be read again after a remount, the corresponding write
calls should fail, or alternatively, recovery should be successful. This issue could not only result from a crash, but also from other kinds of data corruption.
This bug is probably fixed by PR #129.
I can not reproduce the I/O error anymore with PR #129. Leaving the issue open though for the reasons stated in my last two comments.
This report assumes that NOVA's protection features are enabled (
nova.metadata_csum=1 nova.data_csum=1 nova.data_parity=1
).For setup, I have created a small file:
echo -n test > /mnt/myfile && sync
In the following, I assume that a crash occurs during appending to the file (write
system call):echo -n testtesttesttesttesttest >> /mnt/myfile
I think there is a crash consistency bug that leads to the file becoming irrecoverably broken after recovery. The file will be empty and listed with size 0.
nova_verify_entry_csum
outputs the following error messages during mount and when listing(?) the directory: "both entry and its replica fail checksum verification" and "unable to repair entry errors". However, it appears that from user space perspective, the involved file operations do not return error codes, so an application reading the file assumes everything is in order and that the file would be empty. The file can still be written to, and its new contents are also readable, but after remounting the file system, it becomes empty again.I think one of the possibilites that leads to this state is the following crash scenario: When
data_csum || data_parity
is true,nova_set_write_entry_updating
is called indo_nova_inplace_file_write
. Innova_set_write_entry_updating
, ifmetadata_csum
is true, the following happens (excerpt): https://github.com/NVSL/linux-nova/blob/593f927a78a6900d7cfec58199fb0a4a4fd1d646/fs/nova/log.c#L871-L873 I consider the crash state whereentry->updating = 1
is persisted, but the checksum updated bynova_update_entry_csum(entry)
is not persisted (that function does not call a fence), and further, thealter_entry
updated bynova_update_alter_entry
is only partially persisted by crashing duringmemcpy_to_pmem_nocache
: https://github.com/NVSL/linux-nova/blob/593f927a78a6900d7cfec58199fb0a4a4fd1d646/fs/nova/checksum.c#L236 I assume that this leads to the crash state explained above.I think there's also a second crash that leads to the same bug, namely in
do_nova_inplace_file_write
(dax.c l. 744) ->nova_inplace_update_write_entry
(log.c l. 862) ->nova_inplace_update_log_entry
, where ifmetadata_csum
is set, the following is performed: https://github.com/NVSL/linux-nova/blob/593f927a78a6900d7cfec58199fb0a4a4fd1d646/fs/nova/log.c#L473-L480 When one followsnova_update_log_entry
tonova_update_write_entry
and there, the checksum update is not persisted, and again if there's a crash in the middle of copying thealter_entry
updated bynova_update_alter_entry
(this time called from thenova_inplace_update_log_entry
snippet above), I think this leads to the same or similar crash state as outlined above.Maybe making
nova_update_entry_csum
perform a fence at the end would fix these bugs, but I'm not sure if this helps in every case, and also I'm not sure if that would have serious performance implications. I will test if adding that fence will at least fix the crash paths that I have observed so far leading to this bug.I can imagine that my report is probably difficult to follow, I could try to post a repro in the future, or upload a crash image that exhibits this bug.