NVSL / linux-nova

NOVA is a log-structured file system designed for byte-addressable non-volatile memories, developed at the University of California, San Diego.
http://nvsl.ucsd.edu/index.php?path=projects/nova

XFSTests generic/003: failures with data protection enabled #34

Closed stevenjswanson closed 7 years ago

stevenjswanson commented 7 years ago
#    sudo modprobe nova measure_timing=0 \
     inplace_data_updates=0 \
     wprotect=0 mmap_cow=1 \
     unsafe_metadata=0 \
     replica_metadata=1 metadata_csum=1 dram_struct_csum=1 \
     data_csum=1 data_parity=1
[ 1667.446838] nova: nova_check_inode_integrity: inode 34 checksum error, trying to repair using the replica
[ 1667.446840] nova: nova_repair_inode: inode 34 error repaired
[ 1667.446842] nova: nova_get_entry_copy: unknown or unsupported entry type (0) for checksum, 0xffff8c4687bd4000
[ 1667.446844] CPU: 1 PID: 30790 Comm: rm Tainted: G    B      OE   4.10.0-nova #8
[ 1667.446845] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 1667.446845] Call Trace:
[ 1667.446850]  dump_stack+0x63/0x81
[ 1667.446856]  nova_get_entry_copy.isra.7+0x25d/0x330 [nova]
[ 1667.446859]  nova_verify_entry_csum+0x88/0x440 [nova]
[ 1667.446862]  ? nova_verify_entry_csum+0x297/0x440 [nova]
[ 1667.446863]  ? put_dec+0x18/0xa0
[ 1667.446864]  ? number+0x2ed/0x300
[ 1667.446868]  nova_rebuild_file_inode_tree+0xed/0x5e0 [nova]
[ 1667.446870]  ? printk+0x57/0x73
[ 1667.446874]  ? nova_repair_inode+0xbf/0xe1 [nova]
[ 1667.446877]  ? nova_check_inode_integrity+0x53d/0x620 [nova]
[ 1667.446880]  nova_rebuild_inode+0x15a/0x210 [nova]
[ 1667.446883]  nova_iget+0xa7/0x190 [nova]
[ 1667.446886]  nova_lookup+0xd6/0x1a0 [nova]
[ 1667.446887]  ? legitimize_path.isra.27+0x2e/0x60
[ 1667.446889]  lookup_slow+0xa5/0x160
[ 1667.446890]  walk_component+0x1bf/0x350
[ 1667.446891]  path_lookupat+0x4b/0x100
[ 1667.446892]  filename_lookup+0xb1/0x180
[ 1667.446894]  ? mem_cgroup_commit_charge+0x7e/0x510
[ 1667.446895]  ? __check_object_size+0x100/0x1d7
[ 1667.446897]  ? strncpy_from_user+0x4d/0x170
[ 1667.446898]  user_path_at_empty+0x36/0x40
[ 1667.446899]  vfs_fstatat+0x66/0xc0
[ 1667.446900]  SYSC_newfstatat+0x24/0x60
[ 1667.446901]  ? __do_page_fault+0x2ab/0x520
[ 1667.446902]  SyS_newfstatat+0xe/0x10
[ 1667.446904]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 1667.446905] RIP: 0033:0x7f3edeca43cb
[ 1667.446905] RSP: 002b:00007fff0cc22918 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[ 1667.446907] RAX: ffffffffffffffda RBX: 000055c2aaaba0e0 RCX: 00007f3edeca43cb
[ 1667.446907] RDX: 000055c2aaabb398 RSI: 000055c2aaabb428 RDI: ffffffffffffff9c
[ 1667.446908] RBP: 00007f3edef6db00 R08: 0000000000000100 R09: 0000000000000130
[ 1667.446908] R10: 0000000000000100 R11: 0000000000000246 R12: 00007f3edef6db58
[ 1667.446908] R13: 00007fff0cc22bf8 R14: 0000000000002710 R15: 0000000000001110
[ 1667.446911] nova: nova_repair_entry_pr: entry media error repaired
[ 1667.446952] nova error:
[ 1667.446953] File inode 34 log is NULL!
[ 1667.446973] ------------[ cut here ]------------
[ 1667.448047] kernel BUG at fs/nova/rebuild.c:426!
[ 1667.449102] invalid opcode: 0000 [#1] SMP
[ 1667.449995] Modules linked in: nova(OE) libcrc32c rfcomm coretemp crct10dif_pclmul vmw_balloon crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_rapl_perf nd_pmem snd_ens1371 gameport snd_ac97_codec ac97_bus dax_pmem dax nd_btt snd_pcm joydev input_leds serio_raw snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_timer videobuf2_core snd_seq_device videodev media snd soundcore bnep vmw_vsock_vmci_transport vsock vmw_vmci shpchp i2c_piix4 nfit mac_hid btusb btrtl btbcm btintel bluetooth parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid psmouse mptspi scsi_transport_spi mptscsih vmwgfx e1000 drm_kms_helper syscopyarea sysfillrect ahci sysimgblt fb_sys_fops libahci
[ 1667.463739]  ttm mptbase drm pata_acpi fjes [last unloaded: nova]
[ 1667.464871] CPU: 1 PID: 30790 Comm: rm Tainted: G    B      OE   4.10.0-nova #8
[ 1667.466222] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 1667.468194] task: ffff8c46dcb9ad00 task.stack: ffffba895151c000
[ 1667.469299] RIP: 0010:nova_rebuild_file_inode_tree+0x5a0/0x5e0 [nova]
[ 1667.470493] RSP: 0018:ffffba895151f8c0 EFLAGS: 00010286
[ 1667.471466] RAX: 0000000000000000 RBX: ffff8c46f4496100 RCX: 0000000000000006
[ 1667.472782] RDX: 0000000000000007 RSI: 0000000000000247 RDI: ffff8c46f724dc80
[ 1667.474116] RBP: ffffba895151faa8 R08: 0000000000000001 R09: 000000000060e942
[ 1667.475472] R10: 0000000000000004 R11: 0000000000000000 R12: ffff8c46e6b04800
[ 1667.476797] R13: ffff8c4687bd4400 R14: ffffba895151f948 R15: 0000000000000000
[ 1667.478601] FS:  00007f3edf179700(0000) GS:ffff8c46f7240000(0000) knlGS:0000000000000000
[ 1667.480122] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1667.481196] CR2: 000055c2aaabb1f8 CR3: 000000023e66f000 CR4: 00000000003406e0
[ 1667.482586] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1667.483992] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1667.485395] Call Trace:
[ 1667.485908]  ? nova_repair_inode+0xbf/0xe1 [nova]
[ 1667.486837]  ? nova_check_inode_integrity+0x53d/0x620 [nova]
[ 1667.487924]  nova_rebuild_inode+0x15a/0x210 [nova]
[ 1667.488827]  nova_iget+0xa7/0x190 [nova]
[ 1667.489573]  nova_lookup+0xd6/0x1a0 [nova]
[ 1667.490348]  ? legitimize_path.isra.27+0x2e/0x60
[ 1667.491216]  lookup_slow+0xa5/0x160
[ 1667.491880]  walk_component+0x1bf/0x350
[ 1667.492607]  path_lookupat+0x4b/0x100
[ 1667.493304]  filename_lookup+0xb1/0x180
[ 1667.494031]  ? mem_cgroup_commit_charge+0x7e/0x510
[ 1667.494932]  ? __check_object_size+0x100/0x1d7
[ 1667.495770]  ? strncpy_from_user+0x4d/0x170
[ 1667.496582]  user_path_at_empty+0x36/0x40
[ 1667.497342]  vfs_fstatat+0x66/0xc0
[ 1667.497992]  SYSC_newfstatat+0x24/0x60
[ 1667.498704]  ? __do_page_fault+0x2ab/0x520
[ 1667.499478]  SyS_newfstatat+0xe/0x10
[ 1667.500158]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 1667.501029] RIP: 0033:0x7f3edeca43cb
[ 1667.501712] RSP: 002b:00007fff0cc22918 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[ 1667.503117] RAX: ffffffffffffffda RBX: 000055c2aaaba0e0 RCX: 00007f3edeca43cb
[ 1667.504446] RDX: 000055c2aaabb398 RSI: 000055c2aaabb428 RDI: ffffffffffffff9c
[ 1667.505770] RBP: 00007f3edef6db00 R08: 0000000000000100 R09: 0000000000000130
[ 1667.507094] R10: 0000000000000100 R11: 0000000000000246 R12: 00007f3edef6db58
[ 1667.508421] R13: 00007fff0cc22bf8 R14: 0000000000002710 R15: 0000000000001110
[ 1667.509749] Code: 40 fe ff ff e8 62 1b 48 d3 48 8b 95 40 fe ff ff e9 e8 fa ff ff 48 8b 95 20 fe ff ff 48 c7 c6 c0 00 89 c0 4c 89 e7 e8 c0 5a 00 00 <0f> 0b e8 29 f3 ff ff 48 8b b5 20 fe ff ff 48 c7 c7 50 cf 88 c0
[ 1667.513215] RIP: nova_rebuild_file_inode_tree+0x5a0/0x5e0 [nova] RSP: ffffba895151f8c0
[ 1667.514726] ---[ end trace 78976917f1fdd30d ]---
luzh commented 7 years ago

Was this produced with the linux-nova::devel and xfstests::hacked-for-gce branches? I couldn't reproduce it; generic/003 passes for me:

generic/003 9s ... 10s
Ran: generic/003
Passed all 1 test

stevenjswanson commented 7 years ago

You need to run NOVA/001 and then generic/003.

Try this:

# git clone git@github.com:NVSL/nova-testscripts.git
# cd nova-testscripts/nova-ci
# ./run_tests.sh xfstests NOVA/001 generic/003

The results should end up in results/latest, including the output of dmesg for the run. There's a directory for each run, named by time. If you don't find it in latest, look in one of the recent ones.

Let me know if it doesn't work.

luzh commented 7 years ago

Reproduced and fixed by https://github.com/NVSL/linux-nova/commit/79c48e403f114bbbffd845f1aa98f3ecb1962297

The cause was that the inode's checksum was not updated in some GC functions. I also added iput(target_inode) to nova_seq_gc() to avoid the "VFS: Busy inodes after unmount of pmem1..." message, but I'm not sure whether that is really appropriate. Please review.
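
For readers unfamiliar with this class of bug, here is a minimal sketch of how a skipped checksum update can lead to a "repair" that restores stale state. It is a toy model, not NOVA's code: the `Inode` class, its single `log_head` field, and the helpers `compute_csum`, `update_log_head`, and `check_and_repair` are all hypothetical, and `zlib.crc32` stands in for whatever checksum the file system really uses. It shows one plausible reading of the dmesg output above (checksum error, repair from replica, then a NULL log), not a confirmed trace of the actual code path.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Inode:
    """Toy stand-in for a checksummed inode; not NOVA's real layout."""
    log_head: int = 0   # 0 plays the role of a NULL log pointer
    csum: int = 0


def compute_csum(inode: Inode) -> int:
    # CRC32 over the one protected field (assumption: the real
    # checksum covers the whole inode and uses a different function).
    return zlib.crc32(inode.log_head.to_bytes(8, "little"))


def make_pair() -> tuple[Inode, Inode]:
    """Create a primary inode and its replica in a consistent state."""
    primary, replica = Inode(), Inode()
    primary.csum = replica.csum = compute_csum(primary)
    return primary, replica


def update_log_head(primary: Inode, replica: Inode,
                    new_head: int, refresh_csum: bool) -> None:
    """Simulate a GC-style update that moves the log head."""
    primary.log_head = new_head
    if refresh_csum:
        # Correct path: checksum and replica follow the update.
        primary.csum = compute_csum(primary)
        replica.log_head = new_head
        replica.csum = primary.csum
    # Buggy path: the field changed but csum and replica are stale.


def check_and_repair(primary: Inode, replica: Inode) -> bool:
    """Return True if the primary was overwritten from the replica."""
    if primary.csum == compute_csum(primary):
        return False
    primary.log_head = replica.log_head
    primary.csum = replica.csum
    return True


if __name__ == "__main__":
    p, r = make_pair()
    update_log_head(p, r, 0x1000, refresh_csum=False)
    repaired = check_and_repair(p, r)
    print(f"repaired={repaired} log_head={p.log_head:#x}")
```

In the buggy path the integrity check fails on the next lookup, the primary is overwritten with the stale replica, and the log head falls back to 0. With `refresh_csum=True` the check passes and the new log head survives.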