Closed skypodolsky closed 1 year ago
During elio-test.sh, the driver sometimes fails on umount with the following kernel panic:
[ 61.655725] BUG: unable to handle page fault for address: 0000000000001000
[ 61.655753] #PF: supervisor read access in kernel mode
[ 61.655765] #PF: error_code(0x0000) - not-present page
[ 61.655778] PGD 0 P4D 0
[ 61.655793] Oops: 0000 [#1] SMP NOPTI
[ 61.655805] CPU: 3 PID: 2147 Comm: umount Tainted: G OE 5.4.0-110-generic #124-Ubuntu
[ 61.655826] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 61.655850] RIP: 0010:file_write_block.cold+0x29/0x122 [elastio_snap]
[ 61.655866] Code: ff 48 8b 55 d0 4c 89 fe 48 c7 c7 f0 ad aa c0 e8 c4 27 fe f6 49 8b 4f 68 48 85 c9 74 67 83 3d 38 79 00 00 00 0f 84 cf b1 ff ff <48> 8b 31 48 c7 c7 b5 d0 aa c0 e8 9f 27 fe f6 49 8b 4f 68 e9 b7 b1
[ 61.655903] RSP: 0018:ffff9def80b8bdd0 EFLAGS: 00010202
[ 61.655919] RAX: 0000000000000053 RBX: 0000000000001000 RCX: 0000000000001000
[ 61.655934] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff91866fadc8c0
[ 61.655949] RBP: ffff9def80b8be18 R08: 0000000000000d51 R09: 0000000000000004
[ 61.655964] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 61.655980] R13: 0000000000000008 R14: 0000000000000000 R15: ffff918659091400
[ 61.655997] FS: 00007f0c1815f840(0000) GS:ffff91866fac0000(0000) knlGS:0000000000000000
[ 61.656015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 61.656028] CR2: 0000000000001000 CR3: 000000042b528000 CR4: 0000000000340ee0
[ 61.656045] Call Trace:
[ 61.656065] ? vprintk_func+0x4c/0xc0
[ 61.656078] __cow_sync_and_free_sections+0x7b/0xe0 [elastio_snap]
[ 61.656093] __tracer_destroy_cow+0xbf/0x1d0 [elastio_snap]
[ 61.656107] handle_bdev_mount_event+0x202/0x2b0 [elastio_snap]
[ 61.656123] umount_hook+0x93/0x110 [elastio_snap]
[ 61.656140] do_syscall_64+0x57/0x190
[ 61.656161] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 61.656173] RIP: 0033:0x7f0c183be16b
[ 61.656184] Code: cd 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f5 cc 0c 00 f7 d8 64 89 01 48
[ 61.656220] RSP: 002b:00007ffd6f0b6b68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 61.656237] RAX: ffffffffffffffda RBX: 00007f0c184f0204 RCX: 00007f0c183be16b
[ 61.656252] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000056010dc8ac40
[ 61.656268] RBP: 000056010dc8aa30 R08: 0000000000000000 R09: 00007ffd6f0b5910
[ 61.656292] R10: 00007f0c184dc379 R11: 0000000000000246 R12: 000056010dc8ac40
[ 61.656308] R13: 0000000000000000 R14: 000056010dc8ab28 R15: 0000000000000000
[ 61.656324] Modules linked in: elastio_snap(OE) xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge stp llc aufs overlay binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds kvm_amd ccp mac_hid serio_raw kvm qemu_fw_cfg sch_fq_codel msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel virtio_net net_failover failover cirrus drm_kms_helper aesni_intel syscopyarea sysfillrect crypto_simd sysimgblt fb_sys_fops drm cryptd glue_helper psmouse virtio_blk pata_acpi i2c_piix4 floppy [last unloaded: elastio_snap]
[ 61.656493] CR2: 0000000000001000
[ 61.656506] ---[ end trace d1df488f6a4d14fe ]---
[ 61.657073] RIP: 0010:file_write_block.cold+0x29/0x122 [elastio_snap]
[ 61.657629] Code: ff 48 8b 55 d0 4c 89 fe 48 c7 c7 f0 ad aa c0 e8 c4 27 fe f6 49 8b 4f 68 48 85 c9 74 67 83 3d 38 79 00 00 00 0f 84 cf b1 ff ff <48> 8b 31 48 c7 c7 b5 d0 aa c0 e8 9f 27 fe f6 49 8b 4f 68 e9 b7 b1
[ 61.658772] RSP: 0018:ffff9def80b8bdd0 EFLAGS: 00010202
[ 61.659335] RAX: 0000000000000053 RBX: 0000000000001000 RCX: 0000000000001000
[ 61.659906] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff91866fadc8c0
[ 61.660497] RBP: ffff9def80b8be18 R08: 0000000000000d51 R09: 0000000000000004
[ 61.660984] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 61.661272] R13: 0000000000000008 R14: 0000000000000000 R15: ffff918659091400
[ 61.661556] FS: 00007f0c1815f840(0000) GS:ffff91866fac0000(0000) knlGS:0000000000000000
[ 61.661840] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 61.662119] CR2: 0000000000001000 CR3: 000000042b528000 CR4: 0000000000340ee0
[ 61.668460] elastio-snap: detected block device umount: /tmp/elastio-snap_010 : 0
[ 61.669177] elastio-snap: block device umount detected for device 10
Although this happens on both ext4 and xfs (other filesystems were not verified), it seems to be easier to reproduce on xfs. The sd_cow pointer is probably corrupted when switching to incremental mode.
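As a rough triage aid for the corrupted-pointer hypothesis, the register dump from the oops can be checked mechanically. The sketch below is not part of the driver or its test suite; the helper names (parse_registers, registers_matching_fault, is_kernel_address) are made up for illustration, and the excerpt is copied from the first register dump in the log. It reports which general-purpose registers hold the faulting address from CR2, and whether that address falls in the x86-64 kernel half of the address space.

```python
import re

# Excerpt of the first register dump from the oops (values copied verbatim).
OOPS_EXCERPT = """
[ 61.655919] RAX: 0000000000000053 RBX: 0000000000001000 RCX: 0000000000001000
[ 61.655934] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff91866fadc8c0
[ 61.655949] RBP: ffff9def80b8be18 R08: 0000000000000d51 R09: 0000000000000004
[ 61.655964] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 61.655980] R13: 0000000000000008 R14: 0000000000000000 R15: ffff918659091400
[ 61.656028] CR2: 0000000000001000 CR3: 000000042b528000 CR4: 0000000000340ee0
"""

# Matches "NAME: <16 hex digits>" pairs such as "RCX: 0000000000001000".
REG_RE = re.compile(r"\b([A-Z][A-Z0-9]{2}): ([0-9a-f]{16})\b")


def parse_registers(text):
    """Return a dict mapping register name to its integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def registers_matching_fault(text):
    """List general-purpose registers whose value equals the fault address (CR2)."""
    regs = parse_registers(text)
    fault = regs["CR2"]
    return sorted(
        name for name, value in regs.items()
        if value == fault and not name.startswith("CR")
    )


def is_kernel_address(addr):
    """Hypothetical sanity check: x86-64 kernel addresses sit in the upper canonical half."""
    return addr >= 0xFFFF800000000000


if __name__ == "__main__":
    regs = parse_registers(OOPS_EXCERPT)
    print("fault address:", hex(regs["CR2"]))
    print("registers holding it:", registers_matching_fault(OOPS_EXCERPT))
    print("looks like a kernel pointer:", is_kernel_address(regs["CR2"]))
```

For this trace the matching registers are RBX and RCX. The Code: line is consistent with that: the marked bytes 48 8b 31 decode to mov (%rcx),%rsi, and the preceding 49 8b 4f 68 48 85 c9 decode to mov 0x68(%r15),%rcx followed by test %rcx,%rcx. So the value 0x1000 appears to have been loaded from offset 0x68 of the object in R15 and to have passed a NULL check before being dereferenced. Whether that field is sd_cow cannot be confirmed from the trace alone.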
Related to epic #219.