Closed OrdinaryMagician closed 8 years ago
@OrdinaryMagician I'm sorry to hear you've run into problems. The good news is that, based on the backtrace and the fact that you were able to import the pool, it appears to be intact. But before we start debugging this further I suggest you make a backup of your data by booting into a rescue image and importing the pool read-only. You should be able to copy all of your data somewhere safe.
zpool import -o readonly=on <pool>
Once that's done you can either just recreate the pool and copy the data back, or we can debug this further by building a debug version of ZFS and enabling some additional logging.
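For the read-only import and copy, something along these lines should work from the rescue environment (a sketch, not a tested procedure; the pool name and backup destination are placeholders to adjust for your setup):

```shell
# Import without writing anything to the pool.
zpool import -o readonly=on <pool>
# Copy the data off; rsync preserves permissions, hard links, ACLs, and xattrs.
rsync -aHAX /<pool>/ /path/to/backup/
# Cleanly detach the pool when done.
zpool export <pool>
```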
To build a debug version you'll want to start from source, roughly as follows:
$ git clone https://github.com/zfsonlinux/spl.git
$ git clone https://github.com/zfsonlinux/zfs.git
$ cd spl
$ ./autogen.sh
$ ./configure --enable-debug
$ make -s -j$(nproc)
$ cd ../zfs
$ ./autogen.sh
$ ./configure --enable-debug --with-spl=<path to spl> --with-spl-obj=<path to spl>
$ make -s -j$(nproc)
At this point you'll be able to load the debug modules by running scripts/zfs.sh in the ZFS source tree. You'll also be able to execute the zfs, zpool, and zdb commands directly from the ZFS source tree. So go ahead and load the debug modules and then import the pool. This time you may see a VERIFY error on the console before it panics, which should give us some additional information, and we can go from there, adding logging as needed.
The crash point is dsl_deadlist_insert -> dle_enqueue, but dle is NULL. I don't know how to interpret it, though. Maybe bp->blk_birth is corrupted?
It's OK, I already sent all my snapshots off to an external disk, so everything should be fine for now. I'll begin building the debug modules; hopefully systemrescuecd has all the tools I need.
Hm... unfortunately no, it does not. It's going to take me a while to build some sort of live cd with all I need, but I'll still do it.
@OrdinaryMagician You can grab a modified System Rescue CD from the "System Rescue CD w/ ZFS Mirrors" links on this page:
I already have one of those, unfortunately, it does not come with anything I can use to build debug modules.
I tried just building the whole thing on the desktop, regenerating the initial ramdisk, and checking what happens at boot. There are no messages on import before it hangs.
Before actually throwing in the towel and recreating the pool, I'm having someone build a custom systemrescuecd+zfs iso, since they use Gentoo and it's easier to make there.
@OrdinaryMagician Out of curiosity, does this system use ECC? If it does, we can rule out a really unlucky bit flip.
If you have backups, you can rebuild the pool from the backups, although it might be possible to do some analysis with zdb. I suggest talking to people in #zfsonlinux on the freenode IRC network. I hang out there, although I probably will not be free to assist until tomorrow.
The crash point is dsl_deadlist_insert -> dle_enqueue but dle is NULL. Though, I don't know how to interpret it. Maybe bp->blk_birth is corrupted?
Yes, very possibly bp->blk_birth is corrupted in a very specific way. If a bit were cleared in bp->blk_birth such that the txg birth was now lower than any other txg birth in the deadlist, then there would be no entry before this one in the tree. This is normally impossible because of how the deadlists are managed, so there's no error handling code here. But in the case of memory damage, or some other very specific damage, this could happen. And since it persists across reboots, that strongly suggests the block pointer is damaged. Plus, zfs_blkptr_verify() wouldn't catch this since there's no reliable way to check blk_birth.
@OrdinaryMagician before you throw in the towel, could you apply the patch I've just posted in #4089 to your debug build and then reproduce the issue? It will log the damaged txg_birth number to the console so we can see what it is. Also, please run the command zdb -u <pool-name>, which will report the current txg number for comparison.
You should then be able to prevent the issue by setting the zfs_recover=1 module option and importing your pool again. This will enable the new error handling code, which will allow the damaged txg_birth value to be used. This may result in a tiny bit of leaked space in your pool, but it should function fine assuming only this one block pointer was damaged. You can check the console for messages like this to verify only one block pointer was affected:
blkptr at X has invalid BLK_BIRTH Y
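For reference, the option can be set persistently via a modprobe config fragment, or at runtime through the module's sysfs parameter (exact paths can vary by distro, so treat this as a sketch):

```shell
# /etc/modprobe.d/zfs.conf -- applied the next time the zfs module loads
options zfs zfs_recover=1

# Or, with the module already loaded, before importing the pool:
#   echo 1 > /sys/module/zfs/parameters/zfs_recover
```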
@ryao @tuxoko can you please review this proposed change? As I mentioned above, by default there's no functional change aside from detecting and logging the error, unless zfs_recover=1 is set.
@ryao my system's pretty old, so I don't think there's ECC. It's actually not the first time a sudden bit flip has screwed everything up for me. I once had a bootloader stop working overnight, just from one byte suddenly having changed.
Ah, there we go.
PANIC: blkptr at ffff8801fc994040 has invalid BLK_BIRTH 0
With zfs_recover=1 I've re-imported, and I'm going to go through all the steps that triggered this whole mess all over again. Wish me luck.
Everything seems OK so far, no failures. It seems it was only just that one block pointer, after all.
@OrdinaryMagician glad to hear that it was just that block pointer. Now that it's been successfully processed, you should be able to return to the packages from your distribution whenever you like. You won't need to set zfs_recover again.
Thank you. Next computer I build, it'll surely have ECC memory.
I have also encountered an oops in the same place. I don't have ECC memory either, so I could be the victim of an unlucky bit flip as well, but I thought I'd mention it so you have another data point. My pool seems to continue to work in general, though a scrub will hang forever.
[20034.654416] Oops: 0000 [#1] SMP
[20034.654457] Modules linked in: pci_stub vboxpci(OX) vboxnetadp(OX) vboxnetflt(OX) vboxdrv(OX) ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables binfmt_misc dm_crypt x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm eeepc_wmi asus_wmi sparse_keymap crct10dif_pclmul crc32_pclmul aesni_intel dm_multipath scsi_dh aes_x86_64 snd_hda_codec_realtek lrw snd_hda_codec_hdmi gf128mul glue_helper ablk_helper cryptd snd_hda_intel snd_hda_codec snd_hwdep snd_pcm serio_raw snd_page_alloc snd_timer snd soundcore shpchp mei_me mei lpc_ich mac_hid parport_pc ppdev lp parport nls_iso8859_1 zfs(POX) zunicode(POX) zcommon(POX) znvpair(POX) spl(OX) zavl(POX) i915 i2c_algo_bit e1000e drm_kms_helper ahci ptp psmouse drm libahci pps_core video wmi
[20034.655567] CPU: 2 PID: 1486 Comm: txg_sync Tainted: P OX 3.13.0-74-generic #118-Ubuntu
[20034.655659] Hardware name: ASUS All Series/H87I-PLUS, BIOS 1005 01/06/2014
[20034.655732] task: ffff880404b74800 ti: ffff88039c588000 task.ti: ffff88039c588000
[20034.655809] RIP: 0010:[<ffffffffa02ae191>] [<ffffffffa02ae191>] dsl_deadlist_insert+0x161/0x370 [zfs]
[20034.655957] RSP: 0018:ffff88039c589890 EFLAGS: 00010246
[20034.656013] RAX: 0000000000000000 RBX: ffff8803ffe1b8f0 RCX: 0000000000000001
[20034.656086] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8804044bd000
[20034.656160] RBP: ffff88039c589950 R08: ffff88041fb15fa0 R09: ffff88040f401700
[20034.656233] R10: ffff88040f401700 R11: ffffffffa01d9b10 R12: ffff8800cb763240
[20034.656305] R13: ffff8803a022ad80 R14: 0000000000000000 R15: ffff8803ffe1b968
[20034.656379] FS: 0000000000000000(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000
[20034.656462] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[20034.656521] CR2: 0000000000000060 CR3: 0000000002c0e000 CR4: 00000000001407e0
[20034.656595] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[20034.656668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[20034.656740] Stack:
[20034.656763] ffff880035117a00 ffff880404938c00 ffffffff8172b392 ffff880405e78da0
[20034.656852] ffff88039c589918 0000000000000000 0000000305e78da0 ffff880405e78ec8
[20034.656940] ffffffffffff1c00 ffff8803a022ad80 ffff880405e78ea0 ffff880406b45af0
[20034.657028] Call Trace:
[20034.657062] [<ffffffff8172b392>] ? mutex_lock+0x12/0x2f
[20034.657157] [<ffffffffa0289316>] ? dmu_buf_will_dirty+0x56/0xa0 [zfs]
[20034.657274] [<ffffffffa02a5c8b>] dsl_dataset_block_kill+0x24b/0x4c0 [zfs]
[20034.657392] [<ffffffffa02a3e4f>] free_blocks+0x14f/0x270 [zfs]
[20034.657497] [<ffffffffa02a468a>] dnode_sync_free_range+0x25a/0x2b0 [zfs]
[20034.657603] [<ffffffffa0288cd3>] ? dbuf_dirty+0x6e3/0x9c0 [zfs]
[20034.657746] [<ffffffffa0313cef>] ? zfs_space_delta_cb+0xaf/0x190 [zfs]
[20034.657859] [<ffffffffa02a4430>] ? free_children+0x330/0x330 [zfs]
[20034.657983] [<ffffffffa02c7746>] range_tree_vacate+0x56/0x110 [zfs]
[20034.658094] [<ffffffffa02a4bf4>] dnode_sync+0x274/0x9a0 [zfs]
[20034.658191] [<ffffffffa0289ded>] ? dbuf_sync_list+0xcd/0x100 [zfs]
[20034.658299] [<ffffffffa02a4c71>] ? dnode_sync+0x2f1/0x9a0 [zfs]
[20034.658403] [<ffffffffa0294452>] dmu_objset_sync_dnodes+0xd2/0xf0 [zfs]
[20034.658513] [<ffffffffa029461c>] dmu_objset_sync+0x1ac/0x2f0 [zfs]
[20034.658617] [<ffffffffa0292aa0>] ? recordsize_changed_cb+0x20/0x20 [zfs]
[20034.658726] [<ffffffffa0294760>] ? dmu_objset_sync+0x2f0/0x2f0 [zfs]
[20034.658839] [<ffffffffa02ad41e>] dsl_dataset_sync+0x4e/0x90 [zfs]
[20034.658954] [<ffffffffa02b67a8>] dsl_pool_sync+0x98/0x470 [zfs]
[20034.659077] [<ffffffffa02d12b5>] spa_sync+0x365/0xb20 [zfs]
[20034.659143] [<ffffffff810ab472>] ? autoremove_wake_function+0x12/0x40
[20034.659216] [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[20034.659335] [<ffffffffa02e3639>] txg_sync_thread+0x3b9/0x620 [zfs]
[20034.659464] [<ffffffffa02e3280>] ? txg_quiesce_thread+0x3f0/0x3f0 [zfs]
[20034.659548] [<ffffffffa01dbe81>] thread_generic_wrapper+0x71/0x80 [spl]
[20034.659631] [<ffffffffa01dbe10>] ? __thread_exit+0x20/0x20 [spl]
[20034.659698] [<ffffffff8108b8a2>] kthread+0xd2/0xf0
[20034.659754] [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[20034.659825] [<ffffffff817356a8>] ret_from_fork+0x58/0x90
[20034.659884] [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[20034.659951] Code: 4c 89 f7 48 89 85 68 ff ff ff e8 bb 4f d7 ff 31 d2 48 85 c0 0f 84 b0 01 00 00 4c 89 f7 48 89 c6 e8 75 4e d7 ff 49 89 c6 48 8b 3b <4d> 8b 7e 60 e8 66 6f fe ff 4c 3b b8 40 01 00 00 0f 84 d9 00 00
[20034.660326] RIP [<ffffffffa02ae191>] dsl_deadlist_insert+0x161/0x370 [zfs]
[20034.660449] RSP <ffff88039c589890>
[20034.660487] CR2: 0000000000000060
[20034.792087] ---[ end trace 7292a3293341304d ]---
Eventually my system completely hung waiting on IO to the pool. @behlendorf's patch fixed my problem (although the --with-spl options to zfs's configure caused make to fail).
Today I was just installing system updates as usual when all of a sudden my entire system froze. Naturally I rebooted, and when I came back I was greeted by a kernel oops (which I couldn't do anything about, because it interrupted the boot process entirely). I decided to boot into systemrescuecd and check it out, and noticed the oops would show up again if I tried to mount my root filesystem. I rolled back to a snapshot, losing two weeks of work, and tried again. The same thing happened.
I've managed to isolate the oops message, though, and saved it to a file on a USB stick.
It will happen every single time. During the updates it suddenly gets stuck after a few packages, and most running programs just freeze. I've scrubbed the pool just in case, but there are no errors whatsoever.
I will help in any way I can to have this fixed. My only computer is now unusable due to this issue.