lnicola closed this issue 6 years ago.
As I said in #7059:
> I noticed that your zfs version is the same as the guy in this issue who was having some strange problems as well. That version seems to point to a commit that doesn't exist in the ZoL codebase, and I don't know where it is coming from. I would suggest trying the latest code from master (assuming you are ok running on the bleeding edge) or the latest point release v0.7.6.
> If you hit this issue again after that, let me know and I'll investigate further.
I suppose you've realized by now that I'm the same guy in all of the threads 😄.
The situation is as follows. I have two pools, call them `bike` and `tank`. `bike` is faster, so I keep my root filesystem and a couple of other filesystems on it, make daily snapshots, and send them from time to time over to `tank`. The last snapshot on `tank` is from last night. EDIT: I was using large dnodes, so sends from `bike` to `tank` were not working for a while, until #6864. After that was merged, sends worked again, but were very slow on some filesystems.
I first saw the `userquota_updates_task` error yesterday, but shrugged it off as "harmless".
This morning I started a scrub on `tank`, then one on `bike`, which got stuck. At the time I thought the first scrub was somehow blocking the second, but now I suspect that wasn't true (or maybe it was, see the end of the post). There was nothing in `dmesg` about this. Cancelling the first scrub worked, but everything got stuck waiting for disk on shutdown, so I power-cycled the box.
Now I'm no longer able to boot from `bike`, as it panics with `VERIFY3(0 == dmu_object_claim_dnsize()) failed (0 == 28)`. But I can boot last night's snapshot of it from `tank`. I'm currently sending everything over to another pool. This worked fine until it hit the root filesystem on `tank`, which is progressing very slowly. Interestingly, it's reading more data than it's writing, which didn't seem to happen on the other filesystems:
```
$ sudo zpool iostat tank freighter 10
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      1    192  7.26K  64.0M
tank        1.12T   714G    139      0  61.1M  12.9K
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    294      0  3.48M
tank        1.12T   714G    128      0  16.6M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      2    281  9.99K  3.31M
tank        1.12T   714G    115      0  14.5M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    260  1.60K  3.20M
tank        1.12T   714G    143      0  17.1M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    273      0  3.58M
tank        1.12T   714G    146      0  20.7M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    287  3.60K  3.79M
tank        1.12T   714G    110      0  16.6M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      6    221  27.2K  2.71M
tank        1.12T   714G     86      0  12.0M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      9    223  92.3K  11.0M
tank        1.12T   714G    359      1  22.8M  49.6K
----------  -----  -----  -----  -----  -----  -----
```
Additionally, I got the error in this issue at boot:
[ 59.952154] general protection fault: 0000 [#1] SMP PTI
[ 59.952213] Modules linked in: tun nft_meta nct6775 hwmon_vid nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nft_set_hash nft_set_rbtree xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nls_iso8859_1 nls_cp437 vfat fat intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf i915 snd_usb_audio snd_usbmidi_lib intel_powerclamp ofpart cmdlinepart coretemp intel_spi_platform intel_spi snd_hwdep spi_nor iTCO_wdt crct10dif_pclmul drm_kms_helper snd_rawmidi crc32_pclmul drm iTCO_vendor_support mtd ppdev r8169 snd_seq_device crc32c_intel snd_pcm snd_timer mii snd cdc_acm intel_gtt ghash_clmulni_intel cryptd intel_cstate input_leds soundcore led_class agpgart hci_uart
[ 59.952826] joydev syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit rtc_cmos btbcm shpchp mei_txe btqca mei btintel lpc_ich i2c_i801 bluetooth parport_pc parport ecdh_generic i2c_hid rfkill crc16 evdev pwm_lpss_platform pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[ 59.953176] CPU: 1 PID: 322 Comm: dp_sync_taskq Tainted: P O 4.15.1-2-ARCH #1
[ 59.953245] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[ 59.953404] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[ 59.953455] RSP: 0018:ffffb0b145b7bd30 EFLAGS: 00010286
[ 59.953501] RAX: ffff978bfb9688c0 RBX: ffff978bfacb28e0 RCX: 0000000000000000
[ 59.953561] RDX: dead000000000100 RSI: ffff978bfacb29f0 RDI: ffff978bfb968880
[ 59.953622] RBP: ffff978c68f1b800 R08: 0000000000000020 R09: 0000000000000008
[ 59.953681] R10: 0000000000000000 R11: ffffb0b145b7bcfe R12: ffff978bfacb2a00
[ 59.953741] R13: ffff978c1329aa80 R14: ffff978bfb968880 R15: ffff978bfacb2a20
[ 59.953802] FS: 0000000000000000(0000) GS:ffff978c7fc80000(0000) knlGS:0000000000000000
[ 59.953870] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 59.953919] CR2: 00007fe15b6d25b4 CR3: 00000003d389e000 CR4: 00000000001006e0
[ 59.953979] Call Trace:
[ 59.954061] userquota_updates_task+0xd2/0x520 [zfs]
[ 59.954159] ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[ 59.954264] ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[ 59.954324] taskq_thread+0x295/0x4a0 [spl]
[ 59.954367] ? wake_up_q+0x80/0x80
[ 59.954406] ? taskq_thread_should_stop+0x70/0x70 [spl]
[ 59.954452] kthread+0x113/0x130
[ 59.954484] ? kthread_create_on_node+0x70/0x70
[ 59.954527] ret_from_fork+0x35/0x40
[ 59.954561] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48
[ 59.954818] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb0b145b7bd30
[ 59.954886] general protection fault: 0000 [#2] SMP PTI
[ 59.954909] ---[ end trace c381eb953d40c75a ]---
[ 59.955056] Modules linked in: tun nft_meta nct6775 hwmon_vid nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nft_set_hash nft_set_rbtree xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nls_iso8859_1 nls_cp437 vfat fat intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf i915 snd_usb_audio snd_usbmidi_lib intel_powerclamp ofpart cmdlinepart coretemp intel_spi_platform intel_spi snd_hwdep spi_nor iTCO_wdt crct10dif_pclmul drm_kms_helper snd_rawmidi crc32_pclmul drm iTCO_vendor_support mtd ppdev r8169 snd_seq_device crc32c_intel snd_pcm snd_timer mii snd cdc_acm intel_gtt ghash_clmulni_intel cryptd intel_cstate input_leds soundcore led_class agpgart hci_uart
[ 59.956082] joydev syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit rtc_cmos btbcm shpchp mei_txe btqca mei btintel lpc_ich i2c_i801 bluetooth parport_pc parport ecdh_generic i2c_hid rfkill crc16 evdev pwm_lpss_platform pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[ 59.956662] CPU: 3 PID: 321 Comm: dp_sync_taskq Tainted: P D O 4.15.1-2-ARCH #1
[ 59.956779] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[ 59.957028] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[ 59.957112] RSP: 0000:ffffb0b145b73d30 EFLAGS: 00010286
[ 59.957189] RAX: ffff978bfb968940 RBX: ffff978c14adf680 RCX: 0000000000000000
[ 59.957290] RDX: dead000000000100 RSI: ffff978c14adf790 RDI: ffff978bfb968900
[ 59.957390] RBP: ffff978c68f1b800 R08: 0000000000000020 R09: 0000000000000000
[ 59.957490] R10: ffff978be53391e0 R11: ffffb0b145b73cff R12: ffff978c14adf7a0
[ 59.957591] R13: ffff978c1329aa80 R14: ffff978bfb968900 R15: ffff978c14adf7c0
[ 59.957692] FS: 0000000000000000(0000) GS:ffff978c7fd80000(0000) knlGS:0000000000000000
[ 59.957805] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 59.957888] CR2: 00007f1296906000 CR3: 00000003a9448000 CR4: 00000000001006e0
[ 59.957988] Call Trace:
[ 59.958123] userquota_updates_task+0xd2/0x520 [zfs]
[ 59.958287] ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[ 59.958463] ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[ 59.958563] taskq_thread+0x295/0x4a0 [spl]
[ 59.958632] ? wake_up_q+0x80/0x80
[ 59.958697] ? taskq_thread_should_stop+0x70/0x70 [spl]
[ 59.958775] kthread+0x113/0x130
[ 59.958828] ? kthread_create_on_node+0x70/0x70
[ 59.958896] ? kthread_create_on_node+0x70/0x70
[ 59.958966] ret_from_fork+0x35/0x40
[ 59.959022] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48
[ 59.959454] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb0b145b73d30
[ 59.959599] ---[ end trace c381eb953d40c75b ]---
[ 59.959999] general protection fault: 0000 [#3] SMP PTI
[ 59.964598] Modules linked in: tun nft_meta nct6775 hwmon_vid nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nft_set_hash nft_set_rbtree xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nls_iso8859_1 nls_cp437 vfat fat intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf i915 snd_usb_audio snd_usbmidi_lib intel_powerclamp ofpart cmdlinepart coretemp intel_spi_platform intel_spi snd_hwdep spi_nor iTCO_wdt crct10dif_pclmul drm_kms_helper snd_rawmidi crc32_pclmul drm iTCO_vendor_support mtd ppdev r8169 snd_seq_device crc32c_intel snd_pcm snd_timer mii snd cdc_acm intel_gtt ghash_clmulni_intel cryptd intel_cstate input_leds soundcore led_class agpgart hci_uart
[ 59.989065] joydev syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit rtc_cmos btbcm shpchp mei_txe btqca mei btintel lpc_ich i2c_i801 bluetooth parport_pc parport ecdh_generic i2c_hid rfkill crc16 evdev pwm_lpss_platform pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[ 60.003915] CPU: 3 PID: 323 Comm: dp_sync_taskq Tainted: P D O 4.15.1-2-ARCH #1
[ 60.008742] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[ 60.013596] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[ 60.018280] RSP: 0000:ffffb0b145b83d30 EFLAGS: 00010286
[ 60.022938] RAX: ffff978bfb9689c0 RBX: ffff978c0cdc57d8 RCX: 0000000000000000
[ 60.027591] RDX: dead000000000100 RSI: ffff978c0cdc58e8 RDI: ffff978bfb968980
[ 60.032209] RBP: ffff978c68f1b800 R08: 0000000000000020 R09: 0000000000000003
[ 60.036838] R10: ffff978c14c0ac20 R11: ffffb0b145b83cff R12: ffff978c0cdc58f8
[ 60.041458] R13: ffff978c1329aa80 R14: ffff978bfb968980 R15: ffff978c0cdc5918
[ 60.046130] FS: 0000000000000000(0000) GS:ffff978c7fd80000(0000) knlGS:0000000000000000
[ 60.050917] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 60.055546] CR2: 00007f1296906000 CR3: 00000003a9448000 CR4: 00000000001006e0
[ 60.060209] Call Trace:
[ 60.064905] userquota_updates_task+0xd2/0x520 [zfs]
[ 60.069575] ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[ 60.074250] ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[ 60.078835] taskq_thread+0x295/0x4a0 [spl]
[ 60.083416] ? wake_up_q+0x80/0x80
[ 60.087983] ? taskq_thread_should_stop+0x70/0x70 [spl]
[ 60.092578] kthread+0x113/0x130
[ 60.097155] ? kthread_create_on_node+0x70/0x70
[ 60.101754] ret_from_fork+0x35/0x40
[ 60.106337] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48
[ 60.115975] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb0b145b83d30
[ 60.120975] ---[ end trace c381eb953d40c75c ]---
Both pools are online right now, and `zpool status` reports no issues. I do have a nagging `dmesg` warning, `do_IRQ: 3.33 No irq handler for vector`, which is hopefully an unrelated kernel issue. The root filesystem on `bike` is not mounted.
EDIT: As mentioned below, I tried to start a scrub on the backup pool, then one on `tank`, which got stuck. A snapshot on a `tank` filesystem also got stuck.
The plot thickens...
I started a scrub on the back-up pool, then tried to start one on `tank`. `zpool scrub` got stuck again:
```
[<0>] cv_wait_common+0x115/0x130 [spl]
[<0>] txg_wait_synced+0xdc/0x120 [zfs]
[<0>] dsl_sync_task+0x15c/0x240 [zfs]
[<0>] dsl_scan+0x91/0x100 [zfs]
[<0>] zfs_ioc_pool_scan+0x53/0xc0 [zfs]
[<0>] zfsdev_ioctl+0x1d8/0x610 [zfs]
[<0>] do_vfs_ioctl+0xa4/0x630
[<0>] SyS_ioctl+0x74/0x80
[<0>] entry_SYSCALL_64_fastpath+0x20/0x83
[<0>] 0xffffffffffffffff
```
while `zpool status` says that the scrub hasn't started yet.
Since the root fs on `bike` is no longer involved, I'm not sure what to do next.
While you have the system in this state, can you run `echo t > /proc/sysrq-trigger` and post the output that will end up in your kernel logs? It will be a lot of output. This will show me a stack trace of all the processes running on your system, so I understand if you do not wish to do this.
If your kernel has crashed, there is no need to do this; the crash is almost certainly causing the problem.
Some things are still working, but it's mostly crashed, I suppose. I was able to export the back-up pool.
OK. Sorry to ask this but humor me for a second. You have opened several issues and at this point I'm having a hard time keeping it all straight. Could you quickly go through the timeline of what you have, what you did, and when the issues you've encountered happened? I'm going to guess that they are all probably related, so having the full timeline for all of them will probably help a lot.
That would be https://github.com/zfsonlinux/zfs/issues/7147#issuecomment-364393461 above, I suppose.
Just so I have it right, here is my summary of that (along with the questions I still have). Can you confirm this and answer the questions in parentheses as best you can? I'm also not sure at what point you started encountering the `dmu_object_claim_dnsize()` issue; please provide that info as well.
> bike crashed with userquota_updates_task panic

I saw the error message, but there was no apparent crash, and everything seemed to work afterwards. I didn't reboot. I don't know which pool it was for.
> user attempted to run a scrub and noticed that it didn't return (almost definitely due to the crash)

I started a scrub on `tank`, then tried to start one on `bike`, but it never started.
> user power-cycled the box and could no longer boot from bike (due to what issue? userquota_updates_task panic?)

Due to a `dmu_object_claim_dnsize()` panic when mounting `bike`.
> user booted into a backup of his root filesystem on tank

Right. The `userquota_updates_task` error showed up again a couple of times on boot; again, I don't know for which pool.
> user sent his dataset to another pool and noticed the root dataset was slow (probably unrelated)

Before the first crash, sending `bike/zroot` and `bike/home` to `tank` was slow. Sending `tank/zroot` (a copy of `bike/zroot`) to a backup pool was probably fine, but sending `tank/bike/home` was slow. `bike/home` works fine nonetheless.
> once the send completed, user attempted to scrub tank and it didn't return again (did it crash again this time?)

No crash, same `txg_wait_synced` hang. Making a snapshot of another filesystem on `tank` also hangs, but copying the data from it with `cp` seems to work.
Alright I think I have a grasp on everything now and I'll try to take a look today.
My apologies. I have run out of time for today and will need to look into this more on Monday.
@lnicola Would you mind running:
```
gdb `modinfo zfs | grep filename | awk '{ print $2 }'`
list *(userquota_updates_task+0xd2)
```
This should help me figure out where the initial crash happened.
Sorry, I updated my kernel and ZFS version in the meantime, and my kernels have no debug info anyway. But there seems to be a single `multilist_sublist_remove` call in that function, so that might be a good place to start.
EDIT: The file system in question got rolled back somehow, so I'm no longer having any issues. If you feel like closing this, that's fine.
```
Dump of assembler code for function multilist_sublist_remove:
   0x00000000000668c0 <+0>:  e8 00 00 00 00                 callq  0x668c5 <multilist_sublist_remove+5>
   0x00000000000668c5 <+5>:  48 03 77 38                    add    0x38(%rdi),%rsi
   0x00000000000668c9 <+9>:  48 8b 46 08                    mov    0x8(%rsi),%rax
   0x00000000000668cd <+13>: 48 8b 16                       mov    (%rsi),%rdx    # crash here, or below
   0x00000000000668d0 <+16>: 48 89 42 08                    mov    %rax,0x8(%rdx)
   0x00000000000668d4 <+20>: 48 89 10                       mov    %rdx,(%rax)
   0x00000000000668d7 <+23>: 48 b8 00 01 00 00 00 00 ad de  movabs $0xdead000000000100,%rax
   0x00000000000668e1 <+33>: 48 89 06                       mov    %rax,(%rsi)
   0x00000000000668e4 <+36>: 48 05 00 01 00 00              add    $0x100,%rax
   0x00000000000668ea <+42>: 48 89 46 08                    mov    %rax,0x8(%rsi)
   0x00000000000668ee <+46>: c3                             retq
```
@lnicola Would you mind trying this again with the most recent code from master? We recently fixed a related issue and I think the problem might be solved.
I'll try it on the next distro package update, but the only issue I still have is the slow `recv`.
Ok. That problem should mostly be alleviated by #7197, so hopefully all your problems will be fixed soon.
@lnicola any update here?
As I mentioned before, the file system or pool got rolled back at the time. I lost about 9 hours of changes, and I didn't encounter any issues afterwards, even with that same ZFS version. So I don't know if it's fixed, but I no longer have a way to reproduce these issues.
I just got a new one, in `0.7.0-356_g639b18944`:
[103861.445982] general protection fault: 0000 [#1] PREEMPT SMP PTI
[103861.446061] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype xt_conntrack br_netfilter bridge stp llc tun nft_meta nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct xt_tcpudp nft_set_bitmap nft_set_hash nft_set_rbtree iptable_filter nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nct6775 hwmon_vid nls_iso8859_1 nls_cp437 vfat fat i915 ofpart cmdlinepart intel_spi_platform intel_spi spi_nor mtd iTCO_wdt iTCO_vendor_support ppdev snd_usb_audio snd_usbmidi_lib i2c_algo_bit snd_hwdep drm_kms_helper snd_rawmidi intel_rapl intel_soc_dts_thermal snd_seq_device intel_soc_dts_iosf intel_powerclamp snd_pcm coretemp crct10dif_pclmul crc32_pclmul
[103861.446689] hci_uart drm crc32c_intel ghash_clmulni_intel btbcm cryptd snd_timer intel_cstate btqca btintel pcspkr snd bluetooth soundcore mei_txe r8169 mei lpc_ich i2c_i801 intel_gtt agpgart mii syscopyarea cdc_acm sysfillrect shpchp sysimgblt fb_sys_fops rtc_cmos parport_pc parport ecdh_generic rfkill evdev crc16 mac_hid i2c_hid pwm_lpss_platform pwm_lpss ip_tables x_tables zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[103861.447137] CPU: 0 PID: 433 Comm: dp_sync_taskq Tainted: P O 4.15.7-1-ARCH #1
[103861.447209] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[103861.447378] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[103861.447433] RSP: 0018:ffffb7fd45e6bd20 EFLAGS: 00010286
[103861.447482] RAX: ffff94bc422f6e40 RBX: ffff94bd22559000 RCX: 0000000000000001
[103861.447545] RDX: dead000000000100 RSI: ffff94bb4fb44100 RDI: ffff94bc422f6e00
[103861.447607] RBP: ffff94bb4fb44000 R08: 0000000000000000 R09: 0000000000000003
[103861.447669] R10: ffffb7fd45e6bbe8 R11: ffffb7fd45e6bcdd R12: ffff94bb4fb44120
[103861.447731] R13: ffff94bce832c000 R14: ffff94bc422f6e00 R15: ffff94bb4fb44140
[103861.447794] FS: 0000000000000000(0000) GS:ffff94bd3fc00000(0000) knlGS:0000000000000000
[103861.447864] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[103861.447916] CR2: 00007f75bac13db8 CR3: 000000032ab9a000 CR4: 00000000001006f0
[103861.447978] Call Trace:
[103861.448075] userquota_updates_task+0xdf/0x5b0 [zfs]
[103861.448180] ? dmu_objset_is_dirty+0x20/0x20 [zfs]
[103861.448281] ? dmu_objset_is_dirty+0x20/0x20 [zfs]
[103861.448378] ? dmu_objset_is_dirty+0x20/0x20 [zfs]
[103861.448432] taskq_thread+0x295/0x4a0 [spl]
[103861.448478] ? wake_up_q+0x80/0x80
[103861.448523] ? taskq_thread_should_stop+0x70/0x70 [spl]
[103861.448572] kthread+0x113/0x130
[103861.448605] ? kthread_create_on_node+0x70/0x70
[103861.448652] ret_from_fork+0x35/0x40
[103861.448691] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48
[103861.448990] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb7fd45e6bd20
[103861.449108] ---[ end trace 4e6621a5dfbf95b1 ]---
The application (an `rm`) is now hung.
EDIT: Unlike the last time, the pool came back fine (I think) after the reboot.
EDIT 2: Happened again during an `rm` after running the same workload (compiling an AUR package).
Call traces of processes in `D` state after the hang:
[104651.891826] txg_sync D 0 578 2 0x80000000
[104651.893818] Call Trace:
[104651.895780] ? __schedule+0x24b/0x8c0
[104651.898000] ? preempt_count_add+0x68/0xa0
[104651.899966] schedule+0x32/0x90
[104651.901931] taskq_wait+0x72/0xc0 [spl]
[104651.903896] ? wait_woken+0x80/0x80
[104651.905920] dsl_pool_sync+0x17c/0x420 [zfs]
[104651.907946] spa_sync+0x425/0xcf0 [zfs]
[104651.909962] txg_sync_thread+0x2cc/0x490 [zfs]
[104651.911978] ? txg_delay+0x1a0/0x1a0 [zfs]
[104651.913929] ? __thread_exit+0x20/0x20 [spl]
[104651.915899] thread_generic_wrapper+0x6f/0x80 [spl]
[104651.917872] kthread+0x113/0x130
[104651.919837] ? kthread_create_on_node+0x70/0x70
[104651.921804] ? kthread_create_on_node+0x70/0x70
[104651.923756] ret_from_fork+0x35/0x40
[104655.089910] python2.7 D 0 8222 1 0x00000000
[104655.091994] Call Trace:
[104655.094057] ? __schedule+0x24b/0x8c0
[104655.096120] ? preempt_count_add+0x68/0xa0
[104655.098185] schedule+0x32/0x90
[104655.100251] cv_wait_common+0x113/0x130 [spl]
[104655.102340] ? wait_woken+0x80/0x80
[104655.104488] txg_wait_open+0xab/0xf0 [zfs]
[104655.106624] dmu_free_long_range_impl+0x223/0x410 [zfs]
[104655.108776] dmu_free_long_range+0x6a/0xb0 [zfs]
[104655.110929] zfs_rmnode+0x24f/0x320 [zfs]
[104655.113102] ? zfs_zinactive+0xd3/0xf0 [zfs]
[104655.115287] zfs_inactive+0x7a/0x1f0 [zfs]
[104655.117415] ? unmap_mapping_range+0x73/0x140
[104655.119608] zpl_evict_inode+0x3c/0x50 [zfs]
[104655.121752] evict+0xc4/0x190
[104655.123880] do_unlinkat+0x1ab/0x320
[104655.126012] do_syscall_64+0x74/0x190
[104655.128143] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[104664.966746] postgres D 0 4892 4612 0x00000000
[104664.968923] Call Trace:
[104664.971086] ? __schedule+0x24b/0x8c0
[104664.973224] ? preempt_count_add+0x68/0xa0
[104664.975376] schedule+0x32/0x90
[104664.977518] cv_wait_common+0x113/0x130 [spl]
[104664.979654] ? wait_woken+0x80/0x80
[104664.981884] txg_wait_open+0xab/0xf0 [zfs]
[104664.984086] dmu_free_long_range_impl+0x223/0x410 [zfs]
[104664.986328] dmu_free_long_range+0x6a/0xb0 [zfs]
[104664.988543] zfs_rmnode+0x24f/0x320 [zfs]
[104664.990769] ? zfs_zinactive+0xd3/0xf0 [zfs]
[104664.992991] zfs_inactive+0x7a/0x1f0 [zfs]
[104664.995297] ? unmap_mapping_range+0x73/0x140
[104664.997504] zpl_evict_inode+0x3c/0x50 [zfs]
[104664.999648] evict+0xc4/0x190
[104665.001771] __dentry_kill+0xc3/0x150
[104665.003869] dput.part.4+0x131/0x1e0
[104665.005979] SyS_renameat2+0x3bc/0x590
[104665.008079] do_syscall_64+0x74/0x190
[104665.010177] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[104669.725973] rm D 0 10213 2062 0x00000000
[104669.727987] Call Trace:
[104669.729983] ? __schedule+0x24b/0x8c0
[104669.731985] ? preempt_count_add+0x68/0xa0
[104669.733992] schedule+0x32/0x90
[104669.735996] cv_wait_common+0x113/0x130 [spl]
[104669.738012] ? wait_woken+0x80/0x80
[104669.740096] txg_wait_open+0xab/0xf0 [zfs]
[104669.742170] dmu_free_long_range_impl+0x223/0x410 [zfs]
[104669.744268] dmu_free_long_range+0x6a/0xb0 [zfs]
[104669.746355] zfs_rmnode+0x24f/0x320 [zfs]
[104669.748447] ? zfs_zinactive+0xd3/0xf0 [zfs]
[104669.750543] zfs_inactive+0x7a/0x1f0 [zfs]
[104669.752565] ? unmap_mapping_range+0x73/0x140
[104669.754637] zpl_evict_inode+0x3c/0x50 [zfs]
[104669.756622] evict+0xc4/0x190
[104669.758581] do_unlinkat+0x1ab/0x320
[104669.760537] do_syscall_64+0x74/0x190
[104669.762487] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[104671.523356] redis-server D 0 15279 4325 0x00000100
[104671.525335] Call Trace:
[104671.527306] ? __schedule+0x24b/0x8c0
[104671.529294] ? preempt_count_add+0x68/0xa0
[104671.531289] schedule+0x32/0x90
[104671.533269] cv_wait_common+0x113/0x130 [spl]
[104671.535262] ? wait_woken+0x80/0x80
[104671.537313] txg_wait_open+0xab/0xf0 [zfs]
[104671.539359] dmu_free_long_range_impl+0x223/0x410 [zfs]
[104671.541418] dmu_free_long_range+0x6a/0xb0 [zfs]
[104671.543483] zfs_rmnode+0x24f/0x320 [zfs]
[104671.545519] ? zfs_zinactive+0xd3/0xf0 [zfs]
[104671.547563] zfs_inactive+0x7a/0x1f0 [zfs]
[104671.549556] ? unmap_mapping_range+0x73/0x140
[104671.551592] zpl_evict_inode+0x3c/0x50 [zfs]
[104671.553835] evict+0xc4/0x190
[104671.555806] __dentry_kill+0xc3/0x150
[104671.557788] dput.part.4+0x131/0x1e0
[104671.559773] SyS_renameat2+0x3bc/0x590
[104671.561739] do_syscall_64+0x74/0x190
[104671.563719] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[104671.873790] htop D 0 15971 15826 0x00000004
[104671.875834] Call Trace:
[104671.877876] ? __schedule+0x24b/0x8c0
[104671.879904] ? preempt_count_add+0x68/0xa0
[104671.881958] schedule+0x32/0x90
[104671.884022] cv_wait_common+0x113/0x130 [spl]
[104671.886124] ? wait_woken+0x80/0x80
[104671.888278] txg_wait_open+0xab/0xf0 [zfs]
[104671.890420] dmu_free_long_range_impl+0x223/0x410 [zfs]
[104671.892575] dmu_free_long_range+0x6a/0xb0 [zfs]
[104671.894737] zfs_trunc+0x77/0x1e0 [zfs]
[104671.897121] ? sa_lookup_locked+0x89/0xb0 [zfs]
[104671.899190] ? preempt_count_add+0x68/0xa0
[104671.901282] ? _raw_spin_lock+0x13/0x40
[104671.903422] zfs_freesp+0x9b/0x430 [zfs]
[104671.905508] ? preempt_count_add+0x68/0xa0
[104671.907588] ? _raw_spin_lock+0x13/0x40
[104671.909673] ? _raw_spin_unlock+0x16/0x30
[104671.911806] ? zfs_zaccess_aces_check+0x1b1/0x390 [zfs]
[104671.913970] ? zfs_zaccess+0x10b/0x3f0 [zfs]
[104671.916126] zfs_setattr+0xd1d/0x22a0 [zfs]
[104671.918234] ? spl_kmem_alloc_impl+0xd6/0x190 [spl]
[104671.920332] ? spl_kmem_alloc_impl+0xd6/0x190 [spl]
[104671.922472] ? rrw_exit+0x55/0x140 [zfs]
[104671.924577] ? zpl_xattr_get_dir+0x5d/0x110 [zfs]
[104671.926614] ? preempt_count_add+0x68/0xa0
[104671.928626] ? __kmalloc_node+0x21b/0x2d0
[104671.930649] ? spl_kmem_zalloc+0xc7/0x180 [spl]
[104671.932731] zpl_setattr+0xf8/0x170 [zfs]
[104671.934762] notify_change+0x294/0x420
[104671.936746] do_truncate+0x75/0xc0
[104671.939022] path_openat+0xc3f/0x1320
[104671.940986] do_filp_open+0x9b/0x110
[104671.942959] ? __vfs_write+0x36/0x180
[104671.944938] ? __check_object_size+0x9b/0x1a0
[104671.946918] ? __alloc_fd+0xaf/0x160
[104671.948907] ? do_sys_open+0x1bd/0x250
[104671.950879] do_sys_open+0x1bd/0x250
[104671.952854] do_syscall_64+0x74/0x190
[104671.954807] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
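For what it's worth, per-process traces like the ones above can also be gathered without a full `sysrq-t` dump. A rough sketch (my own snippet, assumes root and a kernel exposing `/proc/<pid>/stack`):

```shell
# Dump kernel stacks of all tasks stuck in uninterruptible (D) sleep.
for pid in $(ps -eo pid,stat --no-headers | awk '$2 ~ /^D/ {print $1}'); do
    echo "=== PID $pid ($(cat /proc/$pid/comm 2>/dev/null)) ==="
    cat "/proc/$pid/stack" 2>/dev/null
done
```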
@lnicola That is expected. Once the GPF happens, that thread will stop running, causing any threads that rely on it to also become stuck. Unfortunately, we need to find a way to reproduce the issue at least somewhat consistently if we are going to have a chance to fix it.
Well, it seems I can still reproduce this, so tell me if there's anything I can do...
@lnicola can you provide steps that I can use to reproduce this? As of yet I have not been able to do so.
Ah, sorry. I can reproduce it on my pool, but it's a very old one. :disappointed:
I have similar error messages in my kernel log (without doing send/recv). They seem to happen with a large number of concurrent filesystem writes (e.g. `make -j` on a kernel tree). See the attached kern.log.
This is then followed by a large number of repeats of the list_del error about `->next` being `LIST_POISON1`, so I trimmed those out.
@behlendorf I looked into this a bit 2 weeks ago and couldn't figure it out quickly. Do you or anyone else have a few cycles to look into this?
I'm running git head (commit 99f505a4d7439) on 4.16-rc7. Seen this error happen with the 4.15.y kernels as well.
So this might be dumb as I'm not familiar with the code, but looking in dmu_objset.c, os->os_dirty_dnodes[txgoff] and os->os_synced_dnodes[txgoff] both appear to use dnode_t's dn_dirty_link[txgoff] for the list pointers. Won't this be a problem when the dnode is removed from os_dirty_dnodes and added into os_synced_dnodes in dmu_objset_sync_dnodes?
I configured with --enable-debug, and got a PANIC with this VERIFY in dnode.c:dnode_add_ref failing: `VERIFY(1 < refcount_add(&dn->dn_holds, tag));`
FWIW, splitting up the links so os_dirty_dnodes and os_synced_dnodes use different links seems to stop the list corruption at least
> Won't this be a problem when the dnode is removed from os_dirty_dnodes and added into os_synced_dnodes in dmu_objset_sync_dnodes?

It shouldn't be a problem as long as the dnode doesn't need to be on both lists at once.
> I configured with --enable-debug, and got a PANIC with this VERIFY in dnode.c:dnode_add_ref failing.

This is probably a big indication of the root cause. It seems we are prematurely dropping a reference somewhere. Is this dataset using large dnodes (`dnodesize != legacy`)? Also, can you provide the full stack trace that goes along with that VERIFY panic you got?
Hi, so the hack to split the pointers didn't actually fix it. I managed to get through a couple of kernel compiles, but it just happened again.
The PANIC actually shows up after the list_del corruption, though.
Here's the log
It should be large dnodes, I'll check once I reboot.
Looking at that function, I'm not sure how that VERIFY fails though: it checks that the ref count is not zero, then adds 1 and checks that the result is greater than 1, and that check fails.
Yes, `large_dnodes` is active.
In dnode_move, don't we have to worry about dn_dirty_link in addition to dn_link?
@nivedita76 If you are able to reproduce the problem consistently, would it be possible to provide a stack trace of the `VERIFY()` you are hitting?
How should I do that? The last log I attached was what got dumped into `dmesg`.
Sorry. I missed it (needed to scroll down a bit farther). What you have provided is fine.
I might have an idea of what's causing this... I'll get back soon if I figure anything out.
I just had a list_del corruption even after setting `zfs_multilist_num_sublists` to 1. Unfortunately, this time it didn't save anything to the log before crashing.
@nivedita76 You can try netconsole to catch crash dumps if you have another machine:
```
modprobe netconsole
dmesg -n 8
cd /sys/kernel/config/netconsole
mkdir -p target1
cd target1
echo XXXX > local_ip
echo XXXX > remote_ip
echo enXXXX > dev_name
echo XX:XX:XX:XX:XX:XX > remote_mac
echo 1 > enabled
```
@nivedita76 Try applying this patch:
```diff
diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
index 596983b..17be48e 100644
--- a/module/zfs/dnode.c
+++ b/module/zfs/dnode.c
@@ -1100,6 +1100,11 @@ dnode_check_slots_free(dnode_children_t *children, int idx, int slots)
 			if (type != DMU_OT_NONE)
 				return (B_FALSE);
 
+			for (int i = 0; i < TXG_SIZE; i++) {
+				if (list_link_active(&dn->dn_dirty_link[i]))
+					return (B_FALSE);
+			}
+
 			continue;
 		} else {
 			return (B_FALSE);
```
This needs some work before I merge it, but in our testing it seemed to fix the problem.
@nivedita76 if you get a chance try the patch from #7388 and see if the problem is fixed.
Got an oops.
```
[   62.868131] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[   62.868522] IP: multilist_sublist_lock_obj+0x9/0x20
[   62.868868] PGD 0 P4D 0
[   62.869210] Oops: 0000 [#1] SMP PTI
[   62.869579] Modules linked in: macvtap macvlan tap bridge devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables nvidia_uvm(OE) 8021q garp mrp stp llc dm_mod snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel vboxpci(OE) aes_x86_64 crypto_simd glue_helper cryptd vboxnetadp(OE) intel_cstate snd_usb_audio vboxnetflt(OE) vboxdrv(OE) intel_uncore snd_usbmidi_lib snd_rawmidi intel_rapl_perf snd_seq_device nvidia_drm(OE) nvidia_modeset(OE) snd_hda_codec_realtek snd_hda_codec_generic nvidia(OE) snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore
[   62.872350] CPU: 55 PID: 7977 Comm: mysqld Tainted: G W OE 4.16.0-rani-zfs+ #7
[   62.872806] Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 3407 03/10/2017
[   62.873275] RIP: 0010:multilist_sublist_lock_obj+0x9/0x20
[   62.873748] RSP: 0018:ffffbec4f0123828 EFLAGS: 00010246
[   62.874222] RAX: ffffa1ae31c74000 RBX: 0000000000000000 RCX: 0000000000000000
[   62.874706] RDX: ffffffff903a25c0 RSI: ffffa1ae07b47638 RDI: 0000000000000000
[   62.875194] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[   62.875682] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa1ae07b47798
[   62.876170] R13: ffffa1adf6edf0f8 R14: 0000000000000000 R15: 0000000000000002
[   62.876659] FS:  00007f67fcdbbe00(0000) GS:ffffa1ae3e800000(0000) knlGS:0000000000000000
[   62.877158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.877656] CR2: 0000000000000018 CR3: 0000000ff68e0003 CR4: 00000000003626e0
[   62.878161] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   62.878657] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   62.879185] Call Trace:
[   62.879770]  dnode_check_slots_free+0x107/0x1c0
[   62.880397]  dnode_hold_impl+0x5d5/0xf30
[   62.880887]  dmu_object_alloc_dnsize+0x241/0x520
[   62.881385]  zfs_mknode+0x130/0xf70
[   62.881887]  ? _raw_spin_unlock+0x24/0x30
[   62.882384]  ? dmu_tx_assign+0x230/0x560
[   62.882885]  zfs_create+0x6b9/0x9e0
[   62.883388]  zpl_create+0xaa/0x160
[   62.883917]  lookup_open+0x5da/0x780
[   62.884446]  path_openat+0x32c/0xd10
[   62.884944]  do_filp_open+0xb4/0x130
[   62.885449]  ? do_sys_open+0x1c7/0x250
[   62.885938]  do_sys_open+0x1c7/0x250
[   62.886412]  do_syscall_64+0x66/0x120
[   62.886881]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
```