segdy opened this issue 3 years ago
That does seem like a different issue or issues, yes. Perhaps go open a separate issue?
I had another lockup, so I opened the issue: https://github.com/openzfs/zfs/issues/12270
I'm not able to make a reproducer either. I've had a little more luck in the past few days and the servers have not locked up. What I did was reduce the load on the disks. I've also noticed occasional unaligned-write errors on all disks. This is rare. I wonder why there would be an unaligned write at all?
[569917.626018] ata3.00: exception Emask 0x0 SAct 0x2004 SErr 0x0 action 0x6 frozen
[569917.626499] ata3.00: failed command: WRITE FPDMA QUEUED
[569917.626956] ata3.00: cmd 61/08:10:28:62:42/00:00:4d:00:00/40 tag 2 ncq dma 4096 out
res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[569917.627843] ata3.00: status: { DRDY }
[569917.628259] ata3.00: failed command: WRITE FPDMA QUEUED
[569917.628681] ata3.00: cmd 61/20:68:c8:27:7f/03:00:4b:00:00/40 tag 13 ncq dma 409600 out
res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[569917.629536] ata3.00: status: { DRDY }
[569917.630037] ata3: hard resetting link
[569922.981975] ata3: link is slow to respond, please be patient (ready=0)
[569927.677972] ata3: COMRESET failed (errno=-16)
[569927.678480] ata3: hard resetting link
[569933.021961] ata3: link is slow to respond, please be patient (ready=0)
[569937.713962] ata3: COMRESET failed (errno=-16)
[569937.714469] ata3: hard resetting link
[569943.065958] ata3: link is slow to respond, please be patient (ready=0)
[569972.761922] ata3: COMRESET failed (errno=-16)
[569972.762442] ata3: limiting SATA link speed to 1.5 Gbps
[569972.762945] ata3: hard resetting link
[569976.997916] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[569977.010111] ata3.00: configured for UDMA/133
[569977.015763] sd 2:0:0:0: [sdc] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[569977.021911] sd 2:0:0:0: [sdc] tag#2 Sense Key : Illegal Request [current]
[569977.027196] sd 2:0:0:0: [sdc] tag#2 Add. Sense: Unaligned write command
[569977.031556] sd 2:0:0:0: [sdc] tag#2 CDB: Write(10) 2a 00 4d 42 62 28 00 00 08 00
[569977.035592] blk_update_request: I/O error, dev sdc, sector 1296196136 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
[569977.040077] zio pool=rpool vdev=/dev/disk/by-id/ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 error=5 type=2 offset=663651373056 size=4096 flags=180880
[569977.048757] sd 2:0:0:0: [sdc] tag#13 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[569977.052924] sd 2:0:0:0: [sdc] tag#13 Sense Key : Illegal Request [current]
[569977.057262] sd 2:0:0:0: [sdc] tag#13 Add. Sense: Unaligned write command
[569977.062000] sd 2:0:0:0: [sdc] tag#13 CDB: Write(10) 2a 00 4b 7f 27 c8 00 03 20 00
[569977.066895] blk_update_request: I/O error, dev sdc, sector 1266624456 op 0x1:(WRITE) flags 0x700 phys_seg 100 prio class 0
[569977.071052] zio pool=rpool vdev=/dev/disk/by-id/ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 error=5 type=2 offset=648510672896 size=409600 flags=40080c80
[569977.080280] ata3: EH complete
This is what is left in syslog:
/var/log/syslog:Jun 30 00:53:05 serverX zed: eid=36754 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510742528 priority=3 err=5 flags=0x380880 bookmark=9336:2973431:0:0
/var/log/syslog:Jun 30 00:53:05 serverX zed: eid=36755 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510738432 priority=3 err=5 flags=0x380880 bookmark=9336:2973429:0:0
/var/log/syslog:Jun 30 00:53:05 serverX zed: eid=36756 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510734336 priority=3 err=5 flags=0x380880 bookmark=9336:2973408:0:0
/var/log/syslog:Jun 30 00:53:06 serverX zed: eid=36757 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510717952 priority=3 err=5 flags=0x380880 bookmark=9336:2973396:0:0
/var/log/syslog:Jun 30 00:53:06 serverX zed: eid=36758 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510877696 priority=3 err=5 flags=0x380880 bookmark=9336:2973589:0:0
/var/log/syslog:Jun 30 00:53:06 serverX zed: eid=36759 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510705664 priority=3 err=5 flags=0x380880 bookmark=9336:2973380:0:0
/var/log/syslog:Jun 30 00:53:06 serverX zed: eid=36760 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510754816 priority=3 err=5 flags=0x380880 bookmark=9336:2973437:0:0
/var/log/syslog:Jun 30 00:53:06 serverX zed: eid=36761 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510685184 priority=3 err=5 flags=0x380880 bookmark=9336:2973340:0:0
/var/log/syslog:Jun 30 00:53:07 serverX zed: eid=36762 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510697472 priority=3 err=5 flags=0x380880 bookmark=9336:2973367:0:0
/var/log/syslog:Jun 30 00:53:07 serverX zed: eid=36763 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510840832 priority=3 err=5 flags=0x380880 bookmark=9336:2973547:0:0
/var/log/syslog:Jun 30 00:53:07 serverX zed: eid=36764 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510730240 priority=3 err=5 flags=0x380880 bookmark=9336:2973407:0:0
/var/log/syslog:Jun 30 00:53:07 serverX zed: eid=36765 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510689280 priority=3 err=5 flags=0x380880 bookmark=9336:2973342:0:0
/var/log/syslog:Jun 30 00:53:07 serverX zed: eid=36766 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510681088 priority=3 err=5 flags=0x380880 bookmark=9336:2973337:0:0
/var/log/syslog:Jun 30 00:53:08 serverX zed: eid=36767 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510812160 priority=3 err=5 flags=0x380880 bookmark=9336:2973523:0:0
/var/log/syslog:Jun 30 00:53:08 serverX zed: eid=36768 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510693376 priority=3 err=5 flags=0x380880 bookmark=9336:2973357:0:0
/var/log/syslog:Jun 30 00:53:08 serverX zed: eid=36769 class=io pool='rpool' vdev=ata-HGST_HUS724020ALA640_XSERIALNUMBERX-part1 size=4096 offset=648510672896 priority=3 err=5 flags=0x380880 bookmark=9336:2973327:0:0
The syslog entries above coincide with a backup receive from other servers.
Offtopic for the bug, but I have seen drives report Unaligned write command
sense codes in a number of random circumstances where it really doesn't seem to be the case that that's the actual problem. #10094 had cases where SATA link power management or the SATA controller were the problem, for example; #8552 was an SSD firmware bug, and I've personally seen that code come out when a (spinning) drive was failing.
My bet, if this just started, would be that you did a kernel update recently. But if you want to debug it further with others, you're better off asking on the mailing list or IRC or some such.
I have the same feeling as you do. I've since reduced the load on the disks, reduced the SATA link speed to 1.5 Gbps, and removed as much encryption as I could (i.e. most of it). I'm seeing fewer of these errors. I updated kernels recently (twice, in fact, in the past few weeks). Let's see if this bug goes stale :)
I experienced another lockup and, facing the choice between losing my job and removing encryption, I chose the latter. So now I'm running unencrypted ZFS everywhere. Removing encryption made the datasets snappier (expected), but more importantly it reduced disk IOPS. That was unexpected, and might be due to the defragmentation that takes place when a dataset is fully replicated onto a clean pool.
Incidentally, the Proxmox dev team has a similar issue: https://bugzilla.proxmox.com/show_bug.cgi?id=3206 and a proposed solution from OpenZFS 2.0.5.
Let's see.
That link was useful, but the patch is not in 2.0.5, or indeed anywhere in OpenZFS git that I can see.
It also kind of feels like a workaround to me, though that's certainly better than nothing if it helps and has no negative effects.
(The patch in question, I believe, so nobody else has to go digging like me.)
edit: I guess I should say, I would like to know what makes this specific case unsafe and not (apparently) the rest, and fix that; I don't feel comfortable enough with this code to know whether this is the least invasive solution.
edit 2: reading the code as it is now, unless I'm overlooking an interaction, this patch really feels like it does the opposite of what it claims...
@rincebrain if the patch you are mentioning is related to https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1900889 then it is not helping (in my case at least): https://bugzilla.proxmox.com/show_bug.cgi?id=3206#c21
It is, yes.
How unfortunate. Back to my debugging. (I have a patch that works around this, but it just hard panics somewhere else some minutes later right now.)
Has anyone checked if 958826b improves the situation? I've rebased it onto 2.0.5 in #12346, but I'm waiting for a review of the cherry pick (that particular patch is pretty big). There's another patch in that MR that might help.
I'm personally experiencing a different symptom, but that MR might still be useful.
It's on my TODO for next time I rebuild, but it hasn't panicked again since I noticed that commit go in, so "next time" hasn't come.
Soon(tm).
Still panics identically for me on 958826b, sadly.
While I'm thinking about it, https://www.illumos.org/issues/14003 looks related to me...
FWIW, I put prints in https://github.com/openzfs/zfs/blob/8ae86e2edc736483c1530fd689525aa2460eaec8/module/zfs/arc.c#L5832 before arc_buf_destroy and they didn't trigger before it caught fire using the repro I do in #12070, so I don't know that that codepath is to blame?
Or did you mean something else?
Initially I thought there might be a difference, but it seems the same to me: even if arc_buf_destroy() is not called at https://github.com/openzfs/zfs/blob/8ae86e2edc736483c1530fd689525aa2460eaec8/module/zfs/arc.c#L5840, dbuf_issue_final_prefetch_done() will do so at https://github.com/openzfs/zfs/blob/8ae86e2edc736483c1530fd689525aa2460eaec8/module/zfs/arc.c#L5844 (if this is related to https://www.illumos.org/issues/14003).
Anyhow, it seems that the reference accounting of ARC buffers is leaky somewhere in the encryption case.
A quick update from me. Disabling encryption and not abusing the disks too much seems to have fixed my issues.
I set up a dedicated server so I could try to replicate the issue. It has a 2x2-mirror ZFS pool on 4x 2TB SATA disks ("enterprise"). If I run the following:
After some hours (sometimes a day will pass) I start getting the unaligned-write errors on the hard disks and disks will be dropped from the pool.
Just a quick update from my side. I have swapped the hardware from the troublesome Shuttle Barebone DS57U with an Intel Celeron 3205U to the newer Shuttle Barebone XPC slim DS10U5 with an Intel Core i5-8265U. While the previous CPU did not list aesni support in /sys/module/icp/parameters/icp_aes_impl, the i5-8265U now does:
$ cat /sys/module/icp/parameters/icp_aes_impl
cycle [fastest] generic x86_64 aesni
$ cat /sys/module/icp/parameters/icp_gcm_impl
cycle [fastest] avx generic pclmulqdq
$ zfs get encryption dpool
NAME PROPERTY VALUE SOURCE
dpool encryption aes-256-gcm -
(for detailed system information, see my duplicate issue #12445 and my blog post: Secure External Backup with ZFS Native Encryption)
Since running on this new CPU, the problem has never occurred again. So it really seems to be related to the generic implementation.
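For anyone comparing CPUs the way the posts above do, the selected crypto implementation can also be pinned at runtime through the same module parameters, rather than leaving it on "fastest". A sketch (the parameter names come from the outputs above; the set of valid values depends on your CPU, and writing them requires root):

```shell
# Show the available AES/GCM implementations; the bracketed entry is the
# one currently selected.
cat /sys/module/icp/parameters/icp_aes_impl
cat /sys/module/icp/parameters/icp_gcm_impl

# Pin the generic C implementation, e.g. to test whether the bug
# correlates with it (write "fastest" to restore the default).
echo generic > /sys/module/icp/parameters/icp_aes_impl
echo generic > /sys/module/icp/parameters/icp_gcm_impl
```

This makes it possible to test the "generic implementation" theory on an AES-NI-capable machine without swapping hardware.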
I don't think it is, at this point - it seems to be a problem with the ARC and refcounting in an edge case, so it's possible that with your new setup you're just not hitting the edge case.
Same here on zfs recv:
Sep 18 01:59:19 paradies kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Sep 18 01:59:19 paradies kernel: #PF: supervisor read access in kernel mode
Sep 18 01:59:19 paradies kernel: #PF: error_code(0x0000) - not-present page
Sep 18 01:59:19 paradies kernel: PGD 0 P4D 0
Sep 18 01:59:19 paradies kernel: Oops: 0000 [#1] SMP PTI
Sep 18 01:59:19 paradies kernel: CPU: 16 PID: 255531 Comm: receive_writer Tainted: P O 5.10.57 #1-NixOS
Sep 18 01:59:19 paradies kernel: Hardware name: IBM System x3650 M3 -[7945H2G]-/69Y4438, BIOS -[D6E157AUS-1.15]- 06/13/2012
Sep 18 01:59:19 paradies kernel: RIP: 0010:abd_verify+0x5/0x60 [zfs]
Sep 18 01:59:19 paradies kernel: Code: 0f 1f 44 00 00 66 66 66 66 90 8b 07 c1 e8 05 83 e0 01 c3 66 90 66 66 66 66 90 8b 07 c1 e8 06 83 e0 01 c3 66 90 66 66 66 66 90 <8b> 07 a8 01 74 01 c3 41 54 55 48 89 fd 53 a8 40 74 3c 48 8b 47 68
Sep 18 01:59:19 paradies kernel: RSP: 0018:ffffb7e4f627b9c8 EFLAGS: 00010246
Sep 18 01:59:19 paradies kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Sep 18 01:59:19 paradies kernel: RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000000000000000
Sep 18 01:59:19 paradies kernel: RBP: 0000000000004000 R08: 0000000000000000 R09: 0000000000000000
Sep 18 01:59:19 paradies kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000004000
Sep 18 01:59:19 paradies kernel: R13: 0000000000004000 R14: 0000000000000000 R15: 0000000000000020
Sep 18 01:59:19 paradies kernel: FS: 0000000000000000(0000) GS:ffff9e773fc80000(0000) knlGS:0000000000000000
Sep 18 01:59:19 paradies kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 18 01:59:19 paradies kernel: CR2: 0000000000000000 CR3: 0000000274978004 CR4: 00000000000206e0
Sep 18 01:59:19 paradies kernel: Call Trace:
Sep 18 01:59:19 paradies kernel: abd_borrow_buf+0x12/0x40 [zfs]
Sep 18 01:59:19 paradies kernel: abd_borrow_buf_copy+0x29/0x80 [zfs]
Sep 18 01:59:19 paradies kernel: zio_crypt_copy_dnode_bonus+0x2e/0x120 [zfs]
Sep 18 01:59:19 paradies kernel: arc_buf_fill+0x3f9/0xcf0 [zfs]
Sep 18 01:59:19 paradies kernel: arc_untransform+0x1d/0x80 [zfs]
Sep 18 01:59:19 paradies kernel: dbuf_read_verify_dnode_crypt+0xee/0x160 [zfs]
Sep 18 01:59:19 paradies kernel: ? __cv_init+0x3d/0x60 [spl]
Sep 18 01:59:19 paradies kernel: dbuf_read_impl.constprop.0+0x2ac/0x6b0 [zfs]
Sep 18 01:59:19 paradies kernel: ? dbuf_create+0x432/0x5d0 [zfs]
Sep 18 01:59:19 paradies kernel: ? dbuf_find+0x1a1/0x200 [zfs]
Sep 18 01:59:19 paradies kernel: dbuf_read+0xe2/0x570 [zfs]
Sep 18 01:59:19 paradies kernel: dmu_tx_check_ioerr+0x64/0xd0 [zfs]
Sep 18 01:59:19 paradies kernel: dmu_tx_hold_free_impl+0x12f/0x240 [zfs]
Sep 18 01:59:19 paradies kernel: dmu_free_long_range+0x23e/0x4c0 [zfs]
Sep 18 01:59:19 paradies kernel: dmu_free_long_object+0x22/0xc0 [zfs]
Sep 18 01:59:19 paradies kernel: receive_freeobjects+0x82/0xf0 [zfs]
Sep 18 01:59:19 paradies kernel: receive_writer_thread+0x56f/0x9d0 [zfs]
Sep 18 01:59:19 paradies kernel: ? set_user_nice.part.0+0x141/0x240
Sep 18 01:59:19 paradies kernel: ? redact_check.isra.0+0x1d0/0x1d0 [zfs]
Sep 18 01:59:19 paradies kernel: ? thread_generic_wrapper+0x6f/0x80 [spl]
Sep 18 01:59:19 paradies kernel: thread_generic_wrapper+0x6f/0x80 [spl]
Sep 18 01:59:19 paradies kernel: ? __thread_exit+0x20/0x20 [spl]
Sep 18 01:59:19 paradies kernel: kthread+0x11b/0x140
Sep 18 01:59:19 paradies kernel: ? __kthread_bind_mask+0x60/0x60
Sep 18 01:59:19 paradies kernel: ret_from_fork+0x22/0x30
Sep 18 01:59:19 paradies kernel: Modules linked in: af_packet ch st iTCO_wdt intel_pmc_bxt watchdog ip6table_nat iptable_nat nf_nat xt_conntrack nf_conntrack intel_powerclamp nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c coretemp ip6t_rpfilter intel_cstate ipt_rpfilter intel_uncore ip6table_raw iptable_raw xt_pkttype nf_log_ipv6 input_leds led_class mousedev joydev evdev nf_log_ipv4 uas mac_hid nf_log_common xt_LOG cdc_ether xt_tcpudp ata_generic lpc_ich i2c_i801 usbnet pata_acpi i2c_smbus mgag200 mii drm_kms_helper ip6table_filter i2c_algo_bit fb_sys_fops qla2xxx syscopyarea ip6_tables sysfillrect sysimgblt iptable_filter sch_fq_codel nvme_fc nvme_fabrics nvme_core e1000e scsi_transport_fc ptp ioatdma atkbd pps_core libps2 i7core_edac i5500_temp dca serio edac_core loop ipmi_ssif ses enclosure zfs(PO) bnx2 tpm_tis tpm_tis_core ipmi_si ipmi_devintf zunicode(PO) tpm tiny_power_button zzstd(O) rng_core ipmi_msghandler button acpi_cpufreq zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) bonding tun tap macvlan
Sep 18 01:59:19 paradies kernel: bridge stp llc kvm_intel kvm drm irqbypass agpgart backlight i2c_core sg fuse configfs pstore ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid usb_storage sr_mod cdrom sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common ata_piix libata uhci_hcd mptsas ehci_pci ehci_hcd scsi_transport_sas megaraid_sas mptscsih mptbase usbcore crc32c_intel scsi_mod usb_common rtc_cmos dm_mod
Sep 18 01:59:19 paradies kernel: CR2: 0000000000000000
Sep 18 01:59:19 paradies kernel: ---[ end trace cd8c11f85ded0f4c ]---
CPU: Intel(R) Xeon(R) CPU L5640 @ 2.27GHz
/sys/module/icp/parameters/icp_aes_impl is: cycle [fastest] generic x86_64
/sys/module/icp/parameters/icp_gcm_impl is: cycle [fastest] generic
ssh -n remote zfs send -cL -I @snap1 remote/dataset@snap2 | zfs recv -F localpool/dataset
what to do?
My advice at this point is "don't use ZFS encryption until this is fixed".
Is anyone experiencing this issue also having trouble with the "stability" of their hard disks? The one server giving me the most trouble has somewhat old SATA enterprise drives (it's a rented server I cannot control 100%). The drives occasionally hiccup during intense usage (say a scrub, or huge I/O with seeks while ZFS is listing snapshots, etc.). On occasion they delay their response and also report "unaligned write" errors. I'm quite sure this causes issues in the ZFS driver when the drive cannot honor a request within some time. On the same drives, I had 2 or 3 sectors that could not be read, but that would read properly after being rewritten a few times (10-20 overwrites). When hitting such a sector I'd expect the drive to give up quickly (it's an enterprise HGST drive after all), but perhaps 1-2 seconds is enough?
My gut feeling is there's some internal "queue" of operations which might get full because the drive is not responding.
At least for the main issue this bug is about, the issue is that it screws up refcounting a buffer and frees it prematurely, then reuses it, and then someone who had a reference frees it again, and the person left with a reference has a bad time on access, I believe.
@pquan The motherboard (with the integrated Pentium J2900) died at the beginning of august. The disks were swapped to a new machine (Dell R420 / Xeon E5-2430L) and haven't had an issue ever since (before that it was a bi-daily / weekly issue). So in my case i guess it is not related to the disks' health / performance.
I think I see where the leak is: say https://github.com/openzfs/zfs/blob/139690d6c3c748b138525784dd6f0aa48cfcda1a/module/zfs/arc.c#L5742 is executed but returns an error. An add_reference() is performed, but when we reach https://github.com/openzfs/zfs/blob/139690d6c3c748b138525784dd6f0aa48cfcda1a/module/zfs/arc.c#L5845 that reference has not been cleared, presumably leading to the panic.
Could someone test (not in production) with:
(void) remove_reference(hdr, hash_lock, acb->acb_private)
right before https://github.com/openzfs/zfs/blob/139690d6c3c748b138525784dd6f0aa48cfcda1a/module/zfs/arc.c#L5784
@pquan The motherboard (with the integrated Pentium J2900) died at the beginning of august. The disks were swapped to a new machine (Dell R420 / Xeon E5-2430L) and haven't had an issue ever since (before that it was a bi-daily / weekly issue). So in my case i guess it is not related to the disks' health / performance.
Thanks for reporting. You moved from a CPU with a PassMark score of 1236 and no AES-NI to a CPU with a PassMark score of 5273 and the AES-NI instructions. That's a 400% increase, and more than that for encryption. Not exactly the same.
I still maintain that this is a "bottleneck" issue where the problem occurs under heavy load and/or resource starvation. I have no problems on any of my systems as long as there is only light/normal usage.
At the moment I'm limiting encryption to servers whose only job is receiving backups. I make sure I do not scrub during zfs recv. I also leave all the RAM to the ZFS ARC and do not limit its usage to 2-4 GB (as I normally do on a multi-purpose server).
It's been OK for the last weeks.
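For reference, the ARC cap mentioned above is a module parameter; a sketch of what that limit looks like in a modprobe config (the 4 GiB value is illustrative, matching the 2-4 GB range mentioned):

```shell
# /etc/modprobe.d/zfs.conf -- cap the ARC (value in bytes; 4 GiB shown).
# Omit this line entirely to let the ARC grow to its default maximum,
# which is what the comment above does on backup-only servers.
options zfs zfs_arc_max=4294967296
```

On a running system the same knob can be poked via /sys/module/zfs/parameters/zfs_arc_max, though the ARC only shrinks lazily after a cut.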
@pquan I meant the disks' health / the disks' performance. (The system is dual socket, so the computing performance is even better than that.)
@cserem Ok, got it now. :)
I think I see where the leak is: say https://github.com/openzfs/zfs/blob/139690d6c3c748b138525784dd6f0aa48cfcda1a/module/zfs/arc.c#L5742 is executed but returns an error. An add_reference() is performed, but when we reach https://github.com/openzfs/zfs/blob/139690d6c3c748b138525784dd6f0aa48cfcda1a/module/zfs/arc.c#L5845 that reference has not been cleared, presumably leading to the panic. Could someone test (not in production) with:
(void) remove_reference(hdr, hash_lock, acb->acb_private)
right before https://github.com/openzfs/zfs/blob/139690d6c3c748b138525784dd6f0aa48cfcda1a/module/zfs/arc.c#L5784
Sadly, no, crashes the same way with that patch.
I would expect it to be a missing add_reference or spurious remove_reference, though, not a missing remove_reference, unless I very much am misunderstanding the failure mode.
A user had previously reported this same issue in Ubuntu at https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1900889 "While zfs send'ing from Bionic to Focal, my send/recv hung midway and I found this in the receiver's dmesg. The receiving side uses ZFS native encryption and had the key manually loaded before sending/receiving. The sending side is unencrypted."
It was possibly fixed in an Ubuntu-only patch here where we set arc_fill_flags_t flags = ARC_FILL_LOCKED at the start of arc_untransform: https://git.launchpad.net/ubuntu/+source/zfs-linux/tree/debian/patches/4701-enable-ARC-FILL-LOCKED-flag.patch?h=import/0.8.4-1ubuntu15
This patch was reported in the original bug as possibly resolving the issue (not confirmed 100%, since the cause was not 100% reliable, but it stopped re-occurring). However, that same patch is also suspected of causing random ZFS corruption when using encrypted datasets, and is likely to be reverted in https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476 (the same issue discussed in #10971).
So clearly this is not a correct fix generally but I thought it was interesting to note in case this gives some clues as to a possible cause for this as yet unresolved issue. Similar issues seem to be discussed in #12014. Also interestingly the code for this ARC_FILL_IN_PLACE handling was added to fix a similar sounding issue "Raw receive fix and encrypted objset security fix" in #7632 which first shipped in zfs 0.8.0.
Avoid any races by enabling ARC_FILL_LOCKED during in the
arc_buf_fill call chain
Author: unknown
Origin: ubuntu (LP: #1900889)
Forwarded: no
Last-Update: 2020-10-22
Index: zfs-linux-0.8.3/module/zfs/arc.c
===================================================================
--- zfs-linux-0.8.3.orig/module/zfs/arc.c
+++ zfs-linux-0.8.3/module/zfs/arc.c
@@ -2305,7 +2305,7 @@ arc_untransform(arc_buf_t *buf, spa_t *s
boolean_t in_place)
{
int ret;
- arc_fill_flags_t flags = 0;
+ arc_fill_flags_t flags = ARC_FILL_LOCKED;
if (in_place)
flags |= ARC_FILL_IN_PLACE;
I believe I tried that patch somewhere in this long cascade of debugging, and found that at least for my reproduction, it did not save me.
Now, I have very tentative reasons to wonder if this might be helpful, but anyone who's getting burned by this, could you try shoving this into a modprobe config file, e.g. /etc/modprobe.d/zfs.conf, regenerating the initrd, and rebooting, and then seeing if this still happens to you with the same frequency, less, or more? (Note that if you're on a non-x86 platform, this value is intended to be PAGESIZE+1.)
options zfs zfs_abd_scatter_min_size=4097
(Technically, you don't have to reboot to change the parameter, I just want to avoid the chance that any existing allocations might be below that threshold and confuse the data...so if you don't reboot and try this, please note that you did that in telling me this helped or not.)
edit: Rats, it still eventually reproduced on my testbed with this setting. Oh well, back to the mines.
So, for anyone feeling brave, I have a patch.
It's not perfect - I won't be opening a PR for this, because when I turn on all my debug tooling at once (which is not in this tree, there's a bunch of custom nonsense that will run one place ever and break everywhere else) and run it, it still errors out sometimes on raw receives...but the emphasis there is errors; it appears to just be on the mount step at the very end and retriable, not VERIFY failure, NULL dereference, or panic. (Curiously, I haven't been able to make it error on my testbeds without running the kitchen sink of sketchy bespoke additional tooling...)
It's also heavy-handed - just taking the mutex on each call is a big hammer, and I bet benchmarking this diff for performance would be sad indeed.
As to an explanation...
The reproducers I've got resulted in a situation where you ended up with one thread holding a reference to an arc_buf_t doing dbuf_sync_leaf() -> dbuf_write() -> arc_write(), which helpfully NULLs the b_pabd and b_rabd of the arc_buf_t on completion, only for another thread to pass it in ~simultaneously to dbuf_read() -> ... -> arc_untransform(), and then at any of various points once you're inside dbuf_read() you might notice that your buffer has no buffers, and boom goes the dynamite.
I expect the correct fix is for somebody earlier in the pipeline to copy the dbuf instead of passing the singular copy around, but I haven't run down where that should happen yet.
I expect taking the mutex there to preclude both dbuf_write() and dbuf_read() sticking their fingers in at the same time, but if it still results in some failures with all the debug tooling on, I've still got more to dig into.
Nevertheless, assuming the CI doesn't spit out horrible warnings of death and doom, I thought some people might find this disruptive enough to want to try it - the linked patch is against 2.1.2, but I don't think the relevant code has churned much in years, so it should apply with some fuzz to most things.
(As for how confident I am about the explanation... I added tooling to all the calls that alloc/free the ABDs and alloc/free/mutate arc_buf_hdr_ts, which captured the stack traces each time one of those happened, and I've got some nice ones of the header that turns up with a NULL inappropriately in dbuf_read() -> ... -> arc_untransform() -> arc_buf_fill(), where the immediately prior thing was a thread doing dbuf_write() -> arc_write(), including arc_hdr_free_abd() inside. So unless there's code I didn't notice to handle NULLs appearing at points where they were not before inside dbuf_read(), here we are.)
edit: ...lovely. After letting it run for over a day on one of my testbeds with all the debugging hooks enabled, it VERIFY failed again, and the stacktrace, if it is to be believed, involved unlocking the mutex on the way through.
While this is still certainly useful given how hard I seem to have to work to reproduce it now, I'm going to have to come up with a better solution...
e: I should update this a bit, since I dug into it - it seems like this is happening (at least on every panic I hit) to a dbuf with a dnode of DMU_META_DNODE_OBJECT, and that + DNODE in general, if I'm reading this right, gets special-cased in a bunch of places that would normally copy it, to instead try to do in-place updates.
But I could not readily find a mutex I could take or add to preclude this freeing - when I finally thought I had one, locking in the dnode, I found that sometimes I was racing against something destroying the dnode, and then I decided to go try and figure out why this only happened with encrypted recv.
Bonus: In case anyone was wondering if you could repro this on FBSD...
KDB: stack backtrace:
#0 0xffffffff80c574c5 at kdb_backtrace+0x65
#1 0xffffffff80c09ea1 at vpanic+0x181
#2 0xffffffff8242c52a at spl_panic+0x3a
#3 0xffffffff8246dca8 at arc_buf_fill+0x1b98
#4 0xffffffff8246c0a7 at arc_untransform+0x157
#5 0xffffffff824937b6 at dbuf_read_verify_dnode_crypt+0xd6
#6 0xffffffff824929ee at dbuf_read+0x7ee
#7 0xffffffff824a6c96 at dmu_buf_hold+0x46
#8 0xffffffff825f0981 at zap_lockdir+0x31
#9 0xffffffff825f219c at zap_lookup_norm+0x3c
#10 0xffffffff825f2151 at zap_lookup+0x11
#11 0xffffffff8252913a at sa_setup+0x19a
#12 0xffffffff824419b0 at zfsvfs_init+0x560
#13 0xffffffff824413fd at zfsvfs_create_impl+0x12d
#14 0xffffffff824412bc at zfsvfs_create+0xbc
#15 0xffffffff8243fc2a at zfs_mount+0x32a
#16 0xffffffff80cda9b9 at vfs_domount+0x5e9
#17 0xffffffff80cd9bd7 at vfs_donmount+0x8e7
Uptime: 5h30m46s
Dumping 1547 out of 24542 MB:..2%..11%..21%..32%..41%..51%..61%..71%..81%..91%
__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55 /usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xffffffff80c09a96 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:486
#3 0xffffffff80c09f10 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:919
#4 0xffffffff8242c52a in spl_panic (file=<optimized out>,
func=<optimized out>, line=<unavailable>, fmt=<unavailable>)
at os/freebsd/spl/spl_misc.c:107
#5 0xffffffff8246dca8 in arc_hdr_decrypt (hdr=0xfffff8020892db60,
It's me again.
I've got another prospective patch to make life nicer for anyone bitten by this: #12943. So you can look into that if this is biting you, and kick me if it breaks all the windows in your house somehow. :P
e: One minor footnote. In my experimenting, it only ever panicked on trying to mount the dataset after/while doing the receive. So you might want to try receiving with -u into an unmounted dataset, if my PR doesn't help/you don't want to try it right now.
Hi, our system was regularly crashing every two days, so I tried applying your patch to 2.0.6 and running with that. I'm afraid to report it didn't last much longer:
Jan 17 08:03:16 fs3 kernel: INFO: task txg_quiesce:8625 blocked for more than 120 seconds.
Jan 17 08:03:16 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 17 08:03:16 fs3 kernel: txg_quiesce D ffff93067a5e47e0 0 8625 2 0x00000000
Jan 17 08:03:16 fs3 kernel: Call Trace:
Jan 17 08:03:16 fs3 kernel: [
09:27:23 up 3 days, 23:36, 1 user, load average: 26.09, 25.35, 22.27
I still have a z_wr_iss process running, but it doesn't look like it's achieving much, as the system load doesn't match the CPU usage of this process. I had already tried running with all my receive targets unmounted (as I was seeing some issues with 'busy datasets' when they were mounted, post your patch).
I was going to say it's not panicked, but the backtrace there says it's trying to go into spl_panic - do you have an actual message from the panic? It'll be hard to refine the patch if I don't know what's gone wrong, and I've not caused it to fail in many runs at this point.
e: Also, can you tell me some things about the system - what's the CPU/RAM/IO throughput look like, what are the pool layout(s), what's the workload you do on it mostly like?
e2: Can you also show me the diff you ended up with against stock 2.0.6? I'd hate to go down a rabbit hole only to discover it was some complication present in 2.0.6 already fixed in master...
Hi, Another update before I answer your questions. I tried rebooting before seeing your message and as always the system hung. After forced reboot the pool imported as normal and I requested a scrub as further data errors were being listed. I came back to the system a few minutes ago to find it panic'd again:
[ 1827.039227] VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
[ 1827.039280] PANIC at dmu_recv.c:1811:receive_object()
[ 1827.039308] Showing stack for process 60708
[ 1827.039313] CPU: 11 PID: 60708 Comm: receive_writer Kdump: loaded Tainted: P OE ------------ 3.10.0-1160.42.2.el7.x86_64 #1
[ 1827.039315] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
[ 1827.039317] Call Trace:
[ 1827.039326] [<ffffffff94183539>] dump_stack+0x19/0x1b
[ 1827.039349] [<ffffffffc035fc5b>] spl_dumpstack+0x2b/0x30 [spl]
[ 1827.039356] [<ffffffffc035fd29>] spl_panic+0xc9/0x110 [spl]
[ 1827.039393] [<ffffffffc0b2d76e>] ? arc_space_consume+0x5e/0x110 [zfs]
[ 1827.039397] [<ffffffff94188632>] ? down_read+0x12/0x40
[ 1827.039429] [<ffffffffc0b4641f>] ? dbuf_read+0xff/0x570 [zfs]
[ 1827.039464] [<ffffffffc0b4d6fe>] ? dmu_bonus_hold_by_dnode+0xde/0x1b0 [zfs]
[ 1827.039472] [<ffffffffc03616b5>] ? spl_kmem_cache_free+0x185/0x210 [spl]
[ 1827.039507] [<ffffffffc0b5a3ea>] receive_object+0x64a/0xc80 [zfs]
[ 1827.039516] [<ffffffffc03607b5>] ? spl_kmem_free+0x35/0x40 [spl]
[ 1827.039520] [<ffffffff93da67fd>] ? list_del+0xd/0x30
[ 1827.039555] [<ffffffffc0b5da14>] receive_writer_thread+0x4e4/0xa30 [zfs]
[ 1827.039559] [<ffffffff93ade305>] ? sched_clock_cpu+0x85/0xc0
[ 1827.039594] [<ffffffffc0b5d530>] ? receive_process_write_record+0x180/0x180 [zfs]
[ 1827.039646] [<ffffffffc0b5d530>] ? receive_process_write_record+0x180/0x180 [zfs]
[ 1827.039657] [<ffffffffc0366873>] thread_generic_wrapper+0x73/0x80 [spl]
[ 1827.039665] [<ffffffffc0366800>] ? __thread_exit+0x20/0x20 [spl]
[ 1827.039669] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 1827.039673] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 1827.039678] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 1827.039681] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.302850] INFO: task z_upgrade:4952 blocked for more than 120 seconds.
[ 2041.302923] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.302999] z_upgrade D ffff9f21aae0d860 0 4952 2 0x00000000
[ 2041.303010] Call Trace:
[ 2041.303027] [<ffffffff94189179>] schedule+0x29/0x70
[ 2041.303037] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2041.303047] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2041.303058] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2041.303066] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2041.303108] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2041.303117] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2041.303137] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2041.303286] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2041.303434] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2041.303533] [<ffffffffc0b57ceb>] dmu_objset_id_quota_upgrade_cb+0x13b/0x170 [zfs]
[ 2041.303628] [<ffffffffc0b53d48>] dmu_objset_upgrade_task_cb+0x88/0xf0 [zfs]
[ 2041.303653] [<ffffffffc03654f6>] taskq_thread+0x2c6/0x520 [spl]
[ 2041.303665] [<ffffffff93adadf0>] ? wake_up_state+0x20/0x20
[ 2041.303689] [<ffffffffc0365230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 2041.303724] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2041.303734] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.303746] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2041.303754] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.303771] INFO: task txg_quiesce:15697 blocked for more than 120 seconds.
[ 2041.303839] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.303913] txg_quiesce D ffff9f2177b3e8e0 0 15697 2 0x00000000
[ 2041.303921] Call Trace:
[ 2041.303933] [<ffffffff94189179>] schedule+0x29/0x70
[ 2041.303952] [<ffffffffc035f325>] cv_wait_common+0x125/0x150 [spl]
[ 2041.303961] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2041.303980] [<ffffffffc035f365>] __cv_wait+0x15/0x20 [spl]
[ 2041.304117] [<ffffffffc0bdda4b>] txg_quiesce_thread+0x2db/0x3e0 [zfs]
[ 2041.304254] [<ffffffffc0bdd770>] ? txg_init+0x2b0/0x2b0 [zfs]
[ 2041.304275] [<ffffffffc0366873>] thread_generic_wrapper+0x73/0x80 [spl]
[ 2041.304297] [<ffffffffc0366800>] ? __thread_exit+0x20/0x20 [spl]
[ 2041.304304] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2041.304313] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.304322] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2041.304329] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.304346] INFO: task z_upgrade:23313 blocked for more than 120 seconds.
[ 2041.304403] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.304468] z_upgrade D ffff9f2199f2a6e0 0 23313 2 0x00000080
[ 2041.304475] Call Trace:
[ 2041.304485] [<ffffffff94189179>] schedule+0x29/0x70
[ 2041.304492] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2041.304509] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2041.304518] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2041.304525] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2041.304542] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2041.304550] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2041.304570] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2041.304688] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2041.304812] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2041.304900] [<ffffffffc0b57ceb>] dmu_objset_id_quota_upgrade_cb+0x13b/0x170 [zfs]
[ 2041.304982] [<ffffffffc0b53d48>] dmu_objset_upgrade_task_cb+0x88/0xf0 [zfs]
[ 2041.305005] [<ffffffffc03654f6>] taskq_thread+0x2c6/0x520 [spl]
[ 2041.305014] [<ffffffff93adadf0>] ? wake_up_state+0x20/0x20
[ 2041.305035] [<ffffffffc0365230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 2041.305043] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2041.305051] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.305060] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2041.305068] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.305078] INFO: task z_upgrade:23403 blocked for more than 120 seconds.
[ 2041.305139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.305209] z_upgrade D ffff9f216c601660 0 23403 2 0x00000080
[ 2041.305215] Call Trace:
[ 2041.305224] [<ffffffff94189179>] schedule+0x29/0x70
[ 2041.305231] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2041.305240] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2041.305248] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2041.305258] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2041.305274] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2041.305282] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2041.305298] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2041.305415] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2041.305533] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2041.305617] [<ffffffffc0b57ceb>] dmu_objset_id_quota_upgrade_cb+0x13b/0x170 [zfs]
[ 2041.305701] [<ffffffffc0b53d48>] dmu_objset_upgrade_task_cb+0x88/0xf0 [zfs]
[ 2041.305732] [<ffffffffc03654f6>] taskq_thread+0x2c6/0x520 [spl]
[ 2041.305740] [<ffffffff93adadf0>] ? wake_up_state+0x20/0x20
[ 2041.305761] [<ffffffffc0365230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 2041.305768] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2041.305776] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.305785] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2041.305794] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.305810] INFO: task receive_writer:60708 blocked for more than 120 seconds.
[ 2041.305871] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.305935] receive_writer D ffff9f214893a100 0 60708 2 0x00000080
[ 2041.305942] Call Trace:
[ 2041.305959] [<ffffffff94189179>] schedule+0x29/0x70
[ 2041.305976] [<ffffffffc035fd55>] spl_panic+0xf5/0x110 [spl]
[ 2041.306048] [<ffffffffc0b2d76e>] ? arc_space_consume+0x5e/0x110 [zfs]
[ 2041.306056] [<ffffffff94188632>] ? down_read+0x12/0x40
[ 2041.306133] [<ffffffffc0b4641f>] ? dbuf_read+0xff/0x570 [zfs]
[ 2041.306216] [<ffffffffc0b4d6fe>] ? dmu_bonus_hold_by_dnode+0xde/0x1b0 [zfs]
[ 2041.306235] [<ffffffffc03616b5>] ? spl_kmem_cache_free+0x185/0x210 [spl]
[ 2041.306320] [<ffffffffc0b5a3ea>] receive_object+0x64a/0xc80 [zfs]
[ 2041.306341] [<ffffffffc03607b5>] ? spl_kmem_free+0x35/0x40 [spl]
[ 2041.306350] [<ffffffff93da67fd>] ? list_del+0xd/0x30
[ 2041.306431] [<ffffffffc0b5da14>] receive_writer_thread+0x4e4/0xa30 [zfs]
[ 2041.306440] [<ffffffff93ade305>] ? sched_clock_cpu+0x85/0xc0
[ 2041.306521] [<ffffffffc0b5d530>] ? receive_process_write_record+0x180/0x180 [zfs]
[ 2041.306601] [<ffffffffc0b5d530>] ? receive_process_write_record+0x180/0x180 [zfs]
[ 2041.306622] [<ffffffffc0366873>] thread_generic_wrapper+0x73/0x80 [spl]
[ 2041.306644] [<ffffffffc0366800>] ? __thread_exit+0x20/0x20 [spl]
[ 2041.306651] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2041.306659] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.306667] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2041.306675] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2041.306696] INFO: task zfs:179453 blocked for more than 120 seconds.
[ 2041.306750] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.306815] zfs D ffff9f21b87b1660 0 179453 179447 0x00000080
[ 2041.306821] Call Trace:
[ 2041.306830] [<ffffffff94189179>] schedule+0x29/0x70
[ 2041.306837] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2041.306845] [<ffffffff94188556>] ? schedule_hrtimeout_range_clock+0xc6/0x150
[ 2041.306851] [<ffffffff93ac9aa0>] ? hrtimer_get_res+0x50/0x50
[ 2041.306861] [<ffffffff93b06992>] ? ktime_get_ts64+0x52/0xf0
[ 2041.306869] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2041.306879] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2041.306886] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2041.306903] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2041.306911] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2041.306927] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2041.307044] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2041.307151] [<ffffffffc0ba0e50>] ? dsl_dataset_user_hold_check_one+0x170/0x170 [zfs]
[ 2041.307269] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2041.307374] [<ffffffffc0b9f94c>] dsl_sync_task_common+0xdc/0x2b0 [zfs]
[ 2041.307475] [<ffffffffc0ba0770>] ? dsl_onexit_hold_cleanup+0xd0/0xd0 [zfs]
[ 2041.307576] [<ffffffffc0ba0e50>] ? dsl_dataset_user_hold_check_one+0x170/0x170 [zfs]
[ 2041.307671] [<ffffffffc0ba0770>] ? dsl_onexit_hold_cleanup+0xd0/0xd0 [zfs]
[ 2041.307775] [<ffffffffc0b9fb46>] dsl_sync_task+0x26/0x30 [zfs]
[ 2041.307871] [<ffffffffc0ba1234>] dsl_dataset_user_hold+0x94/0xf0 [zfs]
[ 2041.307997] [<ffffffffc0c1ecb1>] zfs_ioc_hold+0xc1/0x1b0 [zfs]
[ 2041.308121] [<ffffffffc0c2653b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
[ 2041.308247] [<ffffffffc0c52106>] zfsdev_ioctl+0x56/0xf0 [zfs]
[ 2041.308256] [<ffffffff93c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
[ 2041.308266] [<ffffffff94190678>] ? __do_page_fault+0x238/0x500
[ 2041.308273] [<ffffffff93c63d61>] SyS_ioctl+0xa1/0xc0
[ 2041.308282] [<ffffffff94195f92>] system_call_fastpath+0x25/0x2a
[ 2161.309416] INFO: task z_upgrade:4952 blocked for more than 120 seconds.
[ 2161.309489] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2161.309565] z_upgrade D ffff9f21aae0d860 0 4952 2 0x00000000
[ 2161.309575] Call Trace:
[ 2161.309593] [<ffffffff94189179>] schedule+0x29/0x70
[ 2161.309602] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2161.309613] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2161.309624] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2161.309632] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2161.309674] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2161.309683] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2161.309703] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2161.309849] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2161.309998] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2161.310097] [<ffffffffc0b57ceb>] dmu_objset_id_quota_upgrade_cb+0x13b/0x170 [zfs]
[ 2161.310191] [<ffffffffc0b53d48>] dmu_objset_upgrade_task_cb+0x88/0xf0 [zfs]
[ 2161.310217] [<ffffffffc03654f6>] taskq_thread+0x2c6/0x520 [spl]
[ 2161.310229] [<ffffffff93adadf0>] ? wake_up_state+0x20/0x20
[ 2161.310253] [<ffffffffc0365230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 2161.310282] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2161.310292] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.310304] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2161.310312] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.310329] INFO: task txg_quiesce:15697 blocked for more than 120 seconds.
[ 2161.310397] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2161.310471] txg_quiesce D ffff9f2177b3e8e0 0 15697 2 0x00000000
[ 2161.310480] Call Trace:
[ 2161.310492] [<ffffffff94189179>] schedule+0x29/0x70
[ 2161.310511] [<ffffffffc035f325>] cv_wait_common+0x125/0x150 [spl]
[ 2161.310520] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2161.310548] [<ffffffffc035f365>] __cv_wait+0x15/0x20 [spl]
[ 2161.310683] [<ffffffffc0bdda4b>] txg_quiesce_thread+0x2db/0x3e0 [zfs]
[ 2161.310818] [<ffffffffc0bdd770>] ? txg_init+0x2b0/0x2b0 [zfs]
[ 2161.310844] [<ffffffffc0366873>] thread_generic_wrapper+0x73/0x80 [spl]
[ 2161.310865] [<ffffffffc0366800>] ? __thread_exit+0x20/0x20 [spl]
[ 2161.310873] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2161.310882] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.310892] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2161.310901] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.310917] INFO: task z_upgrade:23313 blocked for more than 120 seconds.
[ 2161.310986] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2161.311064] z_upgrade D ffff9f2199f2a6e0 0 23313 2 0x00000080
[ 2161.311072] Call Trace:
[ 2161.311082] [<ffffffff94189179>] schedule+0x29/0x70
[ 2161.311091] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2161.311103] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2161.311112] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2161.311120] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2161.311139] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2161.311148] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2161.311167] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2161.311302] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2161.311443] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2161.311540] [<ffffffffc0b57ceb>] dmu_objset_id_quota_upgrade_cb+0x13b/0x170 [zfs]
[ 2161.311633] [<ffffffffc0b53d48>] dmu_objset_upgrade_task_cb+0x88/0xf0 [zfs]
[ 2161.311658] [<ffffffffc03654f6>] taskq_thread+0x2c6/0x520 [spl]
[ 2161.311669] [<ffffffff93adadf0>] ? wake_up_state+0x20/0x20
[ 2161.311692] [<ffffffffc0365230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 2161.311700] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2161.311709] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.311722] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2161.311730] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.311738] INFO: task z_upgrade:23403 blocked for more than 120 seconds.
[ 2161.311803] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2161.311876] z_upgrade D ffff9f216c601660 0 23403 2 0x00000080
[ 2161.311883] Call Trace:
[ 2161.311893] [<ffffffff94189179>] schedule+0x29/0x70
[ 2161.311901] [<ffffffff94186e41>] schedule_timeout+0x221/0x2d0
[ 2161.311911] [<ffffffff94188a2d>] io_schedule_timeout+0xad/0x130
[ 2161.311920] [<ffffffff93ac6ad6>] ? prepare_to_wait_exclusive+0x56/0x90
[ 2161.311928] [<ffffffff94188ac8>] io_schedule+0x18/0x20
[ 2161.311949] [<ffffffffc035f2b2>] cv_wait_common+0xb2/0x150 [spl]
[ 2161.311958] [<ffffffff93ac6f50>] ? wake_up_atomic_t+0x30/0x30
[ 2161.311976] [<ffffffffc035f388>] __cv_wait_io+0x18/0x20 [spl]
[ 2161.312109] [<ffffffffc0bdd265>] txg_wait_synced_impl+0xe5/0x130 [zfs]
[ 2161.312242] [<ffffffffc0bdd2c0>] txg_wait_synced+0x10/0x40 [zfs]
[ 2161.312346] [<ffffffffc0b57ceb>] dmu_objset_id_quota_upgrade_cb+0x13b/0x170 [zfs]
[ 2161.312440] [<ffffffffc0b53d48>] dmu_objset_upgrade_task_cb+0x88/0xf0 [zfs]
[ 2161.312465] [<ffffffffc03654f6>] taskq_thread+0x2c6/0x520 [spl]
[ 2161.312474] [<ffffffff93adadf0>] ? wake_up_state+0x20/0x20
[ 2161.312498] [<ffffffffc0365230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 2161.312506] [<ffffffff93ac5e61>] kthread+0xd1/0xe0
[ 2161.312515] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
[ 2161.312525] [<ffffffff94195ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 2161.312535] [<ffffffff93ac5d90>] ? insert_kthread_work+0x40/0x40
Despite the above, zpool status claims the scrub is continuing, but I see no workers in top that could be processing it. There are several z_upgrade tasks listed, but these are all in the 'D' state. I shall reboot again.
RE my system. It's a relatively low-powered dual-socket system, and under normal operation I would expect to see a load of around 50% most of the time. It only has 64GB of RAM (pool size of 432TB, made of 5x12-disk raidz2). It sits as the middle node of our backup chain, mostly pulling receives from production file servers, but also accepting pushed receives from one server, rsync backups from non-ZFS clients, and a single-node Minio providing an S3 backup target for a service. A second ZFS backup host pulls snapshots from this device to provide additional offsite data protection. Sanoid is used to manage snapshot retention (and take snapshots of the non-ZFS backups), and syncoid pulls/sends data (mostly recursively from a common root). Syncoid originally ran hourly, 24/7, across the ca. 10 recursive filesystem groups. We've reduced this to every two hours and ensured a five-minute spread between filesystems to try to reduce the number of concurrent receives. Encryption is used for all filesystem roots within the pool (save the top level and a reservation space). The server has 2x10Gbit uplinks, but we rarely see transfer rates much past 1Gbit.
As to the patch, I did a git apply of the raw version of the patch on GitHub; it didn't report any issues, so I believe it applied cleanly. Here's a git diff of arc.c:
index e70b37d..1ff1116 100644
--- a/module/zfs/arc.c
+++ b/module/zfs/arc.c
@@ -2000,6 +2000,7 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
arc_fill_flags_t flags)
{
int error = 0;
+ boolean_t big_mutex = B_FALSE;
arc_buf_hdr_t *hdr = buf->b_hdr;
boolean_t hdr_compressed =
(arc_hdr_get_compress(hdr) != ZIO_COMPRESS_OFF);
@@ -2008,6 +2009,27 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
dmu_object_byteswap_t bswap = hdr->b_l1hdr.b_byteswap;
kmutex_t *hash_lock = (flags & ARC_FILL_LOCKED) ? NULL : HDR_LOCK(hdr);
+ /*
+ * If we're playing games with in-place fill, we need to be
+ * certain nobody's going to temporarily free and reallocate
+ * our abds, which can happen if e.g. someone enters dbuf_write after
+ * we start dbuf_read->...->arc_buf_fill but before we finish.
+ *
+ * This avoids that terrible fate, because dbuf_write's arc_release
+ * takes this lock before freeing anybody, and dbuf_write always does
+ * dbuf_release_bp->arc_release first in this case.
+ *
+ * hash_lock appeared to be insufficient for this purpose because it's
+ * on the hdr, not the buf, and it seems to be the case that you can
+ * wind up with multiple locations referencing the same arc_buf_t here.
+ */
+ if ((flags & ARC_FILL_IN_PLACE)) {
+ if (!MUTEX_HELD(&buf->b_evict_lock)) {
+ mutex_enter(&buf->b_evict_lock);
+ big_mutex = B_TRUE;
+ }
+ }
+
ASSERT3P(buf->b_data, !=, NULL);
IMPLY(compressed, hdr_compressed || ARC_BUF_ENCRYPTED(buf));
IMPLY(compressed, ARC_BUF_COMPRESSED(buf));
@@ -2038,6 +2060,8 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
error = arc_fill_hdr_crypt(hdr, hash_lock, spa,
zb, !!(flags & ARC_FILL_NOAUTH));
if (error == EACCES && (flags & ARC_FILL_IN_PLACE) != 0) {
+ if (big_mutex)
+ mutex_exit(&buf->b_evict_lock);
return (error);
} else if (error != 0) {
if (hash_lock != NULL)
@@ -2045,6 +2069,8 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
arc_hdr_set_flags(hdr, ARC_FLAG_IO_ERROR);
if (hash_lock != NULL)
mutex_exit(hash_lock);
+ if (big_mutex)
+ mutex_exit(&buf->b_evict_lock);
return (error);
}
}
@@ -2078,9 +2104,16 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
arc_cksum_compute(buf);
}
+ mutex_exit(&buf->b_evict_lock);
return (0);
}
+ /*
+ * If we get here, it should be logically impossible for big_mutex
+ * to be held.
+ */
+ ASSERT(big_mutex == B_FALSE);
+
if (hdr_compressed == compressed) {
if (!arc_buf_is_shared(buf)) {
abd_copy_to_buf(buf->b_data, hdr->b_l1hdr.b_pabd,
@@ -2158,6 +2191,12 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
}
byteswap:
+ /*
+ * If we get here, it should be logically impossible for big_mutex
+ * to be held, since FILL_IN_PLACE should be mutually exclusive with
+ * encrypted.
+ */
+ ASSERT(big_mutex == B_FALSE);
/* Byteswap the buf's data if necessary */
if (bswap != DMU_BSWAP_NUMFUNCS) {
ASSERT(!HDR_SHARED_DATA(hdr));
Yeah, my interpretation of those traces would be that the workers are all blocked ~forever.
As far as this panic...oof. That's neat. While I try to reproduce it, could you follow mahrens' instructions for SET_ERROR dbgmsg output here?
I'm now seeing crashes shortly after booting. The scrub is auto-resuming and z_upgrade is running along with what I assume is log replay...
top - 11:11:12 up 6 min, 2 users, load average: 46.55, 24.34, 10.16
Tasks: 717 total, 7 running, 710 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 45.0 sy, 0.0 ni, 18.0 id, 34.3 wa, 0.0 hi, 2.5 si, 0.0 st
KiB Mem : 65427352 total, 46043228 free, 19119368 used, 264756 buff/cache
KiB Swap: 16412668 total, 16412668 free, 0 used. 45814376 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1049 root 20 0 0 0 0 R 17.8 0.0 0:32.28 dbu_evict
4902 root 0 -20 0 0 0 R 15.5 0.0 0:28.28 z_rd_int
4904 root 0 -20 0 0 0 S 15.5 0.0 0:28.21 z_rd_int
4905 root 0 -20 0 0 0 S 15.5 0.0 0:28.05 z_rd_int
4906 root 0 -20 0 0 0 S 15.5 0.0 0:28.07 z_rd_int
4908 root 0 -20 0 0 0 S 15.5 0.0 0:28.17 z_rd_int
4909 root 0 -20 0 0 0 S 15.5 0.0 0:28.19 z_rd_int
4903 root 0 -20 0 0 0 R 15.2 0.0 0:28.22 z_rd_int
4907 root 0 -20 0 0 0 S 14.5 0.0 0:28.18 z_rd_int
141645 root 20 0 0 0 0 D 14.5 0.0 0:24.31 z_upgrade
4912 root 1 -19 0 0 0 S 12.9 0.0 0:21.92 z_wr_iss
4919 root 1 -19 0 0 0 S 12.9 0.0 0:21.53 z_wr_iss
4924 root 1 -19 0 0 0 S 12.9 0.0 0:21.34 z_wr_iss
4911 root 1 -19 0 0 0 S 12.5 0.0 0:20.62 z_wr_iss
4917 root 1 -19 0 0 0 S 12.5 0.0 0:21.15 z_wr_iss
4923 root 1 -19 0 0 0 S 12.2 0.0 0:20.94 z_wr_iss
4915 root 1 -19 0 0 0 S 11.9 0.0 0:21.30 z_wr_iss
4920 root 1 -19 0 0 0 S 11.9 0.0 0:21.72 z_wr_iss
4913 root 1 -19 0 0 0 S 11.6 0.0 0:21.31 z_wr_iss
4916 root 1 -19 0 0 0 S 11.6 0.0 0:20.85 z_wr_iss
4922 root 1 -19 0 0 0 S 11.6 0.0 0:21.49 z_wr_iss
141663 root 20 0 0 0 0 D 11.6 0.0 0:24.21 z_upgrade
2 root 20 0 0 0 0 S 11.2 0.0 0:17.28 kthreadd
4910 root 1 -19 0 0 0 S 11.2 0.0 0:21.03 z_wr_iss
4914 root 1 -19 0 0 0 S 11.2 0.0 0:21.54 z_wr_iss
142173 root 20 0 0 0 0 D 11.2 0.0 0:22.94 z_upgrade
4918 root 1 -19 0 0 0 S 10.9 0.0 0:21.34 z_wr_iss
141967 root 20 0 0 0 0 D 10.6 0.0 0:20.86 z_upgrade
4921 root 1 -19 0 0 0 S 10.2 0.0 0:20.91 z_wr_iss
4952 root 20 0 0 0 0 D 9.9 0.0 0:20.36 z_upgrade
142107 root 20 0 0 0 0 D 9.9 0.0 0:20.66 z_upgrade
211729 root 39 19 0 0 0 D 9.9 0.0 0:11.94 dsl_scan_iss
211727 root 39 19 0 0 0 D 8.9 0.0 0:12.45 dsl_scan_iss
211731 root 39 19 0 0 0 D 8.9 0.0 0:12.32 dsl_scan_iss
142115 root 20 0 0 0 0 D 8.6 0.0 0:24.07 z_upgrade
211728 root 39 19 0 0 0 D 8.3 0.0 0:12.15 dsl_scan_iss
142114 root 20 0 0 0 0 D 7.9 0.0 0:20.79 z_upgrade
142123 root 20 0 0 0 0 D 7.9 0.0 0:19.20 z_upgrade
141647 root 20 0 0 0 0 D 7.6 0.0 0:22.49 z_upgrade
141833 root 20 0 0 0 0 D 7.6 0.0 0:18.95 z_upgrade
141638 root 20 0 0 0 0 D 7.3 0.0 0:18.40 z_upgrade
141683 root 20 0 0 0 0 D 7.3 0.0 0:18.85 z_upgrade
141835 root 20 0 0 0 0 D 7.3 0.0 0:20.11 z_upgrade
211730 root 39 19 0 0 0 D 7.3 0.0 0:12.48 dsl_scan_iss
963 root 0 -20 0 0 0 D 6.6 0.0 0:10.72 spl_dynamic_tas
141600 root 20 0 0 0 0 D 6.6 0.0 0:20.82 z_upgrade
142146 root 20 0 0 0 0 D 6.6 0.0 0:21.27 z_upgrade
5027 root 39 19 0 0 0 S 5.6 0.0 0:21.34 dp_sync_taskq
5032 root 39 19 0 0 0 S 5.3 0.0 0:16.89 dp_sync_taskq
5036 root 39 19 0 0 0 S 5.3 0.0 0:11.91 dp_sync_taskq
141654 root 20 0 0 0 0 D 4.6 0.0 0:13.70 z_upgrade
141658 root 20 0 0 0 0 D 4.6 0.0 0:13.32 z_upgrade
4926 root 0 -20 0 0 0 S 4.3 0.0 0:07.43 z_wr_int
4930 root 0 -20 0 0 0 S 4.3 0.0 0:07.40 z_wr_int
4931 root 0 -20 0 0 0 S 4.3 0.0 0:07.25 z_wr_int
4932 root 0 -20 0 0 0 S 4.3 0.0 0:07.36 z_wr_int
5026 root 39 19 0 0 0 S 4.3 0.0 0:07.70 dp_sync_taskq
5030 root 39 19 0 0 0 S 4.3 0.0 0:18.38 dp_sync_taskq
4927 root 0 -20 0 0 0 S 4.0 0.0 0:07.35 z_wr_int
5031 root 39 19 0 0 0 S 4.0 0.0 0:08.88 dp_sync_taskq
5035 root 39 19 0 0 0 S 4.0 0.0 0:07.60 dp_sync_taskq
4928 root 0 -20 0 0 0 S 3.6 0.0 0:07.32 z_wr_int
4929 root 0 -20 0 0 0 S 3.6 0.0 0:07.27 z_wr_int
5023 root 39 19 0 0 0 S 3.6 0.0 0:10.99 dp_sync_taskq
5024 root 39 19 0 0 0 S 3.6 0.0 0:10.06 dp_sync_taskq
5037 root 39 19 0 0 0 S 3.6 0.0 0:06.99 dp_sync_taskq
4933 root 0 -20 0 0 0 S 3.3 0.0 0:07.31 z_wr_int
141648 root 20 0 0 0 0 D 3.3 0.0 0:06.86 z_upgrade
5025 root 39 19 0 0 0 S 3.0 0.0 0:08.08 dp_sync_taskq
21498 root 20 0 0 0 0 D 3.0 0.0 1:26.54 txg_sync
141644 root 20 0 0 0 0 D 3.0 0.0 0:20.60 z_upgrade
1697 root 0 -20 0 0 0 S 2.3 0.0 0:04.64 kworker/2:1H
5029 root 39 19 0 0 0 S 2.3 0.0 0:12.93 dp_sync_taskq
5033 root 39 19 0 0 0 S 2.3 0.0 0:12.91 dp_sync_taskq
5034 root 39 19 0 0 0 S 2.3 0.0 0:09.23 dp_sync_taskq
648 root 0 -20 0 0 0 S 2.0 0.0 0:04.43 kworker/0:1H
659 root 0 -20 0 0 0 R 2.0 0.0 0:04.57 kworker/4:1H
1705 root 0 -20 0 0 0 S 2.0 0.0 0:04.44 kworker/5:1H
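As an aside, a quick way to see which ZFS worker pools dominate a listing like the one above is to tally the COMMAND column from a batch-mode `top` snapshot. This is just a sketch; the sample lines below stand in for real `top -b -n1` output, and in practice you would pipe the real snapshot into the same `awk | sort | uniq -c` stage.

```shell
# Create a small stand-in for `top -b -n1` output (illustrative sample data).
cat > /tmp/top.sample <<'EOF'
 4902 root 0 -20 0 0 0 R 15.5 0.0 0:28.28 z_rd_int
 4903 root 0 -20 0 0 0 R 15.2 0.0 0:28.22 z_rd_int
 4912 root 1 -19 0 0 0 S 12.9 0.0 0:21.92 z_wr_iss
141645 root 20 0 0 0 0 D 14.5 0.0 0:24.31 z_upgrade
EOF

# Tally threads by name: last field is COMMAND, count duplicates,
# and list the busiest pools first.
awk '{print $NF}' /tmp/top.sample | sort | uniq -c | sort -rn
```

On the sample data this prints `z_rd_int` at the top with a count of 2, which mirrors what the full listing shows: the `z_rd_int`/`z_wr_iss`/`z_upgrade` pools dwarf everything else.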
I've disabled syncoid for a while to let the system stabilise (I hope).
(writing this before your response above, will look at the reference now)
Hi,
Update on this. With the cron'd syncoids disabled, the system was stable. Yesterday afternoon I ran a few syncs interactively and they completed. This morning I started an interactive recursive sync of a filesystem tree; all was going well, but a few minutes ago I started seeing:
kernel:NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
every few seconds, with the following in /var/log/messages:
[root@fs3 ~]# grep kernel /var/log/messages | grep 'Jan 18'
Jan 18 10:50:31 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:50:31 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:50:31 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:50:31 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OE ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:50:31 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:50:31 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:50:31 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:50:31 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:50:31 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:50:31 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:50:31 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:50:31 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:50:31 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:50:31 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:50:31 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:50:31 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:50:31 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:50:31 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:50:31 fs3 kernel: PKRU: 55555554
Jan 18 10:50:31 fs3 kernel: Call Trace:
Jan 18 10:50:31 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc1334ba8>] ? dmu_objset_pool+0x18/0x40 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:50:31 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:50:31 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:50:31 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:50:31 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:50:59 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:50:59 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:50:59 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:50:59 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:50:59 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:50:59 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:50:59 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:50:59 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:50:59 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:50:59 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:50:59 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:50:59 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:50:59 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:50:59 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:50:59 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:50:59 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:50:59 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:50:59 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:50:59 fs3 kernel: PKRU: 55555554
Jan 18 10:50:59 fs3 kernel: Call Trace:
Jan 18 10:50:59 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:50:59 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:50:59 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:50:59 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:50:59 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:51:04 fs3 kernel: INFO: rcu_sched self-detected stall on CPU { 4} (t=60000 jiffies g=7774743 c=7774742 q=278193)
Jan 18 10:51:04 fs3 kernel: Task dump for CPU 4:
Jan 18 10:51:04 fs3 kernel: zfs R running task 0 169169 169152 0x00000088
Jan 18 10:51:04 fs3 kernel: Call Trace:
Jan 18 10:51:04 fs3 kernel: <IRQ> [<ffffffff98ada3c8>] sched_show_task+0xa8/0x110
Jan 18 10:51:04 fs3 kernel: [<ffffffff98ade039>] dump_cpu_task+0x39/0x70
Jan 18 10:51:04 fs3 kernel: [<ffffffff98b586c0>] rcu_dump_cpu_stacks+0x90/0xd0
Jan 18 10:51:04 fs3 kernel: [<ffffffff98b5bd82>] rcu_check_callbacks+0x442/0x730
Jan 18 10:51:04 fs3 kernel: [<ffffffff98b10700>] ? tick_sched_do_timer+0x50/0x50
Jan 18 10:51:04 fs3 kernel: [<ffffffff98aaf176>] update_process_times+0x46/0x80
Jan 18 10:51:04 fs3 kernel: [<ffffffff98b10470>] tick_sched_handle+0x30/0x70
Jan 18 10:51:04 fs3 kernel: [<ffffffff98b10739>] tick_sched_timer+0x39/0x80
Jan 18 10:51:04 fs3 kernel: [<ffffffff98aca25e>] __hrtimer_run_queues+0x10e/0x270
Jan 18 10:51:04 fs3 kernel: [<ffffffff98aca7bf>] hrtimer_interrupt+0xaf/0x1d0
Jan 18 10:51:04 fs3 kernel: [<ffffffff98a5cdfb>] local_apic_timer_interrupt+0x3b/0x60
Jan 18 10:51:04 fs3 kernel: [<ffffffff9919aa23>] smp_apic_timer_interrupt+0x43/0x60
Jan 18 10:51:04 fs3 kernel: [<ffffffff99196fba>] apic_timer_interrupt+0x16a/0x170
Jan 18 10:51:04 fs3 kernel: <EOI> [<ffffffffc05e7bf0>] ? taskq_wait+0xf0/0xf0 [spl]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc1432c09>] ? zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:51:04 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:51:04 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:51:04 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:51:31 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [zfs:169169]
Jan 18 10:51:31 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:51:31 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:51:31 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:51:31 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:51:31 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:51:31 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:51:31 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:51:31 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:51:31 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:51:31 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:51:31 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:51:31 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:51:31 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:51:31 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:51:31 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:51:31 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:51:31 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:51:31 fs3 kernel: PKRU: 55555554
Jan 18 10:51:31 fs3 kernel: Call Trace:
Jan 18 10:51:31 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc1334bb4>] ? dmu_objset_pool+0x24/0x40 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:51:31 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:51:31 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:51:31 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:51:31 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:51:59 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:51:59 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:51:59 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:51:59 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:51:59 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:51:59 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:51:59 fs3 kernel: RIP: 0010:[<ffffffffc05e7bf9>] [<ffffffffc05e7bf9>] taskq_wait_outstanding+0x9/0xf0 [spl]
Jan 18 10:51:59 fs3 kernel: RSP: 0018:ffff88a50ff57938 EFLAGS: 00000246
Jan 18 10:51:59 fs3 kernel: RAX: ffff88b0aa427800 RBX: 0000000000000010 RCX: 0000000000000001
Jan 18 10:51:59 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88b0aa427800
Jan 18 10:51:59 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:51:59 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffffffffff10
Jan 18 10:51:59 fs3 kernel: R13: 0000000000000246 R14: 0000000000000246 R15: 0000000000000001
Jan 18 10:51:59 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:51:59 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:51:59 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:51:59 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:51:59 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:51:59 fs3 kernel: PKRU: 55555554
Jan 18 10:51:59 fs3 kernel: Call Trace:
Jan 18 10:51:59 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:51:59 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:51:59 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:51:59 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:51:59 fs3 kernel: Code: 00 75 0d 48 83 c4 30 5b 41 5c 41 5d 41 5e 5d c3 e8 ed 33 4b d8 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 <41> 56 41 55 49 89 f5 41 54 53 48 89 fb 48 83 ec 30 65 48 8b 04
Jan 18 10:52:27 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:52:27 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:52:27 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:52:27 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:52:27 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:52:27 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:52:27 fs3 kernel: RIP: 0010:[<ffffffffc05e7cbe>] [<ffffffffc05e7cbe>] taskq_wait_outstanding+0xce/0xf0 [spl]
Jan 18 10:52:27 fs3 kernel: RSP: 0018:ffff88a50ff578e8 EFLAGS: 00000297
Jan 18 10:52:27 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:52:27 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:52:27 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:52:27 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: 0000000000000001
Jan 18 10:52:27 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98ac6f7b R15: ffff88a50ff578d8
Jan 18 10:52:27 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:52:27 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:52:27 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:52:27 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:52:27 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:52:27 fs3 kernel: PKRU: 55555554
Jan 18 10:52:27 fs3 kernel: Call Trace:
Jan 18 10:52:27 fs3 kernel: [<ffffffffc1334bad>] ? dmu_objset_pool+0x1d/0x40 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:52:27 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:52:27 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:52:27 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:52:27 fs3 kernel: Code: ed 4d d8 48 89 df e8 52 3e ba d8 4c 8b 63 68 48 89 c6 48 89 df e8 d3 3a ba d8 4d 39 e5 73 ce 48 8d 75 b0 4c 89 f7 e8 92 ee 4d d8 <48> 8b 45 d8 65 48 33 04 25 28 00 00 00 75 0d 48 83 c4 30 5b 41
Jan 18 10:52:55 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:52:55 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:52:55 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:52:55 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:52:55 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:52:55 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:52:55 fs3 kernel: RIP: 0010:[<ffffffffc1334b90>] [<ffffffffc1334b90>] dmu_objset_pool+0x0/0x40 [zfs]
Jan 18 10:52:55 fs3 kernel: RSP: 0018:ffff88a50ff57940 EFLAGS: 00000202
Jan 18 10:52:55 fs3 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:52:55 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: ffff88ae9bd7e000
Jan 18 10:52:55 fs3 kernel: RBP: ffff88a50ff57970 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:52:55 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffffffffff10
Jan 18 10:52:55 fs3 kernel: R13: ffff88a50ff57970 R14: 0000000000000246 R15: 0000000000000001
Jan 18 10:52:55 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:52:55 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:52:55 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:52:55 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:52:55 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:52:55 fs3 kernel: PKRU: 55555554
Jan 18 10:52:55 fs3 kernel: Call Trace:
Jan 18 10:52:55 fs3 kernel: [<ffffffffc1432bf7>] ? zfsvfs_teardown+0x47/0x2e0 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:52:55 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:52:55 fs3 kernel: Code: e8 46 e0 07 00 89 d8 48 8b 4d d8 65 48 33 0c 25 28 00 00 00 75 0d 48 83 c4 10 5b 41 5c 41 5d 41 5e 5d c3 e8 43 64 76 d7 0f 1f 00 <0f> 1f 44 00 00 55 48 8b 07 48 89 e5 48 85 c0 74 1f 48 8b 80 60
Jan 18 10:52:55 fs3 kernel: INFO: task md126_raid1:647 blocked for more than 120 seconds.
Jan 18 10:52:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:52:55 fs3 kernel: md126_raid1 D ffff88b0b01e2100 0 647 2 0x00000000
Jan 18 10:52:55 fs3 kernel: Call Trace:
Jan 18 10:52:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:52:55 fs3 kernel: [<ffffffff98d8bd35>] percpu_ref_switch_to_atomic_sync+0x65/0xb0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac6f50>] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:52:55 fs3 kernel: [<ffffffff98fa69f7>] set_in_sync+0x67/0xe0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98fae8ef>] md_check_recovery+0x27f/0x500
Jan 18 10:52:55 fs3 kernel: [<ffffffffc05dc661>] raid1d+0x51/0x900 [raid1]
Jan 18 10:52:55 fs3 kernel: [<ffffffff98aae162>] ? del_timer_sync+0x52/0x60
Jan 18 10:52:55 fs3 kernel: [<ffffffff99186d90>] ? schedule_timeout+0x170/0x2d0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98aad8b0>] ? requeue_timers+0x170/0x170
Jan 18 10:52:55 fs3 kernel: [<ffffffff98fa5f3d>] md_thread+0x16d/0x1e0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac6f50>] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:52:55 fs3 kernel: [<ffffffff98fa5dd0>] ? find_pers+0x80/0x80
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac5e61>] kthread+0xd1/0xe0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:52:55 fs3 kernel: [<ffffffff99195ddd>] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:52:55 fs3 kernel: INFO: task xfsaild/md126:1366 blocked for more than 120 seconds.
Jan 18 10:52:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:52:55 fs3 kernel: xfsaild/md126 D ffff88b0ae9c26e0 0 1366 2 0x00000000
Jan 18 10:52:55 fs3 kernel: Call Trace:
Jan 18 10:52:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:52:55 fs3 kernel: [<ffffffffc09cd727>] xfs_log_force+0x157/0x2e0 [xfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffff98adadf0>] ? wake_up_state+0x20/0x20
Jan 18 10:52:55 fs3 kernel: [<ffffffffc09d9a70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc09d9c00>] xfsaild+0x190/0x780 [xfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffffc09d9a70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac5e61>] kthread+0xd1/0xe0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:52:55 fs3 kernel: [<ffffffff99195ddd>] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:52:55 fs3 kernel: INFO: task auditd:103142 blocked for more than 120 seconds.
Jan 18 10:52:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:52:55 fs3 kernel: auditd D ffff88b064280000 0 103142 1 0x00000000
Jan 18 10:52:55 fs3 kernel: Call Trace:
Jan 18 10:52:55 fs3 kernel: [<ffffffff99187480>] ? bit_wait+0x50/0x50
Jan 18 10:52:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:52:55 fs3 kernel: [<ffffffff99186e41>] schedule_timeout+0x221/0x2d0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98fa7a70>] ? md_handle_request+0xd0/0x150
Jan 18 10:52:55 fs3 kernel: [<ffffffff98b06992>] ? ktime_get_ts64+0x52/0xf0
Jan 18 10:52:55 fs3 kernel: [<ffffffff99187480>] ? bit_wait+0x50/0x50
Jan 18 10:52:55 fs3 kernel: [<ffffffff99188a2d>] io_schedule_timeout+0xad/0x130
Jan 18 10:52:55 fs3 kernel: [<ffffffff99188ac8>] io_schedule+0x18/0x20
Jan 18 10:52:55 fs3 kernel: [<ffffffff99187491>] bit_wait_io+0x11/0x50
Jan 18 10:52:55 fs3 kernel: [<ffffffff99186fb7>] __wait_on_bit+0x67/0x90
Jan 18 10:52:55 fs3 kernel: [<ffffffff98bbd3c1>] wait_on_page_bit+0x81/0xa0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98ac7010>] ? wake_bit_function+0x40/0x40
Jan 18 10:52:55 fs3 kernel: [<ffffffff98bbd4f1>] __filemap_fdatawait_range+0x111/0x190
Jan 18 10:52:55 fs3 kernel: [<ffffffff98bcb9b1>] ? do_writepages+0x21/0x50
Jan 18 10:52:55 fs3 kernel: [<ffffffff98bbd584>] filemap_fdatawait_range+0x14/0x30
Jan 18 10:52:55 fs3 kernel: [<ffffffff98bbff76>] filemap_write_and_wait_range+0x56/0x90
Jan 18 10:52:55 fs3 kernel: [<ffffffffc09add66>] xfs_file_fsync+0x66/0x1c0 [xfs]
Jan 18 10:52:55 fs3 kernel: [<ffffffff98c84207>] do_fsync+0x67/0xb0
Jan 18 10:52:55 fs3 kernel: [<ffffffff98c844f0>] SyS_fsync+0x10/0x20
Jan 18 10:52:55 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:53:23 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:53:23 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:53:23 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:53:23 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:53:23 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:53:23 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:53:23 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:53:23 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:53:23 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:53:23 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:53:23 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:53:23 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:53:23 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:53:23 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:53:23 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:53:23 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:53:23 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:53:23 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:53:23 fs3 kernel: PKRU: 55555554
Jan 18 10:53:23 fs3 kernel: Call Trace:
Jan 18 10:53:23 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc1334bad>] ? dmu_objset_pool+0x1d/0x40 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:53:23 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:53:23 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:53:23 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:53:23 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:53:51 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:53:51 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:53:51 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:53:51 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:53:51 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:53:51 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:53:51 fs3 kernel: RIP: 0010:[<ffffffffc05e7ccd>] [<ffffffffc05e7ccd>] taskq_wait_outstanding+0xdd/0xf0 [spl]
Jan 18 10:53:51 fs3 kernel: RSP: 0018:ffff88a50ff578e8 EFLAGS: 00000246
Jan 18 10:53:51 fs3 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:53:51 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:53:51 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:53:51 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: 0000000000000001
Jan 18 10:53:51 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98ac6f7b R15: ffff88a50ff578d8
Jan 18 10:53:51 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:53:51 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:53:51 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:53:51 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:53:51 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:53:51 fs3 kernel: PKRU: 55555554
Jan 18 10:53:51 fs3 kernel: Call Trace:
Jan 18 10:53:51 fs3 kernel: [<ffffffffc1372f4d>] ? dsl_pool_zrele_taskq+0xd/0x10 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:53:51 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:53:51 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:53:51 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:53:51 fs3 kernel: Code: 48 89 c6 48 89 df e8 d3 3a ba d8 4d 39 e5 73 ce 48 8d 75 b0 4c 89 f7 e8 92 ee 4d d8 48 8b 45 d8 65 48 33 04 25 28 00 00 00 75 0d <48> 83 c4 30 5b 41 5c 41 5d 41 5e 5d c3 e8 f1 32 4b d8 90 0f 1f
Jan 18 10:54:04 fs3 kernel: INFO: rcu_sched self-detected stall on CPU { 4} (t=240004 jiffies g=7774743 c=7774742 q=840723)
Jan 18 10:54:04 fs3 kernel: Task dump for CPU 4:
Jan 18 10:54:04 fs3 kernel: zfs R running task 0 169169 169152 0x00000088
Jan 18 10:54:04 fs3 kernel: Call Trace:
Jan 18 10:54:04 fs3 kernel: <IRQ> [<ffffffff98ada3c8>] sched_show_task+0xa8/0x110
Jan 18 10:54:04 fs3 kernel: [<ffffffff98ade039>] dump_cpu_task+0x39/0x70
Jan 18 10:54:04 fs3 kernel: [<ffffffff98b586c0>] rcu_dump_cpu_stacks+0x90/0xd0
Jan 18 10:54:04 fs3 kernel: [<ffffffff98b5bd82>] rcu_check_callbacks+0x442/0x730
Jan 18 10:54:04 fs3 kernel: [<ffffffff98b10700>] ? tick_sched_do_timer+0x50/0x50
Jan 18 10:54:04 fs3 kernel: [<ffffffff98aaf176>] update_process_times+0x46/0x80
Jan 18 10:54:04 fs3 kernel: [<ffffffff98b10470>] tick_sched_handle+0x30/0x70
Jan 18 10:54:04 fs3 kernel: [<ffffffff98b10739>] tick_sched_timer+0x39/0x80
Jan 18 10:54:04 fs3 kernel: [<ffffffff98aca25e>] __hrtimer_run_queues+0x10e/0x270
Jan 18 10:54:04 fs3 kernel: [<ffffffff98aca7bf>] hrtimer_interrupt+0xaf/0x1d0
Jan 18 10:54:04 fs3 kernel: [<ffffffff98a5cdfb>] local_apic_timer_interrupt+0x3b/0x60
Jan 18 10:54:04 fs3 kernel: [<ffffffff9919aa23>] smp_apic_timer_interrupt+0x43/0x60
Jan 18 10:54:04 fs3 kernel: [<ffffffff99196fba>] apic_timer_interrupt+0x16a/0x170
Jan 18 10:54:04 fs3 kernel: <EOI> [<ffffffff9918b795>] ? _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:54:04 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc1334bad>] ? dmu_objset_pool+0x1d/0x40 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:54:04 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:54:04 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:54:04 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:54:31 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:54:31 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:54:31 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:54:31 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:54:31 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:54:31 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:54:31 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:54:31 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:54:31 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:54:31 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:54:31 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:54:31 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:54:31 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:54:31 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:54:31 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:54:31 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:54:31 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:54:31 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:54:31 fs3 kernel: PKRU: 55555554
Jan 18 10:54:31 fs3 kernel: Call Trace:
Jan 18 10:54:31 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc1372f4a>] ? dsl_pool_zrele_taskq+0xa/0x10 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:54:31 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:54:31 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:54:31 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:54:31 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:54:55 fs3 kernel: INFO: task md126_raid1:647 blocked for more than 120 seconds.
Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:54:55 fs3 kernel: md126_raid1 D ffff88b0b01e2100 0 647 2 0x00000000
Jan 18 10:54:55 fs3 kernel: Call Trace:
Jan 18 10:54:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:54:55 fs3 kernel: [<ffffffff98d8bd35>] percpu_ref_switch_to_atomic_sync+0x65/0xb0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac6f50>] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa69f7>] set_in_sync+0x67/0xe0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fae8ef>] md_check_recovery+0x27f/0x500
Jan 18 10:54:55 fs3 kernel: [<ffffffffc05dc661>] raid1d+0x51/0x900 [raid1]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98aae162>] ? del_timer_sync+0x52/0x60
Jan 18 10:54:55 fs3 kernel: [<ffffffff99186d90>] ? schedule_timeout+0x170/0x2d0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98aad8b0>] ? requeue_timers+0x170/0x170
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa5f3d>] md_thread+0x16d/0x1e0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac6f50>] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa5dd0>] ? find_pers+0x80/0x80
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5e61>] kthread+0xd1/0xe0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: [<ffffffff99195ddd>] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: INFO: task xfsaild/md126:1366 blocked for more than 120 seconds.
Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:54:55 fs3 kernel: xfsaild/md126 D ffff88b0ae9c26e0 0 1366 2 0x00000000
Jan 18 10:54:55 fs3 kernel: Call Trace:
Jan 18 10:54:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09cd727>] xfs_log_force+0x157/0x2e0 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98adadf0>] ? wake_up_state+0x20/0x20
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09d9a70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09d9c00>] xfsaild+0x190/0x780 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09d9a70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5e61>] kthread+0xd1/0xe0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: [<ffffffff99195ddd>] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: INFO: task auditd:103142 blocked for more than 120 seconds.
Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:54:55 fs3 kernel: auditd D ffff88b064280000 0 103142 1 0x00000000
Jan 18 10:54:55 fs3 kernel: Call Trace:
Jan 18 10:54:55 fs3 kernel: [<ffffffff99187480>] ? bit_wait+0x50/0x50
Jan 18 10:54:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:54:55 fs3 kernel: [<ffffffff99186e41>] schedule_timeout+0x221/0x2d0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa7a70>] ? md_handle_request+0xd0/0x150
Jan 18 10:54:55 fs3 kernel: [<ffffffff98b06992>] ? ktime_get_ts64+0x52/0xf0
Jan 18 10:54:55 fs3 kernel: [<ffffffff99187480>] ? bit_wait+0x50/0x50
Jan 18 10:54:55 fs3 kernel: [<ffffffff99188a2d>] io_schedule_timeout+0xad/0x130
Jan 18 10:54:55 fs3 kernel: [<ffffffff99188ac8>] io_schedule+0x18/0x20
Jan 18 10:54:55 fs3 kernel: [<ffffffff99187491>] bit_wait_io+0x11/0x50
Jan 18 10:54:55 fs3 kernel: [<ffffffff99186fb7>] __wait_on_bit+0x67/0x90
Jan 18 10:54:55 fs3 kernel: [<ffffffff98bbd3c1>] wait_on_page_bit+0x81/0xa0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac7010>] ? wake_bit_function+0x40/0x40
Jan 18 10:54:55 fs3 kernel: [<ffffffff98bbd4f1>] __filemap_fdatawait_range+0x111/0x190
Jan 18 10:54:55 fs3 kernel: [<ffffffff98bcb9b1>] ? do_writepages+0x21/0x50
Jan 18 10:54:55 fs3 kernel: [<ffffffff98bbd584>] filemap_fdatawait_range+0x14/0x30
Jan 18 10:54:55 fs3 kernel: [<ffffffff98bbff76>] filemap_write_and_wait_range+0x56/0x90
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09add66>] xfs_file_fsync+0x66/0x1c0 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98c84207>] do_fsync+0x67/0xb0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98c844f0>] SyS_fsync+0x10/0x20
Jan 18 10:54:55 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:54:55 fs3 kernel: INFO: task kworker/10:1:174416 blocked for more than 120 seconds.
Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:54:55 fs3 kernel: kworker/10:1 D ffff88b0b7af47e0 0 174416 2 0x00000080
Jan 18 10:54:55 fs3 kernel: Workqueue: xfs-sync/md126 xfs_log_worker [xfs]
Jan 18 10:54:55 fs3 kernel: Call Trace:
Jan 18 10:54:55 fs3 kernel: [<ffffffff99189179>] schedule+0x29/0x70
Jan 18 10:54:55 fs3 kernel: [<ffffffff98facb96>] md_flush_request+0x106/0x200
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac6f50>] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:54:55 fs3 kernel: [<ffffffffc05d910b>] raid1_make_request+0x58b/0x5b0 [raid1]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ae6321>] ? put_prev_entity+0x31/0x400
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ae22d9>] ? pick_next_entity+0xa9/0x190
Jan 18 10:54:55 fs3 kernel: [<ffffffff98a2b59e>] ? __switch_to+0xce/0x580
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa7a70>] md_handle_request+0xd0/0x150
Jan 18 10:54:55 fs3 kernel: [<ffffffff98d53dcf>] ? generic_make_request_checks+0x27f/0x390
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa7b69>] md_make_request+0x79/0x190
Jan 18 10:54:55 fs3 kernel: [<ffffffff98d55ba7>] generic_make_request+0x147/0x380
Jan 18 10:54:55 fs3 kernel: [<ffffffff98fa21c6>] ? md_mergeable_bvec+0x46/0x50
Jan 18 10:54:55 fs3 kernel: [<ffffffff98c8bbbb>] ? __bio_add_page+0x20b/0x2b0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98d55e50>] submit_bio+0x70/0x150
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09a8283>] _xfs_buf_ioapply+0x2f3/0x460 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09ca6c7>] ? xlog_bdstrat+0x37/0x70 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09a9c72>] __xfs_buf_submit+0x72/0x250 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09ca6c7>] xlog_bdstrat+0x37/0x70 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09cc526>] xlog_sync+0x2e6/0x3f0 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09cc6ab>] xlog_state_release_iclog+0x7b/0xd0 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09cd80a>] xfs_log_force+0x23a/0x2e0 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98a2b59e>] ? __switch_to+0xce/0x580
Jan 18 10:54:55 fs3 kernel: [<ffffffffc09cd8e6>] xfs_log_worker+0x36/0x100 [xfs]
Jan 18 10:54:55 fs3 kernel: [<ffffffff98abde8f>] process_one_work+0x17f/0x440
Jan 18 10:54:55 fs3 kernel: [<ffffffff98abefa6>] worker_thread+0x126/0x3c0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98abee80>] ? manage_workers.isra.26+0x2a0/0x2a0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5e61>] kthread+0xd1/0xe0
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: [<ffffffff99195ddd>] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:54:55 fs3 kernel: [<ffffffff98ac5d90>] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:59 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [zfs:169169]
Jan 18 10:54:59 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:54:59 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:54:59 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:54:59 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:54:59 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:54:59 fs3 kernel: RIP: 0010:[<ffffffffc1334b9c>] [<ffffffffc1334b9c>] dmu_objset_pool+0xc/0x40 [zfs]
Jan 18 10:54:59 fs3 kernel: RSP: 0018:ffff88a50ff57938 EFLAGS: 00000282
Jan 18 10:54:59 fs3 kernel: RAX: ffff88af63d7c000 RBX: 0000000000000010 RCX: 0000000000000001
Jan 18 10:54:59 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: ffff88ae9bd7e000
Jan 18 10:54:59 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:54:59 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffffffffff10
Jan 18 10:54:59 fs3 kernel: R13: 0000000000000246 R14: 0000000000000246 R15: 0000000000000001
Jan 18 10:54:59 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:54:59 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:54:59 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:54:59 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:54:59 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:54:59 fs3 kernel: PKRU: 55555554
Jan 18 10:54:59 fs3 kernel: Call Trace:
Jan 18 10:54:59 fs3 kernel: [<ffffffffc1432bf7>] zfsvfs_teardown+0x47/0x2e0 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:54:59 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:54:59 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:54:59 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:54:59 fs3 kernel: Code: 48 33 0c 25 28 00 00 00 75 0d 48 83 c4 10 5b 41 5c 41 5d 41 5e 5d c3 e8 43 64 76 d7 0f 1f 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 <48> 85 c0 74 1f 48 8b 80 60 01 00 00 48 85 c0 74 13 48 8b 80 d0
Jan 18 10:55:27 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [zfs:169169]
Jan 18 10:55:27 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:55:27 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:55:27 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:55:27 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:55:27 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:55:27 fs3 kernel: RIP: 0010:[<ffffffffc05e7c02>] [<ffffffffc05e7c02>] taskq_wait_outstanding+0x12/0xf0 [spl]
Jan 18 10:55:27 fs3 kernel: RSP: 0018:ffff88a50ff57920 EFLAGS: 00000246
Jan 18 10:55:27 fs3 kernel: RAX: ffff88b0aa427800 RBX: ffffffffffffff10 RCX: 0000000000000001
Jan 18 10:55:27 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88b0aa427800
Jan 18 10:55:27 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:55:27 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: 0000000000000246
Jan 18 10:55:27 fs3 kernel: R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000246
Jan 18 10:55:27 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:55:27 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:55:27 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:55:27 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:55:27 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:55:27 fs3 kernel: PKRU: 55555554
Jan 18 10:55:27 fs3 kernel: Call Trace:
Jan 18 10:55:27 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:55:27 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:55:27 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:55:27 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:55:27 fs3 kernel: Code: 5c 41 5d 41 5e 5d c3 e8 ed 33 4b d8 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 f5 41 54 <53> 48 89 fb 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 45 d8
Jan 18 10:55:55 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [zfs:169169]
Jan 18 10:55:55 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:55:55 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:55:55 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:55:55 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:55:55 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:55:55 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:55:55 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:55:55 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:55:55 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:55:55 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:55:55 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:55:55 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:55:55 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:55:55 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:55:55 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:55:55 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:55:55 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:55:55 fs3 kernel: PKRU: 55555554
Jan 18 10:55:55 fs3 kernel: Call Trace:
Jan 18 10:55:55 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc1372f4a>] ? dsl_pool_zrele_taskq+0xa/0x10 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:55:55 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:55:55 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:55:55 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:55:55 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:56:23 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [zfs:169169]
Jan 18 10:56:23 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:56:23 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:56:23 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:56:23 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:56:23 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:56:23 fs3 kernel: RIP: 0010:[<ffffffff9918b795>] [<ffffffff9918b795>] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:56:23 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:56:23 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:56:23 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:56:23 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:56:23 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:56:23 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:56:23 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:56:23 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:56:23 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:56:23 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:56:23 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:56:23 fs3 kernel: PKRU: 55555554
Jan 18 10:56:23 fs3 kernel: Call Trace:
Jan 18 10:56:23 fs3 kernel: [<ffffffffc05e7c3d>] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc1334ba8>] ? dmu_objset_pool+0x18/0x40 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc1432c09>] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc1433529>] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc14360e6>] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc14033f7>] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc13256f6>] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc1403ee4>] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc05e9b59>] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc060033f>] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc05fa760>] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc05faa17>] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc140553b>] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffffc1431106>] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:56:23 fs3 kernel: [<ffffffff98c63ab0>] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:56:23 fs3 kernel: [<ffffffff98c63d61>] SyS_ioctl+0xa1/0xc0
Jan 18 10:56:23 fs3 kernel: [<ffffffff99195f92>] system_call_fastpath+0x25/0x2a
Jan 18 10:56:23 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Of note, the system was receiving a file system at the same time that the remote backup target was trying to pull it. The pull seems to have completed now, and I lost the terminal history while capturing the messages above, so I can't provide confirmatory ps output. This could of course be a coincidence. The receive is still running and apparently progressing.
The 'stuck' zfs receive is for a different file system, though.
Here's the first 100 lines of the debug messages:
[root@fs3 ~]# head -n 100 /proc/spl/kstat/zfs/dbgmsg
timestamp message
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 arc.c:6045:arc_read(): error 2
1642504001 dsl_dir.c:1331:dsl_dir_tempreserve_impl(): error 28
I haven't included the zpool history output as writes are still ongoing, so until those finish I suspect it isn't going to be much use.
Update... The receive into a file system (let's call it fs1) is still progressing, and I can see a send of fs1 to the remote backup target. Running strace on the PID of the bash shell pipeline sending this incremental to the backup target hangs (permanently) all terminal sessions and the local console. I'm now rebooting.
I'm reducing the frequency of the remote pull, trying to avoid overlapping runs, to see if that improves reliability.
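One way to guarantee the pulls never overlap, rather than just spacing them out, would be to wrap the cron'd syncoid in flock; a sketch (the lock path, schedule, and dataset names here are illustrative, not from this system):

```shell
# Illustrative crontab entry: flock -n exits immediately if the previous
# pull still holds the lock, so runs can never stack up on top of each other
0 */4 * * * flock -n /var/lock/syncoid-pull.lock syncoid --recursive --no-sync-snap root@fs3:tank/fs1 backup/fs1
```

With -n the skipped run simply exits non-zero instead of queueing behind the lock, so a slow pull delays nothing, it just suppresses the next attempt.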
The suggestion was "please capture all of dbgmsg, and possibly the last few hundred lines of zpool history -i".
In the future, please capture all the lines of dbgmsg, possibly filtered through uniq - SET_ERROR makes it even noisier than usual, and it was already pretty fast scrolling.
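For reference, one way to do that filtering: strip the per-line timestamp first so identical messages collapse, then count the repeats (a sketch; the dbgmsg path is as shown in the output above):

```shell
# Skip the "timestamp message" header, drop the timestamp field so repeated
# messages become identical, then count consecutive duplicates and surface
# the noisiest ones first
awk 'NR > 1 { $1 = ""; sub(/^ /, ""); print }' /proc/spl/kstat/zfs/dbgmsg \
  | uniq -c | sort -rn | head -n 20
```

Note uniq -c only collapses adjacent duplicates, which suits this output since the repeats come in runs; pipe through sort before uniq -c if global counts are wanted instead.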
I don't know whether the fact that it soft-locks like this is relevant to the more extreme problem of it crashing and burning, though it's obviously not ideal. I'm not surprised that send/recv is a required ingredient, though; that seems to be a common theme.
On Tue, Jan 18, 2022 at 6:10 AM scratchings @.***> wrote:
Hi,
Update on this. With the cron'd syncoids disabled the system was stable. Yesterday afternoon I ran a few syncs interactively and these did complete. This morning I started an interactive recursive sync of a filesystem tree and all was going well but a few minutes ago I started seeing:
kernel:NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
every few seconds, with the same repeating soft lockup trace in /var/log/messages as shown above (taskq_wait_outstanding -> zfsvfs_teardown -> zfs_umount -> zfs_resume_fs -> zfs_ioc_recv_impl).
RDI: 0000000000000246 Jan 18 10:51:31 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f Jan 18 10:51:31 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b Jan 18 10:51:31 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838 Jan 18 10:51:31 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000 Jan 18 10:51:31 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 18 10:51:31 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0 Jan 18 10:51:31 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 18 10:51:31 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Jan 18 10:51:31 fs3 kernel: PKRU: 55555554 Jan 18 10:51:31 fs3 kernel: Call Trace: Jan 18 10:51:31 fs3 kernel: [ ] taskq_wait_outstanding+0x4d/0xf0 [spl] Jan 18 10:51:31 fs3 kernel: [ ] ? dmu_objset_pool+0x24/0x40 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl] Jan 18 10:51:31 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair] Jan 18 10:51:31 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair] Jan 18 10:51:31 fs3 kernel: [ ] ? 
nvlist_xalloc.part.14+0x97/0x190 [znvpair] Jan 18 10:51:31 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs] Jan 18 10:51:31 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0 Jan 18 10:51:31 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0 Jan 18 10:51:31 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a Jan 18 10:51:31 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48 Jan 18 10:51:59 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169] Jan 18 10:51:59 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf Jan 18 10:51:59 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit 
libnvdimm Jan 18 10:51:59 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1 Jan 18 10:51:59 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018 Jan 18 10:51:59 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000 Jan 18 10:51:59 fs3 kernel: RIP: 0010:[ ] [ ] taskq_wait_outstanding+0x9/0xf0 [spl] Jan 18 10:51:59 fs3 kernel: RSP: 0018:ffff88a50ff57938 EFLAGS: 00000246 Jan 18 10:51:59 fs3 kernel: RAX: ffff88b0aa427800 RBX: 0000000000000010 RCX: 0000000000000001 Jan 18 10:51:59 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88b0aa427800 Jan 18 10:51:59 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f Jan 18 10:51:59 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffffffffff10 Jan 18 10:51:59 fs3 kernel: R13: 0000000000000246 R14: 0000000000000246 R15: 0000000000000001 Jan 18 10:51:59 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000 Jan 18 10:51:59 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 18 10:51:59 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0 Jan 18 10:51:59 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 18 10:51:59 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Jan 18 10:51:59 fs3 kernel: PKRU: 55555554 Jan 18 10:51:59 fs3 kernel: Call Trace: Jan 18 10:51:59 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl] Jan 18 10:51:59 fs3 kernel: [ ] ? 
nv_alloc_sleep_spl+0x1f/0x30 [znvpair] Jan 18 10:51:59 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair] Jan 18 10:51:59 fs3 kernel: [ ] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair] Jan 18 10:51:59 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs] Jan 18 10:51:59 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0 Jan 18 10:51:59 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0 Jan 18 10:51:59 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a Jan 18 10:51:59 fs3 kernel: Code: 00 75 0d 48 83 c4 30 5b 41 5c 41 5d 41 5e 5d c3 e8 ed 33 4b d8 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 <41> 56 41 55 49 89 f5 41 54 53 48 89 fb 48 83 ec 30 65 48 8b 04 Jan 18 10:52:27 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169] Jan 18 10:52:27 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf Jan 18 10:52:27 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas 
drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm Jan 18 10:52:27 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1 Jan 18 10:52:27 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018 Jan 18 10:52:27 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000 Jan 18 10:52:27 fs3 kernel: RIP: 0010:[ ] [ ] taskq_wait_outstanding+0xce/0xf0 [spl] Jan 18 10:52:27 fs3 kernel: RSP: 0018:ffff88a50ff578e8 EFLAGS: 00000297 Jan 18 10:52:27 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001 Jan 18 10:52:27 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246 Jan 18 10:52:27 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f Jan 18 10:52:27 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: 0000000000000001 Jan 18 10:52:27 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98ac6f7b R15: ffff88a50ff578d8 Jan 18 10:52:27 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000 Jan 18 10:52:27 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 18 10:52:27 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0 Jan 18 10:52:27 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 18 10:52:27 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Jan 18 10:52:27 fs3 kernel: PKRU: 55555554 Jan 18 10:52:27 fs3 kernel: Call Trace: Jan 18 10:52:27 fs3 kernel: [ ] ? 
dmu_objset_pool+0x1d/0x40 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl] Jan 18 10:52:27 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair] Jan 18 10:52:27 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair] Jan 18 10:52:27 fs3 kernel: [ ] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair] Jan 18 10:52:27 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs] Jan 18 10:52:27 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0 Jan 18 10:52:27 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0 Jan 18 10:52:27 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a Jan 18 10:52:27 fs3 kernel: Code: ed 4d d8 48 89 df e8 52 3e ba d8 4c 8b 63 68 48 89 c6 48 89 df e8 d3 3a ba d8 4d 39 e5 73 ce 48 8d 75 b0 4c 89 f7 e8 92 ee 4d d8 <48> 8b 45 d8 65 48 33 04 25 28 00 00 00 75 0d 48 83 c4 30 5b 41 Jan 18 10:52:55 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! 
[zfs:169169] Jan 18 10:52:55 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf Jan 18 10:52:55 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm Jan 18 10:52:55 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1 Jan 18 10:52:55 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018 Jan 18 10:52:55 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000 Jan 18 10:52:55 fs3 kernel: RIP: 0010:[ ] [ ] dmu_objset_pool+0x0/0x40 [zfs] Jan 18 10:52:55 fs3 kernel: RSP: 0018:ffff88a50ff57940 EFLAGS: 00000202 Jan 18 10:52:55 fs3 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001 Jan 18 10:52:55 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 
ffff88ae9bd7e000 Jan 18 10:52:55 fs3 kernel: RBP: ffff88a50ff57970 R08: 0000000000000101 R09: 000000018040003f Jan 18 10:52:55 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffffffffff10 Jan 18 10:52:55 fs3 kernel: R13: ffff88a50ff57970 R14: 0000000000000246 R15: 0000000000000001 Jan 18 10:52:55 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000 Jan 18 10:52:55 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 18 10:52:55 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0 Jan 18 10:52:55 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 18 10:52:55 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Jan 18 10:52:55 fs3 kernel: PKRU: 55555554 Jan 18 10:52:55 fs3 kernel: Call Trace: Jan 18 10:52:55 fs3 kernel: [ ] ? zfsvfs_teardown+0x47/0x2e0 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl] Jan 18 10:52:55 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair] Jan 18 10:52:55 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair] Jan 18 10:52:55 fs3 kernel: [ ] ? 
nvlist_xalloc.part.14+0x97/0x190 [znvpair] Jan 18 10:52:55 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs] Jan 18 10:52:55 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0 Jan 18 10:52:55 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0 Jan 18 10:52:55 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a Jan 18 10:52:55 fs3 kernel: Code: e8 46 e0 07 00 89 d8 48 8b 4d d8 65 48 33 0c 25 28 00 00 00 75 0d 48 83 c4 10 5b 41 5c 41 5d 41 5e 5d c3 e8 43 64 76 d7 0f 1f 00 <0f> 1f 44 00 00 55 48 8b 07 48 89 e5 48 85 c0 74 1f 48 8b 80 60 Jan 18 10:52:55 fs3 kernel: INFO: task md126_raid1:647 blocked for more than 120 seconds. Jan 18 10:52:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 18 10:52:55 fs3 kernel: md126_raid1 D ffff88b0b01e2100 0 647 2 0x00000000 Jan 18 10:52:55 fs3 kernel: Call Trace: Jan 18 10:52:55 fs3 kernel: [ ] schedule+0x29/0x70 Jan 18 10:52:55 fs3 kernel: [ ] percpu_ref_switch_to_atomic_sync+0x65/0xb0 Jan 18 10:52:55 fs3 kernel: [ ] ? wake_up_atomic_t+0x30/0x30 Jan 18 10:52:55 fs3 kernel: [ ] set_in_sync+0x67/0xe0 Jan 18 10:52:55 fs3 kernel: [ ] md_check_recovery+0x27f/0x500 Jan 18 10:52:55 fs3 kernel: [ ] raid1d+0x51/0x900 [raid1] Jan 18 10:52:55 fs3 kernel: [ ] ? del_timer_sync+0x52/0x60 Jan 18 10:52:55 fs3 kernel: [ ] ? schedule_timeout+0x170/0x2d0 Jan 18 10:52:55 fs3 kernel: [ ] ? requeue_timers+0x170/0x170 Jan 18 10:52:55 fs3 kernel: [ ] md_thread+0x16d/0x1e0 Jan 18 10:52:55 fs3 kernel: [ ] ? wake_up_atomic_t+0x30/0x30 Jan 18 10:52:55 fs3 kernel: [ ] ? find_pers+0x80/0x80 Jan 18 10:52:55 fs3 kernel: [ ] kthread+0xd1/0xe0 Jan 18 10:52:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40 Jan 18 10:52:55 fs3 kernel: [ ] ret_from_fork_nospec_begin+0x7/0x21 Jan 18 10:52:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40 Jan 18 10:52:55 fs3 kernel: INFO: task xfsaild/md126:1366 blocked for more than 120 seconds. 
Jan 18 10:52:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:52:55 fs3 kernel: xfsaild/md126 D ffff88b0ae9c26e0 0 1366 2 0x00000000
Jan 18 10:52:55 fs3 kernel: Call Trace:
Jan 18 10:52:55 fs3 kernel: [ ] schedule+0x29/0x70
Jan 18 10:52:55 fs3 kernel: [ ] xfs_log_force+0x157/0x2e0 [xfs]
Jan 18 10:52:55 fs3 kernel: [ ] ? wake_up_state+0x20/0x20
Jan 18 10:52:55 fs3 kernel: [ ] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 18 10:52:55 fs3 kernel: [ ] xfsaild+0x190/0x780 [xfs]
Jan 18 10:52:55 fs3 kernel: [ ] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 18 10:52:55 fs3 kernel: [ ] kthread+0xd1/0xe0
Jan 18 10:52:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40
Jan 18 10:52:55 fs3 kernel: [ ] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:52:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40
Jan 18 10:52:55 fs3 kernel: INFO: task auditd:103142 blocked for more than 120 seconds.
Jan 18 10:52:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:52:55 fs3 kernel: auditd D ffff88b064280000 0 103142 1 0x00000000
Jan 18 10:52:55 fs3 kernel: Call Trace:
Jan 18 10:52:55 fs3 kernel: [ ] ? bit_wait+0x50/0x50
Jan 18 10:52:55 fs3 kernel: [ ] schedule+0x29/0x70
Jan 18 10:52:55 fs3 kernel: [ ] schedule_timeout+0x221/0x2d0
Jan 18 10:52:55 fs3 kernel: [ ] ? md_handle_request+0xd0/0x150
Jan 18 10:52:55 fs3 kernel: [ ] ? ktime_get_ts64+0x52/0xf0
Jan 18 10:52:55 fs3 kernel: [ ] ? bit_wait+0x50/0x50
Jan 18 10:52:55 fs3 kernel: [ ] io_schedule_timeout+0xad/0x130
Jan 18 10:52:55 fs3 kernel: [ ] io_schedule+0x18/0x20
Jan 18 10:52:55 fs3 kernel: [ ] bit_wait_io+0x11/0x50
Jan 18 10:52:55 fs3 kernel: [ ] wait_on_bit+0x67/0x90
Jan 18 10:52:55 fs3 kernel: [ ] wait_on_page_bit+0x81/0xa0
Jan 18 10:52:55 fs3 kernel: [ ] ? wake_bit_function+0x40/0x40
Jan 18 10:52:55 fs3 kernel: [ ] __filemap_fdatawait_range+0x111/0x190
Jan 18 10:52:55 fs3 kernel: [ ] ? do_writepages+0x21/0x50
Jan 18 10:52:55 fs3 kernel: [ ] filemap_fdatawait_range+0x14/0x30
Jan 18 10:52:55 fs3 kernel: [ ] filemap_write_and_wait_range+0x56/0x90
Jan 18 10:52:55 fs3 kernel: [ ] xfs_file_fsync+0x66/0x1c0 [xfs]
Jan 18 10:52:55 fs3 kernel: [ ] do_fsync+0x67/0xb0
Jan 18 10:52:55 fs3 kernel: [ ] SyS_fsync+0x10/0x20
Jan 18 10:52:55 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a
Jan 18 10:53:23 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:53:23 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:53:23 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:53:23 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:53:23 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:53:23 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:53:23 fs3 kernel: RIP: 0010:[ ] [ ] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:53:23 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:53:23 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:53:23 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:53:23 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:53:23 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:53:23 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:53:23 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:53:23 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:53:23 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:53:23 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:53:23 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:53:23 fs3 kernel: PKRU: 55555554
Jan 18 10:53:23 fs3 kernel: Call Trace:
Jan 18 10:53:23 fs3 kernel: [ ] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:53:23 fs3 kernel: [ ] ? dmu_objset_pool+0x1d/0x40 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:53:23 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:53:23 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:53:23 fs3 kernel: [ ] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:53:23 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:53:23 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:53:23 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0
Jan 18 10:53:23 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a
Jan 18 10:53:23 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:53:51 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:53:51 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:53:51 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:53:51 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:53:51 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:53:51 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:53:51 fs3 kernel: RIP: 0010:[ ] [ ] taskq_wait_outstanding+0xdd/0xf0 [spl]
Jan 18 10:53:51 fs3 kernel: RSP: 0018:ffff88a50ff578e8 EFLAGS: 00000246
Jan 18 10:53:51 fs3 kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:53:51 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:53:51 fs3 kernel: RBP: ffff88a50ff57938 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:53:51 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: 0000000000000001
Jan 18 10:53:51 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98ac6f7b R15: ffff88a50ff578d8
Jan 18 10:53:51 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:53:51 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:53:51 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:53:51 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:53:51 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:53:51 fs3 kernel: PKRU: 55555554
Jan 18 10:53:51 fs3 kernel: Call Trace:
Jan 18 10:53:51 fs3 kernel: [ ] ? dsl_pool_zrele_taskq+0xd/0x10 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:53:51 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:53:51 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:53:51 fs3 kernel: [ ] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:53:51 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:53:51 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:53:51 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0
Jan 18 10:53:51 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a
Jan 18 10:53:51 fs3 kernel: Code: 48 89 c6 48 89 df e8 d3 3a ba d8 4d 39 e5 73 ce 48 8d 75 b0 4c 89 f7 e8 92 ee 4d d8 48 8b 45 d8 65 48 33 04 25 28 00 00 00 75 0d <48> 83 c4 30 5b 41 5c 41 5d 41 5e 5d c3 e8 f1 32 4b d8 90 0f 1f
Jan 18 10:54:04 fs3 kernel: INFO: rcu_sched self-detected stall on CPU { 4} (t=240004 jiffies g=7774743 c=7774742 q=840723)
Jan 18 10:54:04 fs3 kernel: Task dump for CPU 4:
Jan 18 10:54:04 fs3 kernel: zfs R running task 0 169169 169152 0x00000088
Jan 18 10:54:04 fs3 kernel: Call Trace:
Jan 18 10:54:04 fs3 kernel: [ ] sched_show_task+0xa8/0x110
Jan 18 10:54:04 fs3 kernel: [ ] dump_cpu_task+0x39/0x70
Jan 18 10:54:04 fs3 kernel: [ ] rcu_dump_cpu_stacks+0x90/0xd0
Jan 18 10:54:04 fs3 kernel: [ ] rcu_check_callbacks+0x442/0x730
Jan 18 10:54:04 fs3 kernel: [ ] ? tick_sched_do_timer+0x50/0x50
Jan 18 10:54:04 fs3 kernel: [ ] update_process_times+0x46/0x80
Jan 18 10:54:04 fs3 kernel: [ ] tick_sched_handle+0x30/0x70
Jan 18 10:54:04 fs3 kernel: [ ] tick_sched_timer+0x39/0x80
Jan 18 10:54:04 fs3 kernel: [ ] __hrtimer_run_queues+0x10e/0x270
Jan 18 10:54:04 fs3 kernel: [ ] hrtimer_interrupt+0xaf/0x1d0
Jan 18 10:54:04 fs3 kernel: [ ] local_apic_timer_interrupt+0x3b/0x60
Jan 18 10:54:04 fs3 kernel: [ ] smp_apic_timer_interrupt+0x43/0x60
Jan 18 10:54:04 fs3 kernel: [ ] apic_timer_interrupt+0x16a/0x170
Jan 18 10:54:04 fs3 kernel: [ ] ? _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:54:04 fs3 kernel: [ ] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:54:04 fs3 kernel: [ ] ? dmu_objset_pool+0x1d/0x40 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:54:04 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:54:04 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:54:04 fs3 kernel: [ ] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:54:04 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:54:04 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:54:04 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0
Jan 18 10:54:04 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a
Jan 18 10:54:31 fs3 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [zfs:169169]
Jan 18 10:54:31 fs3 kernel: Modules linked in: 8021q garp mrp bonding ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter drbg ansi_cprng dm_crypt iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ses enclosure dm_service_time ipmi_ssif joydev lpc_ich sg i2c_i801 mei_me mei wmi ipmi_si ipmi_devintf
Jan 18 10:54:31 fs3 kernel: ipmi_msghandler dm_multipath dm_mod acpi_pad acpi_power_meter binfmt_misc ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) raid1 sd_mod crc_t10dif crct10dif_generic ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mpt3sas drm libahci i40e crct10dif_pclmul crct10dif_common crc32c_intel libata raid_class scsi_transport_sas ptp pps_core drm_panel_orientation_quirks nfit libnvdimm
Jan 18 10:54:31 fs3 kernel: CPU: 4 PID: 169169 Comm: zfs Kdump: loaded Tainted: P OEL ------------ 3.10.0-1160.42.2.el7.x86_64 #1
Jan 18 10:54:31 fs3 kernel: Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 2.1 06/14/2018
Jan 18 10:54:31 fs3 kernel: task: ffff88b03ce05280 ti: ffff88a50ff54000 task.ti: ffff88a50ff54000
Jan 18 10:54:31 fs3 kernel: RIP: 0010:[ ] [ ] _raw_spin_unlock_irqrestore+0x15/0x20
Jan 18 10:54:31 fs3 kernel: RSP: 0018:ffff88a50ff578d8 EFLAGS: 00000246
Jan 18 10:54:31 fs3 kernel: RAX: 0000000000000246 RBX: 0000000000000001 RCX: 0000000000000001
Jan 18 10:54:31 fs3 kernel: RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
Jan 18 10:54:31 fs3 kernel: RBP: ffff88a50ff578d8 R08: 0000000000000101 R09: 000000018040003f
Jan 18 10:54:31 fs3 kernel: R10: 0000000000000001 R11: ffff88aef4509e80 R12: ffffffff98ac6f7b
Jan 18 10:54:31 fs3 kernel: R13: ffff88a50ff578d8 R14: ffffffff98adae02 R15: ffff88a50ff57838
Jan 18 10:54:31 fs3 kernel: FS: 00007f5a2a4937c0(0000) GS:ffff88b0bbd00000(0000) knlGS:0000000000000000
Jan 18 10:54:31 fs3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 10:54:31 fs3 kernel: CR2: 00007f1c6209a000 CR3: 0000000e037ec000 CR4: 00000000007607e0
Jan 18 10:54:31 fs3 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 10:54:31 fs3 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 18 10:54:31 fs3 kernel: PKRU: 55555554
Jan 18 10:54:31 fs3 kernel: Call Trace:
Jan 18 10:54:31 fs3 kernel: [ ] taskq_wait_outstanding+0x4d/0xf0 [spl]
Jan 18 10:54:31 fs3 kernel: [ ] ? dsl_pool_zrele_taskq+0xa/0x10 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] zfsvfs_teardown+0x59/0x2e0 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] zfs_umount+0x39/0x120 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] zfs_resume_fs+0x106/0x340 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] zfs_ioc_recv_impl+0xa57/0xfd0 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] ? dbuf_read+0x3d6/0x570 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] zfs_ioc_recv_new+0x2b4/0x330 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] ? spl_vmem_alloc+0x19/0x20 [spl]
Jan 18 10:54:31 fs3 kernel: [ ] ? nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
Jan 18 10:54:31 fs3 kernel: [ ] ? nv_mem_zalloc.isra.13+0x30/0x40 [znvpair]
Jan 18 10:54:31 fs3 kernel: [ ] ? nvlist_xalloc.part.14+0x97/0x190 [znvpair]
Jan 18 10:54:31 fs3 kernel: [ ] zfsdev_ioctl_common+0x51b/0x820 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] zfsdev_ioctl+0x56/0xf0 [zfs]
Jan 18 10:54:31 fs3 kernel: [ ] do_vfs_ioctl+0x3a0/0x5b0
Jan 18 10:54:31 fs3 kernel: [ ] SyS_ioctl+0xa1/0xc0
Jan 18 10:54:31 fs3 kernel: [ ] system_call_fastpath+0x25/0x2a
Jan 18 10:54:31 fs3 kernel: Code: 07 00 0f 1f 40 00 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 48
Jan 18 10:54:55 fs3 kernel: INFO: task md126_raid1:647 blocked for more than 120 seconds.
Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 18 10:54:55 fs3 kernel: md126_raid1 D ffff88b0b01e2100 0 647 2 0x00000000
Jan 18 10:54:55 fs3 kernel: Call Trace:
Jan 18 10:54:55 fs3 kernel: [ ] schedule+0x29/0x70
Jan 18 10:54:55 fs3 kernel: [ ] percpu_ref_switch_to_atomic_sync+0x65/0xb0
Jan 18 10:54:55 fs3 kernel: [ ] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:54:55 fs3 kernel: [ ] set_in_sync+0x67/0xe0
Jan 18 10:54:55 fs3 kernel: [ ] md_check_recovery+0x27f/0x500
Jan 18 10:54:55 fs3 kernel: [ ] raid1d+0x51/0x900 [raid1]
Jan 18 10:54:55 fs3 kernel: [ ] ? del_timer_sync+0x52/0x60
Jan 18 10:54:55 fs3 kernel: [ ] ? schedule_timeout+0x170/0x2d0
Jan 18 10:54:55 fs3 kernel: [ ] ? requeue_timers+0x170/0x170
Jan 18 10:54:55 fs3 kernel: [ ] md_thread+0x16d/0x1e0
Jan 18 10:54:55 fs3 kernel: [ ] ? wake_up_atomic_t+0x30/0x30
Jan 18 10:54:55 fs3 kernel: [ ] ? find_pers+0x80/0x80
Jan 18 10:54:55 fs3 kernel: [ ] kthread+0xd1/0xe0
Jan 18 10:54:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: [ ] ret_from_fork_nospec_begin+0x7/0x21
Jan 18 10:54:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40
Jan 18 10:54:55 fs3 kernel: INFO: task xfsaild/md126:1366 blocked for more than 120 seconds.
Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 18 10:54:55 fs3 kernel: xfsaild/md126 D ffff88b0ae9c26e0 0 1366 2 0x00000000 Jan 18 10:54:55 fs3 kernel: Call Trace: Jan 18 10:54:55 fs3 kernel: [ ] schedule+0x29/0x70 Jan 18 10:54:55 fs3 kernel: [ ] xfs_log_force+0x157/0x2e0 [xfs] Jan 18 10:54:55 fs3 kernel: [ ] ? wake_up_state+0x20/0x20 Jan 18 10:54:55 fs3 kernel: [ ] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs] Jan 18 10:54:55 fs3 kernel: [ ] xfsaild+0x190/0x780 [xfs] Jan 18 10:54:55 fs3 kernel: [ ] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs] Jan 18 10:54:55 fs3 kernel: [ ] kthread+0xd1/0xe0 Jan 18 10:54:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40 Jan 18 10:54:55 fs3 kernel: [ ] ret_from_fork_nospec_begin+0x7/0x21 Jan 18 10:54:55 fs3 kernel: [ ] ? insert_kthread_work+0x40/0x40 Jan 18 10:54:55 fs3 kernel: INFO: task auditd:103142 blocked for more than 120 seconds. Jan 18 10:54:55 fs3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 18 10:54:55 fs3 kernel: auditd D ffff88b064280000 0 103142 1 0x00000000 Jan 18 10:54:55 fs3 kernel: Call Trace: Jan 18 10:54:55 fs3 kernel: [ ] ? bit_wait+0x50/0x50 Jan 18 10:54:55 fs3 kernel: [ ] schedule+0x29/0x70 Jan 18 10:54:55 fs3 kernel: [ ] schedule_timeout+0x221/0x2d0 Jan 18 10:54:55 fs3 kernel: [ ] ? md_handle_request+0xd0/0x150 Jan 18 10:54:55 fs3 kernel: [ ] ? ktime_get_ts64+0x52/0xf0 Jan 18 10:54:55 fs3 kernel: [ ] ? bit_wait+0x50/0x50 Jan 18 10:54:55 fs3 kernel: [ ] io_schedule_timeout+0xad/0x130 Jan 18 10:54:55 fs3 kernel: [ ] io_schedule+0x18/0x20 Jan 18 10:54:55 fs3 kernel: [
System information
Describe the problem you're observing
When I start sending raw ZFS snapshots to a different system, my Linux system (4.19.0-14-amd64) starts to hang completely. I can ping it, and I can run a few commands (such as dmesg), but most commands hang (including zfs, zpool, htop, ps, ...). The entire system hangs completely.
Dmesg shows the following entries at the time of the occurrence:
Interestingly, the transfer itself continues happily; it is just everything else on the system that hangs. The only way to recover is resetting the machine (since not even reboot works).
Describe how to reproduce the problem
It's a tough one. It seems to me that the issue might be load related in some sense, since it only occurs when I have two zfs send commands (via syncoid) running in parallel that involve encrypted datasets.
Transfer 1
The first one sends datasets from an unencrypted dataset into an encrypted one (I am migrating to encryption).
I use syncoid and use the command:
syncoid -r --skip-parent --no-sync-snap zpradix1imain/sys/vz zpradix1imain/sys/vz_enc
This translates into:
zfs send -I 'zpradix1imain/sys/vz/main'@'zfs-auto-snap_hourly-2021-03-02-1917' 'zpradix1imain/sys/vz/main'@'zfs-auto-snap_frequent-2021-03-02-1932' | mbuffer -q -s 128k -m 16M 2>/dev/null | pv -s 16392592 | zfs receive -s -F 'zpradix1imain/sys/vz_enc/main'
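Since the receive side runs with -s, an interrupted transfer should be resumable from a token rather than restarted from scratch. A minimal sketch of how that would look, assuming the dataset name from the pipeline above (commands are only echoed here as a dry run, not executed):

```shell
#!/bin/sh
# Sketch: resuming a 'zfs receive -s' that was cut off by a hang/reset.
# Dataset name taken from the syncoid pipeline above; dry run (echo only).
DST='zpradix1imain/sys/vz_enc/main'

# 1. The aborted receive leaves a resume token on the target dataset:
echo "zfs get -H -o value receive_resume_token $DST"

# 2. Feeding that token to 'zfs send -t' restarts the stream where it stopped:
echo "zfs send -t \$TOKEN | zfs receive -s $DST"
```

The \$TOKEN placeholder stands for the value printed by the first command.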
Transfer 2
I transfer data from an encrypted dataset raw to a secondary server. The syncoid command is:
syncoid -r --skip-parent --no-sync-snap --sendoptions=w --exclude=zfs-auto-snap_hourly --exclude=zfs-auto-snap_frequent zpradix1imain/data root@192.168.200.12:zpzetta/radix/data
This translates into:
zfs send -w 'zpradix1imain/data/home'@'vicari-prev' | pv -s 179222507064 | lzop | mbuffer -q -s 128k -m 16M 2>/dev/null | ssh ...
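Given the suspicion that the hangs are load related, one experiment worth noting is throttling the stream with mbuffer's -r option (maximum read rate). A sketch of the same raw-send pipeline with a rate limit added; the 50M/s value is an arbitrary example, not a tested recommendation (dry run, echoed only):

```shell
#!/bin/sh
# Sketch: rate-limited variant of the raw-send pipeline above.
# RATE is an arbitrary example value; tune it to the disks' comfort zone.
RATE='50M'
echo "zfs send -w 'zpradix1imain/data/home'@'vicari-prev' | mbuffer -q -s 128k -m 16M -r $RATE | lzop | ssh ..."
```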
In summary: