openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

BUG: unable to handle kernel NULL pointer dereference #5226

Closed: tuomari closed this issue 6 years ago

tuomari commented 7 years ago

Running Linux 4.4.19 with the following ZFS versions:

ZFS 0.6.5-415_g2bce804
SPL 0.6.5-68_g576865b

I am currently replacing a faulty drive and my pool is set up as follows:

  pool: tank
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Sep 28 15:40:08 2016
    6.55T scanned out of 30.4T at 11.2M/s, 618h36m to go
    534G resilvered, 21.53% done
config:

    NAME                                                       STATE     READ WRITE CKSUM
    tank                                                       ONLINE       0     0     0
      raidz1-0                                                 ONLINE       0     0     0
        spare-0                                                ONLINE       0     0     0
          ata-ST3000DM001-1CH166_Z1F38DJ3-part4                ONLINE       0     0     0
          ata-ST3000DM001-9YN166_W1F3AEZG                      ONLINE       0     0     0  (resilvering)
        ata-ST3000DM001-1CH166_Z1F38MTB-part4                  ONLINE       0     0     0
        ata-ST3000DM001-1ER166_Z5009QEX                        ONLINE       0     0     0
        ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N0HRLESN-part4         ONLINE       0     0     0
        ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N2SED25L-part4         ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F4CKB1                        ONLINE       0     0     0
      raidz1-1                                                 ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F4REQ1-part4                  ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F4S6CX-part4                  ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F3XC4M-part4                  ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F4S822-part4                  ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F4S6D9-part4                  ONLINE       0     0     0
        ata-ST3000DM001-1CH166_Z1F4S7Z6-part4                  ONLINE       0     0     0
    logs
      mirror-2                                                 ONLINE       0     0     0
        ata-KINGSTON_SH103S3120G_50026B7334000234-part4        ONLINE       0     0     0
        ata-Kingston_SSDNow_V_Series_128GB_06P970009771-part4  ONLINE       0     0     0
    cache
      ata-KINGSTON_SH103S3120G_50026B7334000234-part5          ONLINE       0     0     0
      ata-Kingston_SSDNow_V_Series_128GB_06P970009771-part5    ONLINE       0     0     0
    spares
      ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3ELL01X                 AVAIL   
      ata-ST3000DM001-9YN166_W1F3AEZG                          INUSE     currently in use

errors: No known data errors
[386646.743181] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[386646.743291] IP: [<ffffffffa0b57063>] buf_hash_remove+0x63/0xc0 [zfs]
[386646.743402] PGD 0 
[386646.743451] Oops: 0000 [#1] SMP 
[386646.743515] Modules linked in: nls_utf8 hfsplus hfs vfat msdos fat jfs xfs binfmt_misc 8021q garp stp llc openvswitch nf_defrag_ipv6 nf_conntrack nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc zfs(PO) zunicode(PO) zavl(PO) iTCO_wdt gpio_ich iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper bttv ast tveeprom videobuf_dma_sg tea575x serio_raw ttm videobuf_core pcspkr rc_core drm_kms_helper v4l2_common syscopyarea sysfillrect sysimgblt videodev fb_sys_fops snd_hda_intel media drm snd_hda_codec agpgart snd_hda_core lpc_ich i2c_i801 mfd_core snd_pcm snd_timer snd soundcore ioatdma i7core_edac dca i5500_temp edac_core zcommon(PO) znvpair(PO) spl(O) btrfs shpchp ipmi_si
[386646.744323]  ipmi_msghandler evdev acpi_cpufreq tpm_tis tpm w83795 w83627ehf hwmon_vid coretemp autofs4 ohci_hcd hid_generic usbhid hid crc32c_intel psmouse ehci_pci uhci_hcd ehci_hcd usbcore usb_common sg sd_mod
[386646.744547] CPU: 2 PID: 831 Comm: l2arc_feed Tainted: P           O    4.5.5.remod.kvm.ovs.1 #13
[386646.744665] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
[386646.744755] task: ffff88082bebe780 ti: ffff880824150000 task.ti: ffff880824150000
[386646.744872] RIP: 0010:[<ffffffffa0b57063>]  [<ffffffffa0b57063>] buf_hash_remove+0x63/0xc0 [zfs]
[386646.747767] RSP: 0018:ffff880824153cc0  EFLAGS: 00010207
[386646.747814] RAX: 0000000000155ef6 RBX: ffffffffa0d83900 RCX: 0000000000000000
[386646.747925] RDX: 0000000000000000 RSI: ffff88006c191010 RDI: ffff88006c191000
[386646.748028] RBP: ffff880824153cc0 R08: 00b1b65f37241352 R09: 00b1b65f363f54af
[386646.748136] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88006c191000
[386646.748219] R13: ffffffffa0d83b80 R14: 0000000000000000 R15: 0000000000000000
[386646.748329] FS:  0000000000000000(0000) GS:ffff88042fc80000(0000) knlGS:0000000000000000
[386646.748455] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[386646.748526] CR2: 0000000000000020 CR3: 0000000001c0a000 CR4: 00000000000006e0
[386646.748611] Stack:
[386646.748661]  ffff880824153d10 ffffffffa0b588c5 ffff8800362553b0 ffff880036255380
[386646.748780]  ffff880036255030 0000000000001ef6 ffff88007215af70 00000000000f7b00
[386646.748895]  ffffffffa0d7b150 ffff88006c191000 ffff880824153d80 ffffffffa0b5a953
[386646.748985] Call Trace:
[386646.749058]  [<ffffffffa0b588c5>] arc_change_state.isra.15+0x235/0x290 [zfs]
[386646.749151]  [<ffffffffa0b5a953>] l2arc_evict+0x263/0x2e0 [zfs]
[386646.749233]  [<ffffffffa0b5add5>] l2arc_feed_thread+0x405/0xc70 [zfs]
[386646.749319]  [<ffffffffa0b5a9d0>] ? l2arc_evict+0x2e0/0x2e0 [zfs]
[386646.749380]  [<ffffffffa0162285>] thread_generic_wrapper+0x75/0x90 [spl]
[386646.749453]  [<ffffffffa0162210>] ? __thread_exit+0x20/0x20 [spl]
[386646.749530]  [<ffffffff810bb0d6>] kthread+0xd6/0xf0
[386646.749586]  [<ffffffff811c7999>] ? ____fput+0x9/0x10
[386646.749660]  [<ffffffff810bb000>] ? kthread_create_on_node+0x190/0x190
[386646.749713]  [<ffffffff8167e45f>] ret_from_fork+0x3f/0x70
[386646.749767]  [<ffffffff810bb000>] ? kthread_create_on_node+0x190/0x190
[386646.749837] Code: 75 e0 49 c1 e9 08 48 8b 15 03 c6 12 00 4d 31 c8 4c 31 c0 48 23 05 ee c5 12 00 48 8d 14 c2 48 8b 0a 48 39 f9 75 05 eb 10 48 89 d1 <48> 8b 51 20 48 39 fa 75 f4 48 8d 51 20 48 8b 4f 20 48 89 0a 81 
[386646.750152] RIP  [<ffffffffa0b57063>] buf_hash_remove+0x63/0xc0 [zfs]
[386646.750237]  RSP <ffff880824153cc0>
[386646.750286] CR2: 0000000000000020
[386646.750627] ---[ end trace 29c98269ed7b57dd ]---
tuomari commented 7 years ago

Got another null pointer. This time with the following ZFS version:

Kernel version: 4.7.6
ZFS: Loaded module v0.7.0-rc1_74_ge2c292b, ZFS pool version 5000, ZFS filesystem version 5
SPL: Loaded module v0.7.0-rc1_6_g341dfdb
[702839.040104] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[702839.040146] IP: [<ffffffffa0879063>] buf_hash_remove+0x63/0xb0 [zfs]
[702839.040221] PGD 0 
[702839.040236] Oops: 0000 [#1] SMP
[702839.040251] Modules linked in: zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) nls_utf8 hfsplus hfs vfat msdos fat jfs xfs binfmt_misc 8021q garp stp llc openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc iTCO_wdt gpio_ich iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper bttv ast pcspkr serio_raw tveeprom videobuf_dma_sg ttm tea575x videobuf_core drm_kms_helper rc_core syscopyarea v4l2_common sysfillrect sysimgblt fb_sys_fops videodev media drm agpgart lpc_ich mfd_core i2c_i801 snd_hda_intel snd_hda_codec snd_hda_core snd_pcm snd_timer snd soundcore
[702839.040730]  ioatdma dca i5500_temp i7core_edac edac_core btrfs shpchp ipmi_si ipmi_msghandler evdev acpi_cpufreq tpm_tis tpm w83795 w83627ehf hwmon_vid coretemp autofs4 hid_generic usbhid hid crc32c_intel psmouse ehci_pci uhci_hcd ehci_hcd usbcore usb_common sg sd_mod
[702839.040886] CPU: 1 PID: 1245 Comm: l2arc_feed Tainted: P           O    4.7.6.remod.kvm.ovs.1 #2
[702839.040927] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
[702839.041051] task: ffff88041d7d9700 ti: ffff8803cf534000 task.ti: ffff8803cf534000
[702839.041167] RIP: 0010:[<ffffffffa0879063>]  [<ffffffffa0879063>] buf_hash_remove+0x63/0xb0 [zfs]
[702839.041288] RSP: 0018:ffff8803cf537ce8  EFLAGS: 00010207
[702839.041359] RAX: 00000000003e7bf7 RBX: ffffffffa0aa6b00 RCX: 0000000000000000
[702839.041467] RDX: 0000000000000000 RSI: ffff880099720010 RDI: ffff880099720000
[702839.041583] RBP: ffff8803cf537ce8 R08: 0092c424cf88a137 R09: 0092c424ce932846
[702839.041667] R10: ffff88042fc59b00 R11: ffffea0015e9c0c0 R12: ffff880099720000
[702839.041789] R13: ffffffffa0aa6740 R14: 0000000000000000 R15: 0000000000000000
[702839.041897] FS:  0000000000000000(0000) GS:ffff88042fc40000(0000) knlGS:0000000000000000
[702839.042007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[702839.042063] CR2: 0000000000000020 CR3: 0000000001c06000 CR4: 00000000000006e0
[702839.042171] Stack:
[702839.042261]  ffff8803cf537d28 ffffffffa087aadb ffffffffa0a50e90 0000000000001bf7
[702839.042370]  ffff8805c48c50c0 00000000000dfb80 ffffffffa0a861d0 ffff880099720000
[702839.042512]  ffff8803cf537d98 ffffffffa087e20f ffff88039221f7b0 ffff880262cb4650
[702839.042632] Call Trace:
[702839.042706]  [<ffffffffa087aadb>] arc_change_state.isra.19+0x31b/0x390 [zfs]
[702839.042791]  [<ffffffffa087e20f>] l2arc_evict+0x25f/0x2e0 [zfs]
[702839.042901]  [<ffffffffa087e677>] l2arc_feed_thread+0x3e7/0xaa0 [zfs]
[702839.042994]  [<ffffffffa087e290>] ? l2arc_evict+0x2e0/0x2e0 [zfs]
[702839.043070]  [<ffffffffa0775285>] thread_generic_wrapper+0x75/0x90 [spl]
[702839.043152]  [<ffffffffa0775210>] ? __thread_exit+0x20/0x20 [spl]
[702839.043242]  [<ffffffff810c0b98>] kthread+0xe8/0x100
[702839.043306]  [<ffffffff816ad30f>] ret_from_fork+0x1f/0x40
[702839.043371]  [<ffffffff810c0ab0>] ? kthread_create_on_node+0x190/0x190
[702839.043451] Code: 75 e0 49 c1 e9 08 48 8b 15 03 d6 12 00 4d 31 c8 4c 31 c0 48 23 05 ee d5 12 00 48 8d 14 c2 48 8b 0a 48 39 f9 75 05 eb 10 48 89 d1 <48> 8b 51 20 48 39 fa 75 f4 48 8d 51 20 48 8b 4f 20 48 89 0a 83 
[702839.043762] RIP  [<ffffffffa0879063>] buf_hash_remove+0x63/0xb0 [zfs]
[702839.043854]  RSP <ffff8803cf537ce8>
[702839.043904] CR2: 0000000000000020
[702839.044239] ---[ end trace 8ec91d55103384f8 ]---
behlendorf commented 7 years ago

@tuomari can you fairly consistently reproduce this? If so, can you describe the workload?

The failure in the stack clearly involves the l2arc device, so if you need to resolve this immediately, you could probably remove the l2arc (cache) device for the time being.
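For example (device names taken from the zpool status output above; a cache device can be removed from a live pool and re-added later with zpool add):

$ sudo zpool remove tank ata-KINGSTON_SH103S3120G_50026B7334000234-part5
$ sudo zpool remove tank ata-Kingston_SSDNow_V_Series_128GB_06P970009771-part5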

However, if you're in a position to enable debugging on the system in question, that might help us identify the root cause. You'll just need to pass the --enable-debug flag when building the spl and zfs code.
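Roughly, a from-source build of that era would look like this (a sketch; adjust the paths to your checkout, and reload the modules or reboot afterwards so the debug builds are the ones running):

$ cd spl
$ ./autogen.sh && ./configure --enable-debug
$ make && sudo make install
$ cd ../zfs
$ ./autogen.sh && ./configure --enable-debug --with-spl=../spl
$ make && sudo make install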

For future reference:

$ gdb module/zfs/zfs.ko
...
Reading symbols from /g/g0/behlendo/src/git/zfs/module/zfs/zfs.ko...done.
(gdb) list *(buf_hash_remove+0x63)
0x93 is in buf_hash_remove (/g/g0/behlendo/src/git/zfs/module/zfs/arc.c:1102).
1097    
1098        ASSERT(MUTEX_HELD(BUF_HASH_LOCK(idx)));
1099        ASSERT(HDR_IN_HASH_TABLE(hdr));
1100    
1101        hdrp = &buf_hash_table.ht_table[idx];
1102        while ((fhdr = *hdrp) != hdr) {
1103            ASSERT3P(fhdr, !=, NULL);
1104            hdrp = &fhdr->b_hash_next;
1105        }
1106        *hdrp = hdr->b_hash_next;
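The same technique works for any frame in the trace; e.g. (gdb) list *(l2arc_evict+0x263) would map the caller's offset to its source line in the same build.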
behlendorf commented 7 years ago

Likely a duplicate of #5071

tuomari commented 7 years ago

It seems to pop up every week or so. I will gladly enable debugging.

My workload is mostly generated by ZoneMinder, which creates a lot of small files. I have 11 cameras, and each camera generates around ~1GB and 6000 files per 10-minute period, so in total roughly 18MB/s and 110 files/s. While the system tries to keep disk usage at 95% at all times, it generates about as many deletes as writes.

The system is also a netatalk server: it stores Time Machine backups for ~10 machines and receives offsite ZFS snapshots nightly.

behlendorf commented 7 years ago

I will gladly enable debugging.

Great, that should help. Given what you've reported thus far, I'd expect you to hit one of the ASSERTs.

While the system tries to keep disk usage at 95% at all times

You mean in terms of capacity, correct?

tuomari commented 7 years ago

You mean in terms of capacity, correct?

Yes.

I hit this early this morning. I'll gladly help in any way I can to resolve it.

SPL: Loaded module v0.7.0-rc1_8_g0d26756 (DEBUG mode)
ZFS: Loaded module v0.7.0-rc1_73_g5cc78dc (DEBUG mode), ZFS pool version 5000, ZFS filesystem version 5
[Tue Oct 18 03:29:32 2016] VERIFY(refcount_is_zero( &exists->b_l1hdr.b_refcnt)) failed
[Tue Oct 18 03:29:32 2016] PANIC at arc.c:5788:arc_write_done()
[Tue Oct 18 03:29:32 2016] Showing stack for process 1213
[Tue Oct 18 03:29:32 2016] CPU: 5 PID: 1213 Comm: z_wr_int_1 Tainted: P           O    4.7.6.remod.kvm.ovs.1 #2
[Tue Oct 18 03:29:32 2016] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
[Tue Oct 18 03:29:32 2016]  0000000000000000 ffff880426f0fa10 ffffffff8131012b ffffffffa0de5064
[Tue Oct 18 03:29:32 2016]  000000000000169c ffff880426f0fa20 ffffffffa03c4a5f ffff880426f0fba0
[Tue Oct 18 03:29:32 2016]  ffffffffa03c4b17 ffff88082a1f2180 0000000000000028 ffff880426f0fbb0
[Tue Oct 18 03:29:32 2016] Call Trace:
[Tue Oct 18 03:29:32 2016]  [<ffffffff8131012b>] dump_stack+0x4d/0x72
[Tue Oct 18 03:29:32 2016]  [<ffffffffa03c4a5f>] spl_dumpstack+0x3f/0x50 [spl]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa03c4b17>] spl_panic+0xa7/0xd0 [spl]
[Tue Oct 18 03:29:32 2016]  [<ffffffff81431eca>] ? scsi_request_fn+0x3a/0x650
[Tue Oct 18 03:29:32 2016]  [<ffffffff816aaef9>] ? __mutex_unlock_slowpath+0xa9/0x140
[Tue Oct 18 03:29:32 2016]  [<ffffffff816ab305>] ? mutex_lock+0x15/0x30
[Tue Oct 18 03:29:32 2016]  [<ffffffffa0c9634e>] arc_write_done+0x6ce/0x790 [zfs]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa0d82dee>] zio_done+0x38e/0x17f0 [zfs]
[Tue Oct 18 03:29:32 2016]  [<ffffffff811bb7cc>] ? kfree+0x14c/0x160
[Tue Oct 18 03:29:32 2016]  [<ffffffffa03bcd15>] ? spl_kmem_free+0x25/0x30 [spl]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa0d835f6>] zio_done+0xb96/0x17f0 [zfs]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa0d835f6>] zio_done+0xb96/0x17f0 [zfs]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa0d835f6>] zio_done+0xb96/0x17f0 [zfs]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa0d79fae>] zio_execute+0xee/0x2e0 [zfs]
[Tue Oct 18 03:29:32 2016]  [<ffffffffa03c1fe9>] taskq_thread+0x299/0x590 [spl]
[Tue Oct 18 03:29:32 2016]  [<ffffffff810c9fb0>] ? wake_up_q+0x70/0x70
[Tue Oct 18 03:29:32 2016]  [<ffffffffa03c1d50>] ? taskq_thread_should_stop+0x80/0x80 [spl]
[Tue Oct 18 03:29:32 2016]  [<ffffffff810c0b98>] kthread+0xe8/0x100
[Tue Oct 18 03:29:32 2016]  [<ffffffff816ad30f>] ret_from_fork+0x1f/0x40
[Tue Oct 18 03:29:32 2016]  [<ffffffff810c0ab0>] ? kthread_create_on_node+0x190/0x190
dweeezil commented 7 years ago

@tuomari Are you using large blocks? This trace and the fact that you're keeping the pool very full point at a problem with syncing gang blocks. As I mentioned in #5071, it sounds like these problems started after you upgraded from some older version of ZoL. If so, what was the older version? I'll try running some gang block stress tests.

tuomari commented 7 years ago

Are you using large blocks?

No (I think). The feature is enabled but not active:

tank  feature@large_blocks  enabled  local
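For anyone following along, that line is zpool get output; feature flags are queried like any other pool property:

$ zpool get feature@large_blocks tank
NAME  PROPERTY              VALUE    SOURCE
tank  feature@large_blocks  enabled  local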

Sorry, I can't pin down the specific version where this started. I had other problems before that caused the system to crash almost daily, and I was upgrading ZFS on almost every boot, so I have no specific version for when this problem appeared.

dweeezil commented 7 years ago

@tuomari You're correct: "enabled" means you're not using any large blocks. The reason I asked is that they'd amplify the number of gang blocks needed to satisfy allocations on a fragmented and/or almost-full pool. Going under the assumption that this is the same as, or related to, #5071, the ingredients seem to be:

tuomari commented 7 years ago

@dweeezil Yes, this is the same system as in #5071, although I have not observed any GPFs lately. Is there anything I could do to help pin down this problem? Btw, does xattr=sa cause gang blocks?

dweeezil commented 7 years ago

@tuomari No, xattr=sa would not. Your stack trace, combined with the rather unusual workload in which you're keeping the pool very full and doing lots of allocating and freeing, would definitely stress the handling of gang blocks. I've performed some rather long-running ztest instances with -g 513, and those haven't caught any issues.
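Those runs were along these lines (everything except the -g 513 gang threshold is illustrative here):

$ ztest -g 513 -T 7200 -V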

I'd be curious to know how many gang blocks your pool has and, much more interestingly, how deeply they're nested. Gang blocks can theoretically be arbitrarily nested, and it turns out that causes recursion in the zio pipeline when running zio_gang_tree_assemble() (in addition to the normal recursion which might happen). Since you're running a newer kernel with 16KiB stacks, ZoL isn't taking any special pains to reduce stack depth in the pipeline.

I've added a few hacks to my dweeezil:zdb branch. It adds two features to the -b option: first, the periodic output displays the number of gang blocks and, in brackets, the maximum gang block nesting depth seen so far; second, the final output displays the maximum depth. It would be very interesting if you could run this on your pool for a while and confirm that there are a lot of gang blocks and, more interestingly, see what the maximum depth is. You can build it standalone and run zdb from the build area as cmd/zdb/zdb -b <poolname> without installing it. Since your pool is pretty large, I don't expect you'd want to run it to completion; however, it generates data on the fly. I'd consider any nesting depth of 3 or higher to be very unusual.
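Concretely, something like this (the clone URL is just my branch spelled out; the sample status line below is illustrative, not from your pool):

$ git clone -b zdb https://github.com/dweeezil/zfs.git
$ cd zfs && ./autogen.sh && ./configure && make
$ cmd/zdb/zdb -b tank
...
gang=4[2] 120G completed (  20MB/s) estimated time remaining: ...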

tuomari commented 7 years ago

I started running zdb and will report after it has run for a while, or if anything interesting shows up.

I'm not sure whether these are related or not, but I got two asserts from zdb with @dweeezil's zdb branch. I first started zdb -b tank without screen and ctrl+c'd it. After that I kept hitting asserts, and zdb started correctly only after a reboot.

loading space map for vdev 0 of 3, metaslab 87 of 128 ...
hdr->b_l1hdr.b_bufcnt == 1 (0x13 == 0x1)
ASSERT at ../../module/zfs/arc.c:3017:arc_buf_destroy()
Aborted
loading space map for vdev 0 of 3, metaslab 50 of 128 ...
space_map_load(msp->ms_sm, msp->ms_tree, SM_ALLOC) == 0 (0x34 == 0x0)
ASSERT at zdb.c:2677:zdb_leak_init()
Aborted
dweeezil commented 7 years ago

@tuomari Traversing a live pool, particularly one that's close to full and in use, is going to be hit or miss. I was hoping it would run long enough that you could get some data from the new gang block output in the per-second status reports.

tuomari commented 7 years ago

@dweeezil: zdb is still running, and I will keep it running for the time being. Here are the current results:

gang=0[0] 1.01T completed (  19MB/s) estimated time remaining: 440hr 12min 11sec

I will report back periodically, or if anything interesting pops up.

dweeezil commented 7 years ago

@tuomari It's traversing very slowly, and no gang blocks so far. There definitely are some; however, it may be impractical to ever find them given how large your pool is. It might be a better idea to put the gang depth tracking patch into the kernel module and expose it as a per-pool kstat. In any case, we're looking for nonzero values where it says "0[0]": the first number is the count of gang blocks and the second is the maximum nesting depth seen so far.
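If we went that route, it would presumably show up under the pool's kstat directory, something like (the kstat name here is hypothetical):

$ cat /proc/spl/kstat/zfs/tank/gang_depth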

tuomari commented 7 years ago

@dweeezil I hit another assert. Is this a known problem? If running zdb gives you anything interesting, I have no problem running these for extended periods of time.

gang=0[0] 5.08T completed (  17MB/s) estimated time remaining: 433hr 58min 19sec
dmu_bonus_hold(os, object, dl, &dl->dl_dbuf) (0x0 == 0x34)
ASSERT at ../../module/zfs/dsl_deadlist.c:99:dsl_deadlist_open()
Aborted
tuomari commented 7 years ago

I also hit an assert today.

zfs version: 0.7.0-rc1_73_g5cc78dc spl version: 0.7.0-rc1_8_g0d26756 kernel: 4.7.6

[Sat Oct 29 00:23:34 2016] VERIFY(zio->io_children[c][w] == 0) failed
[Sat Oct 29 00:23:34 2016] PANIC at zio.c:3326:zio_done()
[Sat Oct 29 00:23:34 2016] Showing stack for process 21555
[Sat Oct 29 00:23:34 2016] CPU: 1 PID: 21555 Comm: zfs Tainted: P           O    4.7.6.remod.kvm.ovs.1 #2
[Sat Oct 29 00:23:34 2016] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
[Sat Oct 29 00:23:34 2016]  0000000000000000 ffff8801b02fb550 ffffffff8131012b ffffffffa118159c
[Sat Oct 29 00:23:34 2016]  0000000000000cfe ffff8801b02fb560 ffffffffa01d3a5f ffff8801b02fb6e0
[Sat Oct 29 00:23:34 2016]  ffffffffa01d3b17 0000000000000000 ffff880400000028 ffff8801b02fb6f0
[Sat Oct 29 00:23:34 2016] Call Trace:
[Sat Oct 29 00:23:34 2016]  [<ffffffff8131012b>] dump_stack+0x4d/0x72
[Sat Oct 29 00:23:34 2016]  [<ffffffffa01d3a5f>] spl_dumpstack+0x3f/0x50 [spl]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa01d3b17>] spl_panic+0xa7/0xd0 [spl]
[Sat Oct 29 00:23:34 2016]  [<ffffffff816aceb9>] ? _raw_spin_unlock_irq+0x9/0x10
[Sat Oct 29 00:23:34 2016]  [<ffffffff810c622a>] ? finish_task_switch+0x7a/0x200
[Sat Oct 29 00:23:34 2016]  [<ffffffff816a82f6>] ? __schedule+0x1d6/0x680
[Sat Oct 29 00:23:34 2016]  [<ffffffff816aaef9>] ? __mutex_unlock_slowpath+0xa9/0x140
[Sat Oct 29 00:23:34 2016]  [<ffffffff816ace89>] ? _raw_spin_unlock+0x9/0x10
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10ce4ec>] zio_done+0xa8c/0x17f0 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10cd597>] ? zio_ready+0x257/0x720 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffff816ace89>] ? _raw_spin_unlock+0x9/0x10
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10ca87e>] zio_nowait+0xce/0x280 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa0ff1464>] dbuf_prefetch+0x4d4/0x610 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa0ffcb2c>] dmu_prefetch+0x21c/0x240 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa100a402>] receive_read_prefetch+0x162/0x1c0 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa100d6fe>] dmu_recv_stream+0xb2e/0xff0 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10a175c>] zfs_ioc_recv_impl+0x11c/0xd50 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10a250d>] zfs_ioc_recv+0x17d/0x220 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa0fef296>] ? dbuf_rele+0x36/0x70 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10167ec>] ? dnode_rele_and_unlock+0x7c/0xc0 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffffa10a38fe>] zfsdev_ioctl+0x60e/0x720 [zfs]
[Sat Oct 29 00:23:34 2016]  [<ffffffff811e7a6b>] do_vfs_ioctl+0x8b/0x5a0
[Sat Oct 29 00:23:34 2016]  [<ffffffff810946da>] ? __do_page_fault+0x1ca/0x430
[Sat Oct 29 00:23:34 2016]  [<ffffffff811e7ff4>] SyS_ioctl+0x74/0x80
[Sat Oct 29 00:23:34 2016]  [<ffffffff816ad11b>] entry_SYSCALL_64_fastpath+0x13/0x8f
behlendorf commented 7 years ago

@tuomari the deadlist issue mentioned above may be resolved by https://github.com/zfsonlinux/zfs/pull/5752/commits/deb33f347c4bb30c3d200e042473367a41d55f17, which is undergoing review.

tuomari commented 7 years ago

@behlendorf Great! I'll see if I can get the patch running.

I just hit this a couple of days ago; it might be related:

ZFS:  0.7.0-rc3_73_g582cc01
SPL:  0.7.0-rc3_2_g9704820
Feb 10 00:30:38 helvi kernel: [53057.736217] BUG: unable to handle kernel paging request at 0000000037333133
Feb 10 00:30:38 helvi kernel: [53057.736256] IP: [<ffffffffa0069530>] avl_destroy_nodes+0xc0/0x2c0 [zavl]
Feb 10 00:30:38 helvi kernel: [53057.736290] PGD 0 
Feb 10 00:30:38 helvi kernel: [53057.736299] 
Feb 10 00:30:38 helvi kernel: [53057.736310] Oops: 0000 [#1] SMP
Feb 10 00:30:38 helvi kernel: [53057.736325] Modules linked in: binfmt_misc 8021q garp ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter ip_tables xt_conntrack x_tables br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc zfs(PO) zunicode(PO) zavl(PO) icp(PO) iTCO_wdt iTCO_vendor_support gpio_ich crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw pcspkr i2c_i801 i2c_smbus bttv ast tveeprom videobuf_dma_sg ttm tea575x videobuf_core drm_kms_helper rc_core syscopyarea v4l2_common sysfillrect sysimgblt
Feb 10 00:30:38 helvi kernel: [53057.736997]  fb_sys_fops videodev zcommon(PO) media drm agpgart lpc_ich mfd_core znvpair(PO) spl(O) snd_hda_intel snd_hda_codec snd_hda_core btrfs snd_pcm i7core_edac snd_timer ioatdma snd soundcore zlib_deflate edac_core i5500_temp dca shpchp ipmi_si ipmi_msghandler evdev acpi_cpufreq tpm_tis tpm_tis_core tpm w83795 w83627ehf hwmon_vid coretemp autofs4 hid_generic usbhid hid crc32c_intel psmouse uhci_hcd ehci_pci ehci_hcd usbcore usb_common sg sd_mod
Feb 10 00:30:38 helvi kernel: [53057.737490] CPU: 11 PID: 2591 Comm: txg_sync Tainted: P           O    4.9.1.remod.kvm.ovs.1 #1
Feb 10 00:30:38 helvi kernel: [53057.737594] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
Feb 10 00:30:38 helvi kernel: [53057.737712] task: ffff880420d169c0 task.stack: ffffc9000e3c4000
Feb 10 00:30:38 helvi kernel: [53057.737762] RIP: 0010:[<ffffffffa0069530>]  [<ffffffffa0069530>] avl_destroy_nodes+0xc0/0x2c0 [zavl]
Feb 10 00:30:38 helvi kernel: [53057.737884] RSP: 0018:ffffc9000e3c7be0  EFLAGS: 00010206
Feb 10 00:30:38 helvi kernel: [53057.737951] RAX: 0000000037333133 RBX: 0000000037333133 RCX: ffffea00066c7680
Feb 10 00:30:38 helvi kernel: [53057.738042] RDX: 0000000000000005 RSI: ffffc9000e3c7c20 RDI: ffff88041dc7f400
Feb 10 00:30:38 helvi kernel: [53057.738109] RBP: ffffc9000e3c7c10 R08: 0000000000000001 R09: ffff8800810ea140
Feb 10 00:30:38 helvi kernel: [53057.738199] R10: ffff88042fddb201 R11: ffffea0002043a80 R12: ffff8800810ea000
Feb 10 00:30:38 helvi kernel: [53057.738307] R13: ffffc9000e3c7c20 R14: 0000000000000000 R15: ffff88041dc7f400
Feb 10 00:30:38 helvi kernel: [53057.738423] FS:  0000000000000000(0000) GS:ffff88042fdc0000(0000) knlGS:0000000000000000
Feb 10 00:30:38 helvi kernel: [53057.738540] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 10 00:30:38 helvi kernel: [53057.738640] CR2: 0000000037333133 CR3: 0000000001c08000 CR4: 00000000000006e0
Feb 10 00:30:38 helvi kernel: [53057.738746] Stack:
Feb 10 00:30:38 helvi kernel: [53057.738811]  ffffffffa03dd3e7 ffff8800810ea680 0000000000000000 ffff88041dc7f400
Feb 10 00:30:38 helvi kernel: [53057.738929]  0000000000000000 ffff88041dc7f000 ffffc9000e3c7c48 ffffffffa0ccbf0f
Feb 10 00:30:38 helvi kernel: [53057.739080]  ffff8800810ea140 ffff88041dc7f000 00000000013154b7 ffff88041dc7f010
Feb 10 00:30:38 helvi kernel: [53057.739235] Call Trace:
Feb 10 00:30:38 helvi kernel: [53057.739291]  [<ffffffffa03dd3e7>] ? spl_kmem_cache_free+0x147/0x260 [spl]
Feb 10 00:30:38 helvi kernel: [53057.739443]  [<ffffffffa0ccbf0f>] range_tree_vacate+0x7f/0xe0 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.739592]  [<ffffffffa0cc584a>] metaslab_unload+0x2a/0x80 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.739695]  [<ffffffffa0cc6b06>] metaslab_sync_done+0x2e6/0xbf0 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.739883]  [<ffffffffa0cf26e4>] vdev_sync_done+0x44/0xa0 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.740012]  [<ffffffffa0cd7f21>] spa_sync+0x6b1/0x1220 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.740089]  [<ffffffff810d117d>] ? default_wake_function+0xd/0x10
Feb 10 00:30:38 helvi kernel: [53057.740236]  [<ffffffffa0cede57>] txg_sync_thread+0x2d7/0x4f0 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.740387]  [<ffffffffa0cedb80>] ? txg_init+0x280/0x280 [zfs]
Feb 10 00:30:38 helvi kernel: [53057.740451]  [<ffffffffa03df66f>] thread_generic_wrapper+0x7f/0xc0 [spl]
Feb 10 00:30:38 helvi kernel: [53057.740548]  [<ffffffffa03df5f0>] ? __thread_exit+0x20/0x20 [spl]
Feb 10 00:30:38 helvi kernel: [53057.740652]  [<ffffffff810c7079>] kthread+0xe9/0x100
Feb 10 00:30:38 helvi kernel: [53057.740745]  [<ffffffff810c6f90>] ? kthread_park+0x60/0x60
Feb 10 00:30:38 helvi kernel: [53057.740804]  [<ffffffff810c6f90>] ? kthread_park+0x60/0x60
Feb 10 00:30:38 helvi kernel: [53057.740894]  [<ffffffff816d5cd2>] ret_from_fork+0x22/0x30
Feb 10 00:30:38 helvi kernel: [53057.740988] Code: 06 a0 48 c7 c1 c3 b7 06 a0 31 c0 ba 1d 04 00 00 48 c7 c6 20 b1 06 a0 48 c7 c7 10 b2 06 a0 e8 18 a6 37 00 eb 61 49 89 dc 48 89 c3 <48> 8b 03 48 85 c0 75 f2 48 8b 43 10 48 8b 53 08 83 e0 03 48 85 
Feb 10 00:30:38 helvi kernel: [53057.741410] RIP  [<ffffffffa0069530>] avl_destroy_nodes+0xc0/0x2c0 [zavl]
Feb 10 00:30:38 helvi kernel: [53057.741502]  RSP <ffffc9000e3c7be0>
Feb 10 00:30:38 helvi kernel: [53057.741551] CR2: 0000000037333133
Feb 10 00:30:38 helvi kernel: [53057.741927] ---[ end trace a4289853d827430b ]---
Feb 10 00:30:38 helvi kernel: [53057.742078] BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
Feb 10 00:30:38 helvi kernel: [53057.742329] IP: [<ffffffff810e4875>] __wake_up_common+0x25/0x80
Feb 10 00:30:38 helvi kernel: [53057.742505] PGD 0 
Feb 10 00:30:38 helvi kernel: [53057.742559] 
Feb 10 00:30:38 helvi kernel: [53057.742689] Oops: 0000 [#2] SMP
Feb 10 00:30:38 helvi kernel: [53057.742794] Modules linked in: binfmt_misc 8021q garp ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter ip_tables xt_conntrack x_tables br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc zfs(PO) zunicode(PO) zavl(PO) icp(PO) iTCO_wdt iTCO_vendor_support gpio_ich crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw pcspkr i2c_i801 i2c_smbus bttv ast tveeprom videobuf_dma_sg ttm tea575x videobuf_core drm_kms_helper rc_core syscopyarea v4l2_common sysfillrect sysimgblt
Feb 10 00:30:38 helvi kernel: [53057.746682]  fb_sys_fops videodev zcommon(PO) media drm agpgart lpc_ich mfd_core znvpair(PO) spl(O) snd_hda_intel snd_hda_codec snd_hda_core btrfs snd_pcm i7core_edac snd_timer ioatdma snd soundcore zlib_deflate edac_core i5500_temp dca shpchp ipmi_si ipmi_msghandler evdev acpi_cpufreq tpm_tis tpm_tis_core tpm w83795 w83627ehf hwmon_vid coretemp autofs4 hid_generic usbhid hid crc32c_intel psmouse uhci_hcd ehci_pci ehci_hcd usbcore usb_common sg sd_mod
Feb 10 00:30:38 helvi kernel: [53057.749387] CPU: 11 PID: 2591 Comm: txg_sync Tainted: P      D    O    4.9.1.remod.kvm.ovs.1 #1
Feb 10 00:30:38 helvi kernel: [53057.749599] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
Feb 10 00:30:38 helvi kernel: [53057.749796] task: ffff880420d169c0 task.stack: ffffc9000e3c4000
Feb 10 00:30:38 helvi kernel: [53057.749934] RIP: 0010:[<ffffffff810e4875>]  [<ffffffff810e4875>] __wake_up_common+0x25/0x80
Feb 10 00:30:38 helvi kernel: [53057.750140] RSP: 0018:ffffc9000e3c7e50  EFLAGS: 00010086
Feb 10 00:30:38 helvi kernel: [53057.750267] RAX: 0000000000000286 RBX: ffffc9000e3c7f08 RCX: 0000000000000000
Feb 10 00:30:38 helvi kernel: [53057.750394] RDX: 0000000000000009 RSI: 0000000000000003 RDI: ffffc9000e3c7f08
Feb 10 00:30:38 helvi kernel: [53057.750514] RBP: ffffc9000e3c7e88 R08: 0000000000000000 R09: 0000000000000000
Feb 10 00:30:38 helvi kernel: [53057.750667] R10: 00000000000005e9 R11: ffffffff81882480 R12: ffffc9000e3c7f08
Feb 10 00:30:38 helvi kernel: [53057.750788] R13: 0000000000000286 R14: 0000000000000000 R15: 0000000000000003
Feb 10 00:30:38 helvi kernel: [53057.750920] FS:  0000000000000000(0000) GS:ffff88042fdc0000(0000) knlGS:0000000000000000
Feb 10 00:30:38 helvi kernel: [53057.751131] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 10 00:30:38 helvi kernel: [53057.751226] CR2: 0000000000000009 CR3: 0000000001c08000 CR4: 00000000000006e0
Feb 10 00:30:38 helvi kernel: [53057.751334] Stack:
Feb 10 00:30:38 helvi kernel: [53057.751414]  0000000181e86858 0000000000000000 ffffc9000e3c7f08 ffffc9000e3c7f00
Feb 10 00:30:38 helvi kernel: [53057.751724]  0000000000000286 0000000000000000 000000000000000b ffffc9000e3c7e98
Feb 10 00:30:38 helvi kernel: [53057.752002]  ffffffff810e48de ffffc9000e3c7ec0 ffffffff810e52b2 ffff880420d17000
Feb 10 00:30:38 helvi kernel: [53057.752292] Call Trace:
Feb 10 00:30:38 helvi kernel: [53057.752371]  [<ffffffff810e48de>] __wake_up_locked+0xe/0x10
Feb 10 00:30:38 helvi kernel: [53057.752472]  [<ffffffff810e52b2>] complete+0x32/0x50
Feb 10 00:30:38 helvi kernel: [53057.752572]  [<ffffffff810a75e0>] mm_release+0xb0/0x130
Feb 10 00:30:38 helvi kernel: [53057.752669]  [<ffffffff810ad4c2>] do_exit+0x132/0xb10
Feb 10 00:30:38 helvi kernel: [53057.752761]  [<ffffffff816d7997>] rewind_stack_do_exit+0x17/0x20
Feb 10 00:30:38 helvi kernel: [53057.752869] Code: 00 00 00 00 00 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 53 48 83 ec 10 89 55 cc 48 8b 57 18 4c 89 45 d0 <4c> 8b 02 48 8d 42 e8 48 39 c7 4d 8d 68 e8 75 05 eb 34 49 89 d5 
Feb 10 00:30:38 helvi kernel: [53057.755564] RIP  [<ffffffff810e4875>] __wake_up_common+0x25/0x80
Feb 10 00:30:38 helvi kernel: [53057.755707]  RSP <ffffc9000e3c7e50>
Feb 10 00:30:38 helvi kernel: [53057.755780] CR2: 0000000000000009
Feb 10 00:30:38 helvi kernel: [53057.755871] ---[ end trace a4289853d827430c ]---
Feb 10 00:30:38 helvi kernel: [53057.755967] Fixing recursive fault but reboot is needed!
mgmartin commented 7 years ago

I had a machine hit what looks like a similar panic to the original post last night. Running the latest git of spl/zfs and kernel 4.9.31:

[52443.038324] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[52443.038345] IP: [] buf_hash_remove+0x65/0xb0 [zfs]
[52443.038368] PGD 0
[52443.038374] Oops: 0000 [#1] SMP
[52443.038376] Modules linked in: dm_crypt algif_skcipher af_alg dm_mod ext4 crc16 jbd2 fscrypto mbcache loop vhost_net vhost macvtap macvlan tun ebtable_filter ebtables ip6_tables iptable_filter netconsole 8021q mrp bonding bridge stp llc msr uinput nct6775 hwmon_vid snd_hda_codec_hdmi eeepc_wmi asus_wmi sparse_keymap iTCO_wdt iTCO_vendor_support rfkill mxm_wmi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel uvcvideo nls_iso8859_1 aesni_intel videobuf2_vmalloc aes_x86_64 videobuf2_memops lrw gf128mul videobuf2_v4l2 glue_helper videobuf2_core ablk_helper cryptd nls_cp437 snd_usb_audio videodev i2c_i801 vfat snd_usbmidi_lib intel_cstate fat pcspkr psmouse i2c_smbus mousedev media snd_rawmidi input_leds intel_rapl_perf snd_hda_intel led_class snd_seq_device evdev joydev snd_hda_codec mac_hid snd_hda_core snd_hwdep r8169 snd_pcm snd_timer mii snd soundcore e1000e lpc_ich ie31200_edac mei_me mei edac_core ptp shpchp fan pps_core thermal battery tpm_tis fjes tpm_tis_core wmi tpm video button sch_fq_codel usbip_host usbip_core sg nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables xfs libcrc32c crc32c_generic hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid sr_mod cdrom sd_mod uas usb_storage serio_raw atkbd libps2 ahci xhci_pci libahci ehci_pci xhci_hcd ehci_hcd libata crc32c_intel scsi_mod usbcore usb_common i8042 serio nvidia_drm(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm nvidia_uvm(PO) nvidia_modeset(PO) nvidia(PO) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O)
[52443.038685] CPU: 2 PID: 75 Comm: kswapd0 Tainted: P           O    4.9.31-mgm #1
[52443.038688] Hardware name: System manufacturer System Product Name/P8Z77-V DELUXE, BIOS 2104 08/13/2013
[52443.038692] task: ffff880602bfce00 task.stack: ffffc900037e8000
[52443.038694] RIP: 0010:[]  [] buf_hash_remove+0x65/0xb0 [zfs]
[52443.038708] RSP: 0018:ffffc900037ebb78  EFLAGS: 00010207
[52443.038711] RAX: 00000000003e0776 RBX: ffff880600004000 RCX: 0000000000000000
[52443.038714] RDX: ffffc90009f04bb0 RSI: ffffffffa0260e40 RDI: ffff880600004000
[52443.038717] RBP: ffffffffa02e1a40 R08: ffff880600004010 R09: c82a532345e3bf21
[52443.038719] R10: 0000000000046ab6 R11: ffff88061ec97f20 R12: 0000000000000000
[52443.038721] R13: ffffffffa02e19c0 R14: 0000000000000001 R15: 0000000000000001
[52443.038724] FS:  0000000000000000(0000) GS:ffff88061ec80000(0000) knlGS:0000000000000000
[52443.038726] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[52443.038729] CR2: 0000000000000020 CR3: 0000000002807000 CR4: 00000000001426e0
[52443.038732] Stack:
[52443.038735]  ffffffffa00ffc85 0000000201314148 000000000001dd80 ffff8805ff1ca980
[52443.038748]  0000000000004000 ffff880600004000 0000000000000001 ffff88028c6a87b0
[52443.038758]  ffffffffa01040ef 00000000005f2a70 0000000000000000 0000000500000003
[52443.038767] Call Trace:
[52443.038777]  [] ? arc_change_state.isra.13+0x195/0x3d0 [zfs]
[52443.038786]  [] ? arc_evict_state+0x60f/0x7c0 [zfs]
[52443.038799]  [] ? spa_get_random+0x34/0x60 [zfs]
[52443.038804]  [] ? mutex_lock+0x9/0x30
[52443.038813]  [] ? arc_adjust+0x398/0x520 [zfs]
[52443.038822]  [] ? __arc_shrinker_func.isra.26+0xd2/0x140 [zfs]
[52443.038832]  [] ? arc_shrinker_func_scan_objects+0x1f/0x30 [zfs]
[52443.038836]  [] ? shrink_slab.part.6+0x1b4/0x2b0
[52443.038840]  [] ? shrink_node+0x2d9/0x2f0
[52443.038843]  [] ? kswapd+0x2b2/0x680
[52443.038846]  [] ? mem_cgroup_shrink_node+0xb0/0xb0
[52443.038850]  [] ? kthread+0xd2/0xf0
[52443.038853]  [] ? kthread_park+0x50/0x50
[52443.038857]  [] ? ret_from_fork+0x25/0x30
[52443.038860] Code: 89 ca 48 c1 ea 08 4c 31 d2 48 31 d0 48 8b 15 9b 3a 16 00 48 23 05 8c 3a 16 00 48 8d 14 c2 48 8b 0a 48 39 f9 75 05 eb 10 48 89 d1 <48> 8b 51 20 48 39 d7 75 f4 48 8d 51 20 48 8b 4f 20 48 89 0a 83 
[52443.038972] RIP  [] buf_hash_remove+0x65/0xb0 [zfs]
[52443.038982]  RSP 
[52443.038984] CR2: 0000000000000020
[52443.038989] ---[ end trace 51076e126836969e ]---
[52443.038992] Kernel panic - not syncing: Fatal exception
tuomari commented 7 years ago

I hit the previously mentioned BUG again:

The system had almost two months of uptime after I hit #6220.

Jul  9 04:53:22 helvi kernel: [96828.734028] BUG: unable to handle kernel paging request at ffffebe004000020
Jul  9 04:53:22 helvi kernel: [96828.734087] IP: [<ffffffff811ca9d0>] kmem_cache_free+0x60/0x1d0
Jul  9 04:53:22 helvi kernel: [96828.734141] PGD 0 
Jul  9 04:53:22 helvi kernel: [96828.734149] 
Jul  9 04:53:22 helvi kernel: [96828.734187] Oops: 0000 [#1] SMP
Jul  9 04:53:22 helvi kernel: [96828.734227] Modules linked in: binfmt_misc iscsi_target_mod target_core_mod 8021q garp ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter ip_tables xt_conntrack x_tables br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack configfs nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc iTCO_wdt gpio_ich iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw pcspkr bttv tveeprom videobuf_dma_sg ast tea575x ftdi_sio videobuf_core ttm rc_core usbserial drm_kms_helper v4l2_common syscopyarea sysfillrect sysimgblt
Jul  9 04:53:22 helvi kernel: [96828.734869]  videodev fb_sys_fops zfs(PO) media drm agpgart i2c_i801 lpc_ich zunicode(PO) i2c_smbus mfd_core zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs snd_hda_intel snd_hda_codec zlib_deflate snd_hda_core snd_pcm snd_timer i7core_edac ioatdma snd soundcore edac_core i5500_temp dca shpchp ipmi_si ipmi_msghandler evdev acpi_cpufreq tpm_tis tpm_tis_core tpm w83795 w83627ehf hwmon_vid coretemp autofs4 hid_generic usbhid hid usb_storage crc32c_intel psmouse ehci_pci uhci_hcd ehci_hcd usbcore usb_common sg sd_mod
Jul  9 04:53:22 helvi kernel: [96828.735315] CPU: 2 PID: 943 Comm: arc_reclaim Tainted: P           O    4.9.20.iudex.kvm.ovs.1 #1
Jul  9 04:53:22 helvi kernel: [96828.735400] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
Jul  9 04:53:22 helvi kernel: [96828.735487] task: ffff880826e0af00 task.stack: ffffc900041dc000
Jul  9 04:53:22 helvi kernel: [96828.735537] RIP: 0010:[<ffffffff811ca9d0>]  [<ffffffff811ca9d0>] kmem_cache_free+0x60/0x1d0
Jul  9 04:53:22 helvi kernel: [96828.735622] RSP: 0018:ffffc900041dfc28  EFLAGS: 00010286
Jul  9 04:53:22 helvi kernel: [96828.735669] RAX: ffffebe004000000 RBX: 0000000100000000 RCX: 00000000483da600
Jul  9 04:53:22 helvi kernel: [96828.735723] RDX: 000077ff80000000 RSI: 0000000100000000 RDI: ffff88042f802dc0
Jul  9 04:53:22 helvi kernel: [96828.735776] RBP: ffffc900041dfc40 R08: 0000000000000002 R09: ffffffffa03e7ab5
Jul  9 04:53:22 helvi kernel: [96828.735829] R10: 0000000000000000 R11: ffffea0000000000 R12: ffff88042f802dc0
Jul  9 04:53:22 helvi kernel: [96828.735882] R13: ffff880109ac95e0 R14: ffff880829d50c00 R15: 0000000000000002
Jul  9 04:53:22 helvi kernel: [96828.735936] FS:  0000000000000000(0000) GS:ffff88042fc80000(0000) knlGS:0000000000000000
Jul  9 04:53:22 helvi kernel: [96828.736018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul  9 04:53:22 helvi kernel: [96828.736067] CR2: ffffebe004000020 CR3: 0000000001c08000 CR4: 00000000000006e0
Jul  9 04:53:22 helvi kernel: [96828.736120] Stack:
Jul  9 04:53:22 helvi kernel: [96828.736155]  ffff88042c700800 0000000100000000 ffff880109ac95e0 ffffc900041dfc78
Jul  9 04:53:22 helvi kernel: [96828.736243]  ffffffffa03e7ab5 000000000000000f 0000000100000000 ffff880109ac95e0
Jul  9 04:53:22 helvi kernel: [96828.736330]  ffff880829d50c00 0000000000000002 ffffc900041dfc98 ffffffffa06963b5
Jul  9 04:53:22 helvi kernel: [96828.736450] Call Trace:
Jul  9 04:53:22 helvi kernel: [96828.736497]  [<ffffffffa03e7ab5>] spl_kmem_cache_free+0x115/0x190 [spl]
Jul  9 04:53:22 helvi kernel: [96828.736597]  [<ffffffffa06963b5>] zio_data_buf_free+0x55/0x60 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.736673]  [<ffffffffa05e3955>] abd_free+0x35/0x230 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.736747]  [<ffffffffa05e8312>] arc_hdr_free_pabd+0x62/0x1a0 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.736824]  [<ffffffffa05eb071>] arc_evict_state+0x691/0x7f0 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.736900]  [<ffffffffa05eb259>] arc_adjust_impl.constprop.34+0x29/0x30 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.737006]  [<ffffffffa05eb628>] arc_adjust+0x3c8/0x480 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.737100]  [<ffffffffa05ec26c>] arc_reclaim_thread+0x7c/0x230 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.737179]  [<ffffffffa05ec1f0>] ? arc_shrink+0xb0/0xb0 [zfs]
Jul  9 04:53:22 helvi kernel: [96828.737244]  [<ffffffffa03e9356>] thread_generic_wrapper+0x76/0x90 [spl]
Jul  9 04:53:22 helvi kernel: [96828.737322]  [<ffffffffa03e92e0>] ? __thread_exit+0x20/0x20 [spl]
Jul  9 04:53:22 helvi kernel: [96828.737373]  [<ffffffff810c7069>] kthread+0xe9/0x100
Jul  9 04:53:22 helvi kernel: [96828.737420]  [<ffffffff810c6f80>] ? kthread_park+0x60/0x60
Jul  9 04:53:22 helvi kernel: [96828.737469]  [<ffffffff816d75d2>] ret_from_fork+0x22/0x30
Jul  9 04:53:22 helvi kernel: [96828.737516] Code: 00 00 00 80 ff 77 00 00 49 bb 00 00 00 00 00 ea ff ff 48 01 d8 48 0f 42 15 4e 26 a4 00 48 01 d0 48 c1 e8 0c 48 c1 e0 06 4c 01 d8 <48> 8b 50 20 4c 8d 5a ff 83 e2 01 4c 0f 44 d8 4c 8b 17 65 49 8b 
Jul  9 04:53:22 helvi kernel: [96828.737798] RIP  [<ffffffff811ca9d0>] kmem_cache_free+0x60/0x1d0
Jul  9 04:53:22 helvi kernel: [96828.737850]  RSP <ffffc900041dfc28>
Jul  9 04:53:22 helvi kernel: [96828.737890] CR2: ffffebe004000020
Jul  9 04:53:22 helvi kernel: [96828.738197] ---[ end trace 185899d637de6a50 ]---
Jul  9 04:53:22 helvi kernel: [96828.738295] BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
Jul  9 04:53:22 helvi kernel: [96828.738481] IP: [<ffffffff810e4885>] __wake_up_common+0x25/0x80
Jul  9 04:53:22 helvi kernel: [96828.738599] PGD 0 
Jul  9 04:53:22 helvi kernel: [96828.738640] 
Jul  9 04:53:22 helvi kernel: [96828.738741] Oops: 0000 [#2] SMP
Jul  9 04:53:22 helvi kernel: [96828.738813] Modules linked in: binfmt_misc iscsi_target_mod target_core_mod 8021q garp ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter ip_tables xt_conntrack x_tables br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack configfs nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc iTCO_wdt gpio_ich iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw pcspkr bttv tveeprom videobuf_dma_sg ast tea575x ftdi_sio videobuf_core ttm rc_core usbserial drm_kms_helper v4l2_common syscopyarea sysfillrect sysimgblt
Jul  9 04:53:22 helvi kernel: [96828.741875]  videodev fb_sys_fops zfs(PO) media drm agpgart i2c_i801 lpc_ich zunicode(PO) i2c_smbus mfd_core zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs snd_hda_intel snd_hda_codec zlib_deflate snd_hda_core snd_pcm snd_timer i7core_edac ioatdma snd soundcore edac_core i5500_temp dca shpchp ipmi_si ipmi_msghandler evdev acpi_cpufreq tpm_tis tpm_tis_core tpm w83795 w83627ehf hwmon_vid coretemp autofs4 hid_generic usbhid hid usb_storage crc32c_intel psmouse ehci_pci uhci_hcd ehci_hcd usbcore usb_common sg sd_mod
Jul  9 04:53:22 helvi kernel: [96828.744212] CPU: 2 PID: 943 Comm: arc_reclaim Tainted: P      D    O    4.9.20.iudex.kvm.ovs.1 #1
Jul  9 04:53:22 helvi kernel: [96828.744330] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
Jul  9 04:53:22 helvi kernel: [96828.744450] task: ffff880826e0af00 task.stack: ffffc900041dc000
Jul  9 04:53:22 helvi kernel: [96828.744531] RIP: 0010:[<ffffffff810e4885>]  [<ffffffff810e4885>] __wake_up_common+0x25/0x80
Jul  9 04:53:22 helvi kernel: [96828.744683] RSP: 0018:ffffc900041dfe50  EFLAGS: 00010086
Jul  9 04:53:22 helvi kernel: [96828.744763] RAX: 0000000000000286 RBX: ffffc900041dff08 RCX: 0000000000000000
Jul  9 04:53:22 helvi kernel: [96828.744848] RDX: 0000000000000009 RSI: 0000000000000003 RDI: ffffc900041dff08
Jul  9 04:53:22 helvi kernel: [96828.744934] RBP: ffffc900041dfe88 R08: 0000000000000000 R09: 0000000000000000
Jul  9 04:53:22 helvi kernel: [96828.745019] R10: 00000000000006ea R11: ffffffff81882500 R12: ffffc900041dff08
Jul  9 04:53:22 helvi kernel: [96828.745104] R13: 0000000000000286 R14: 0000000000000000 R15: 0000000000000003
Jul  9 04:53:22 helvi kernel: [96828.745190] FS:  0000000000000000(0000) GS:ffff88042fc80000(0000) knlGS:0000000000000000
Jul  9 04:53:22 helvi kernel: [96828.745305] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul  9 04:53:22 helvi kernel: [96828.745386] CR2: 0000000000000009 CR3: 0000000001c08000 CR4: 00000000000006e0
Jul  9 04:53:22 helvi kernel: [96828.745471] Stack:
Jul  9 04:53:22 helvi kernel: [96828.745539]  00000001000006ea 0000000000000000 ffffc900041dff08 ffffc900041dff00
Jul  9 04:53:22 helvi kernel: [96828.745790]  0000000000000286 0000000000000000 000000000000000b ffffc900041dfe98
Jul  9 04:53:22 helvi kernel: [96828.746042]  ffffffff810e48ee ffffc900041dfec0 ffffffff810e52c2 ffff880826e0b540
Jul  9 04:53:22 helvi kernel: [96828.746294] Call Trace:
Jul  9 04:53:22 helvi kernel: [96828.746366]  [<ffffffff810e48ee>] __wake_up_locked+0xe/0x10
Jul  9 04:53:22 helvi kernel: [96828.746447]  [<ffffffff810e52c2>] complete+0x32/0x50
Jul  9 04:53:22 helvi kernel: [96828.746528]  [<ffffffff810a75f0>] mm_release+0xb0/0x130
Jul  9 04:53:22 helvi kernel: [96828.746609]  [<ffffffff810ad4d2>] do_exit+0x132/0xb10
Jul  9 04:53:22 helvi kernel: [96828.746690]  [<ffffffff816d9297>] rewind_stack_do_exit+0x17/0x20
Jul  9 04:53:22 helvi kernel: [96828.746772] Code: 00 00 00 00 00 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 53 48 83 ec 10 89 55 cc 48 8b 57 18 4c 89 45 d0 <4c> 8b 02 48 8d 42 e8 48 39 c7 4d 8d 68 e8 75 05 eb 34 49 89 d5 
Jul  9 04:53:22 helvi kernel: [96828.749225] RIP  [<ffffffff810e4885>] __wake_up_common+0x25/0x80
Jul  9 04:53:22 helvi kernel: [96828.749343]  RSP <ffffc900041dfe50>
Jul  9 04:53:22 helvi kernel: [96828.749417] CR2: 0000000000000009
Jul  9 04:53:22 helvi kernel: [96828.749491] ---[ end trace 185899d637de6a51 ]---
Jul  9 04:53:22 helvi kernel: [96828.749568] Fixing recursive fault but reboot is needed!
bill-mcgonigle commented 7 years ago

I believe I'm seeing this with:

kernel-4.11.12-200.fc25.x86_64
zfs-0.7.0-1_g0f69f42.fc25.x86_64
spl-0.7.0-1.fc25.x86_64

The culprit appears to have been the move from kernel 4.4.14 to 4.11.12 (Fedora 22 to 25). I rebuilt the zfs modules after the system upgrade, but if git advanced at all it was very minor, and I see this bug dates back to October. I believe I should have the #5071 fix in my pull. I notice @tuomari was on 4.4.18, though it would be surprising if 4.4.14->18 had a big regression nobody noticed.

I only have cache and log devices on my backups zpool, but corruption has spread to other pools with neither log nor cache (one SAS SSD on LUKS, one SATA spinning-rust pool on raw devices), so I don't know whether this is really an L2ARC problem or a symptom of some underlying problem that's corrupting all the pools. For now I have removed cache and log and am saturating the pool with parallel rsyncs (100% utilization on the underlying devices) to see what happens (and to get a good backup in case I need to move the main drives).

I took a big downtime last night to run extensive memory tests (there have never been any mcelog events), controller buffer tests, drive pattern tests, etc., and everything checks out hardware-wise. So while it could be a hardware coincidence, the OS upgrade does appear to have been the trigger. I've literally had more ZoL corruption in the past few weeks (so far always on zvols on raidz2, never on files on mirrors) than in the previous five years.

[69010.889494] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[69010.892170] IP: buf_hash_remove+0x6e/0xc0 [zfs]
[69010.894635] PGD 0 

[69010.900542] Oops: 0000 [#1] SMP
[69010.904340] Modules linked in: xfs ebtable_filter ebtables ip6_tables ip_set_hash_ip ip_set nfnetlink bridge 8021q garp mrp stp llc cfg80211 rfkill nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack libcrc32c iTCO_wdt iTCO_vendor_support zfs(POE) intel_rapl zunicode(POE) zavl(POE) icp(POE) sb_edac edac_core dm_crypt x86_pkg_temp_thermal joydev intel_powerclamp zcommon(POE) coretemp znvpair(POE) crct10dif_pclmul crc32_pclmul spl(OE) crc32c_intel ghash_clmulni_intel intel_rapl_perf lpc_ich i2c_i801 igb ptp mei_me pps_core mei ioatdma ipmi_si ipmi_devintf dca ipmi_msghandler shpchp wmi tpm_tis tpm_tis_core tpm binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc xenfs xen_privcmd raid1 mgag200 i2c_algo_bit drm_kms_helper ttm serio_raw drm mpt3sas isci libsas raid_class
[69010.924383]  scsi_transport_sas xen_acpi_processor xen_scsiback target_core_mod xen_pciback xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn
[69010.927057] CPU: 8 PID: 108 Comm: kswapd0 Tainted: P        W  OE   4.11.11-200.fc25.x86_64 #1
[69010.929838] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0 07/05/2013
[69010.932503] task: ffff880676500000 task.stack: ffffc90007004000
[69010.935360] RIP: e030:buf_hash_remove+0x6e/0xc0 [zfs]
[69010.938293] RSP: e02b:ffffc90007007b20 EFLAGS: 00010286
[69010.941044] RAX: 00000000003c83f6 RBX: ffff880468440e18 RCX: 0000000000000000
[69010.943603] RDX: 0000000000000000 RSI: ffffffffc0759e40 RDI: ffff880468440e18
[69010.945990] RBP: ffffc90007007b20 R08: ffff880468440e28 R09: a43c3a4e2b762bf2
[69010.948101] R10: 000000006422d4dc R11: 000000000001d710 R12: ffff880662fb89c0
[69010.950354] R13: ffff88053c3160c0 R14: ffff8805ae32ccd0 R15: ffff880675978800
[69010.952441] FS:  00007f4d507cce80(0000) GS:ffff880681e00000(0000) knlGS:0000000000000000
[69010.954508] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[69010.956575] CR2: 0000000000000020 CR3: 0000000001e09000 CR4: 0000000000042660
[69010.958632] Call Trace:
[69010.960753]  arc_hdr_realloc+0x3c/0x2c0 [zfs]
[69010.962807]  ? arc_change_state.isra.20+0x395/0x3c0 [zfs]
[69010.964881]  arc_evict_state+0x512/0x870 [zfs]
[69010.966945]  arc_adjust+0x4e2/0x680 [zfs]
[69010.969019]  arc_shrink+0x49/0xb0 [zfs]
[69010.971057]  __arc_shrinker_func.isra.32+0xd6/0x170 [zfs]
[69010.973087]  arc_shrinker_func_scan_objects+0x15/0x30 [zfs]
[69010.975037]  shrink_slab.part.45+0x1f5/0x410
[69010.976951]  shrink_node+0x22f/0x320
[69010.978814]  kswapd+0x300/0x730
[69010.980741]  kthread+0x109/0x140
[69010.982574]  ? mem_cgroup_shrink_node+0x170/0x170
[69010.984351]  ? kthread_park+0x90/0x90
[69010.986095]  ret_from_fork+0x25/0x30
[69010.987788] Code: 89 ca 48 c1 ea 08 4c 31 d2 48 31 d0 48 23 05 3a d9 17 00 48 8b 15 3b d9 17 00 48 8d 14 c2 48 8b 0a 48 39 cf 75 05 eb 10 48 89 d1 <48> 8b 51 20 48 39 d7 75 f4 48 8d 51 20 48 8b 4f 20 48 89 0a 48 
[69010.991440] RIP: buf_hash_remove+0x6e/0xc0 [zfs] RSP: ffffc90007007b20
[69010.993134] CR2: 0000000000000020
[69011.001686] ---[ end trace 26958b56c5b11cdb ]---
MarkGavalda commented 7 years ago

We just encountered this issue with Ubuntu kernel 4.4.0-87-generic (official Ubuntu 16.04 kernel, currently the latest) and ZFS v0.6.5.10-1ubuntu2~16.04.york0.

We have dozens of other machines (all on Google Cloud) running the exact same setup; this is the first occurrence:

[182902.222705] BUG: unable to handle kernel NULL pointer dereference at 00000000000001a0
[182902.231038] IP: [<ffffffff813d4718>] blk_mq_check_expired+0x18/0x80
[182902.237619] PGD 0 
[182902.240024] Oops: 0000 [#1] SMP 
[182902.243672] Modules linked in: xt_nat xt_multiport ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_limit xt_addrtype xt_conntrack binfmt_misc veth ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_CHECKSUM xt_comment xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables x_tables zfs(POE) zunicode(PO) zcommon(PO) znvpair(POE) spl(OE) zavl(POE) ppdev parport_pc parport input_leds pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse nvme virtio_scsi
[182902.341683] CPU: 10 PID: 3147 Comm: kworker/10:1H Tainted: P           OE   4.4.0-87-generic #110-Ubuntu
[182902.351344] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[182902.360756] Workqueue: kblockd blk_mq_timeout_work
[182902.365845] task: ffff8819c3039c00 ti: ffff881995e40000 task.ti: ffff881995e40000
[182902.373509] RIP: 0010:[<ffffffff813d4718>]  [<ffffffff813d4718>] blk_mq_check_expired+0x18/0x80
[182902.382503] RSP: 0018:ffff881995e43d40  EFLAGS: 00010246
[182902.387995] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000000
[182902.395492] RDX: ffff881995e43de0 RSI: ffff8819c48fc000 RDI: ffff8819c48fc000
[182902.402807] RBP: ffff881995e43d40 R08: 0000000000000020 R09: ffff8819c5af3040
[182902.410123] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[182902.417436] R13: ffff8819c445c250 R14: ffff8819c4884000 R15: ffff8819c5af3040
[182902.424759] FS:  0000000000000000(0000) GS:ffff8819d7480000(0000) knlGS:0000000000000000
[182902.433038] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[182902.438972] CR2: 00000000000001a0 CR3: 000000040c6e9000 CR4: 00000000001406e0
[182902.446287] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[182902.453602] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[182902.460917] Stack:
[182902.463113]  ffff881995e43d90 ffffffff813d7348 ffff881995e43de0 ffffffff813d4700
[182902.471144]  0000002000000000 ffff8819c445c240 0000000000000000 ffff881995e43de0
[182902.479161]  ffffffff813d4700 ffff8819c4780000 ffff881995e43dd0 ffffffff813d7b57
[182902.487178] Call Trace:
[182902.489831]  [<ffffffff813d7348>] bt_for_each+0xd8/0xe0
[182902.495241]  [<ffffffff813d4700>] ? blk_mq_rq_timed_out+0x70/0x70
[182902.501516]  [<ffffffff813d4700>] ? blk_mq_rq_timed_out+0x70/0x70
[182902.507791]  [<ffffffff813d7b57>] blk_mq_queue_tag_busy_iter+0x47/0xc0
[182902.514500]  [<ffffffff813d341c>] blk_mq_timeout_work+0xdc/0x180
[182902.520691]  [<ffffffff8109a5e5>] process_one_work+0x165/0x480
[182902.526706]  [<ffffffff8109a94b>] worker_thread+0x4b/0x4c0
[182902.532377]  [<ffffffff8109a900>] ? process_one_work+0x480/0x480
[182902.538575]  [<ffffffff810a0c85>] kthread+0xe5/0x100
[182902.543722]  [<ffffffff810a0ba0>] ? kthread_create_on_node+0x1e0/0x1e0
[182902.550433]  [<ffffffff8184224f>] ret_from_fork+0x3f/0x70
[182902.556013]  [<ffffffff810a0ba0>] ? kthread_create_on_node+0x1e0/0x1e0
[182902.562728] Code: ed 48 89 df e8 0a fb ff ff 5b 5d eb e1 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 46 50 48 89 f7 48 89 e5 a8 02 75 11 48 8b 46 30 <48> 8b 80 a0 01 00 00 a8 20 75 2d 5d c3 f6 46 44 04 75 f8 48 8b 
[182902.588849] RIP  [<ffffffff813d4718>] blk_mq_check_expired+0x18/0x80
[182902.595495]  RSP <ffff881995e43d40>
[182902.599165] CR2: 00000000000001a0
[182902.602982] ---[ end trace 47e3ce3b0102baa6 ]---
[182902.608441] BUG: unable to handle kernel paging request at ffffffffffffffd8
[182902.616946] IP: [<ffffffff810a1330>] kthread_data+0x10/0x20
[182902.622834] PGD 1e0d067 PUD 1e0f067 PMD 0 
[182902.627451] Oops: 0000 [#2] SMP 
[182902.631083] Modules linked in: xt_nat xt_multiport ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_limit xt_addrtype xt_conntrack binfmt_misc veth ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_CHECKSUM xt_comment xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables x_tables zfs(POE) zunicode(PO) zcommon(PO) znvpair(POE) spl(OE) zavl(POE) ppdev parport_pc parport input_leds pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse nvme virtio_scsi
[182902.735696] CPU: 10 PID: 3147 Comm: kworker/10:1H Tainted: P      D    OE   4.4.0-87-generic #110-Ubuntu
[182902.745373] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[182902.756090] task: ffff8819c3039c00 ti: ffff881995e40000 task.ti: ffff881995e40000
[182902.763763] RIP: 0010:[<ffffffff810a1330>]  [<ffffffff810a1330>] kthread_data+0x10/0x20
[182902.773472] RSP: 0018:ffff881995e43a38  EFLAGS: 00010002
[182902.778978] RAX: 0000000000000000 RBX: 000000000000000a RCX: ffffffff82109e80
[182902.786300] RDX: 000000000000000a RSI: 000000000000000a RDI: ffff8819c3039c00
[182902.793622] RBP: ffff881995e43a38 R08: 00000000ffffffff R09: 0000000000000000
[182902.800947] R10: ffff8819c3039c60 R11: 0000000000004800 R12: 0000000000000000
[182902.808276] R13: 0000000000016dc0 R14: ffff8819c3039c00 R15: ffff8819d7496dc0
[182902.815606] FS:  0000000000000000(0000) GS:ffff8819d7480000(0000) knlGS:0000000000000000
[182902.823884] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[182902.829816] CR2: 0000000000000028 CR3: 0000000001e0a000 CR4: 00000000001406e0
[182902.837139] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[182902.844463] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[182902.851787] Stack:
[182902.853992]  ffff881995e43a50 ffffffff8109b981 ffff8819d7496dc0 ffff881995e43aa0
[182902.862030]  ffffffff8183d990 ffff881995e43ab8 ffffffff0000000a ffff8819c3039c00
[182902.870082]  ffff881995e44000 ffff8819c303a2d0 ffff881995e43650 0000000000000046
[182902.878105] Call Trace:
[182902.880744]  [<ffffffff8109b981>] wq_worker_sleeping+0x11/0x90
[182902.886772]  [<ffffffff8183d990>] __schedule+0x650/0xa30
[182902.892270]  [<ffffffff8183dda5>] schedule+0x35/0x80
[182902.897428]  [<ffffffff81084485>] do_exit+0x775/0xb00
[182902.902674]  [<ffffffff81031c41>] oops_end+0xa1/0xd0
[182902.907830]  [<ffffffff8106ad85>] no_context+0x135/0x380
[182902.913331]  [<ffffffff8106b050>] __bad_area_nosemaphore+0x80/0x1f0
[182902.919787]  [<ffffffff8106b1d3>] bad_area_nosemaphore+0x13/0x20
[182902.925986]  [<ffffffff8106b497>] __do_page_fault+0xb7/0x400
[182902.931840]  [<ffffffff8106b802>] do_page_fault+0x22/0x30
[182902.937430]  [<ffffffff81844038>] page_fault+0x28/0x30
[182902.942766]  [<ffffffff813d4718>] ? blk_mq_check_expired+0x18/0x80
[182902.949147]  [<ffffffff813d7348>] bt_for_each+0xd8/0xe0
[182902.954576]  [<ffffffff813d4700>] ? blk_mq_rq_timed_out+0x70/0x70
[182902.960870]  [<ffffffff813d4700>] ? blk_mq_rq_timed_out+0x70/0x70
[182902.967161]  [<ffffffff813d7b57>] blk_mq_queue_tag_busy_iter+0x47/0xc0
[182902.973882]  [<ffffffff813d341c>] blk_mq_timeout_work+0xdc/0x180
[182902.980088]  [<ffffffff8109a5e5>] process_one_work+0x165/0x480
[182902.986110]  [<ffffffff8109a94b>] worker_thread+0x4b/0x4c0
[182902.991786]  [<ffffffff8109a900>] ? process_one_work+0x480/0x480
[182902.997988]  [<ffffffff810a0c85>] kthread+0xe5/0x100
[182903.003148]  [<ffffffff810a0ba0>] ? kthread_create_on_node+0x1e0/0x1e0
[182903.009871]  [<ffffffff8184224f>] ret_from_fork+0x3f/0x70
[182903.015469]  [<ffffffff810a0ba0>] ? kthread_create_on_node+0x1e0/0x1e0
[182903.022190] Code: ff ff ff be 49 02 00 00 48 c7 c7 60 b8 cb 81 e8 47 01 fe ff e9 a6 fe ff ff 66 90 0f 1f 44 00 00 48 8b 87 18 05 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 
[182903.048610] RIP  [<ffffffff810a1330>] kthread_data+0x10/0x20
[182903.054571]  RSP <ffff881995e43a38>
[182903.058247] CR2: ffffffffffffffd8
[182903.061752] ---[ end trace 47e3ce3b0102baa7 ]---
[182903.066558] Fixing recursive fault but reboot is needed!
kernelOfTruth commented 7 years ago

Related?

https://lkml.org/lkml/2017/3/9/401 [PATCH 1/2] blk-mq: don't complete un-started request in timeout handler

https://patchwork.kernel.org/patch/9637993/

https://patchwork.kernel.org/patch/9630185/
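The first of those patches looks like the best match for the crash @MarkGavalda posted: blk_mq_check_expired() faulted at offset 0x1a0 after loading a NULL rq->q, which fits the timeout handler inspecting a request that was never started. A paraphrased gist of that fix, not the exact upstream diff (flag and field names as in 4.4-era block/blk-mq.c; see the links above for the real change):

/*
 * Gist of "blk-mq: don't complete un-started request in timeout handler",
 * paraphrased -- not the exact upstream diff.
 */
static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
    struct request *rq, void *priv, bool reserved)
{
	/*
	 * A request that has not been started may be getting reused
	 * concurrently, so its fields (rq->q included) cannot be trusted;
	 * previously this path still dereferenced rq->q to check whether
	 * the queue was dying.  Just leave such requests alone.
	 */
	if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
		return;

	/* ... normal deadline/expiry handling continues as before ... */
}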

arturpzol commented 7 years ago

Looks like a similar call trace to the one I experienced on 0.7.1-1 with kernels 3.10 and 4.4.

3.10.103:

[1196.868235] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 1196.868244] IP: [] buf_hash_remove+0x73/0xd0 [zfs]
[ 1196.868502] PGD 62aa5d067 PUD 23208e067 PMD 0 
[ 1196.868506] Oops: 0000 [#1] SMP 
[ 1196.868508] Modules linked in: iscsi_scst(O) scst_vdisk(O) scst(O) mperf target_core_iblock target_core_pscsi iscsi_target_mod target_core_mod bonding iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) crc32c_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd vmxnet3(O) processor ac button nls_cp437 nls_iso8859_1 pvscsi(O) sg pata_acpi ata_generic vfat fat aufs
[ 1196.868535] CPU: 1 PID: 6890 Comm: zpool Tainted: P O 3.10.103 #2
[ 1196.868538] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
[ 1196.868540] task: ffff880358e32260 ti: ffff88045bdb8000 task.ti: ffff88045bdb8000
[ 1196.868542] RIP: 0010:[] [] buf_hash_remove+0x73/0xd0 [zfs]
[ 1196.868574] RSP: 0018:ffff88045bdb9cb8 EFLAGS: 00010203
[ 1196.868576] RAX: 000000000031e38b RBX: 0000000000000000 RCX: 0000000000000000
[ 1196.868577] RDX: ffffc9000e35bc58 RSI: ffff880514700940 RDI: ffff880514700930
[ 1196.868579] RBP: ffffffffa0687180 R08: 00f5afe11fa43f17 R09: 00f5afe11fa43f17
[ 1196.868580] R10: ffffffffa04a46b2 R11: ffffea0019294080 R12: ffff880514700930
[ 1196.868581] R13: 0000000000000001 R14: ffffffffa0687100 R15: 0000000000000001
[ 1196.868584] FS: 00007ffa457aa780(0000) GS:ffff88080fc80000(0000) knlGS:0000000000000000
[ 1196.868585] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1196.868587] CR2: 0000000000000020 CR3: 00000005e0bb1000 CR4: 00000000000007e0
[ 1196.868635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1196.868657] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1196.868658] Stack:
[ 1196.868660] ffffffffa049fdb4 000000000000e2d0 0000000000020000 ffff880514700930
[ 1196.868662] ffff88022d765650 ffff8806a5cc7f80 0000000000000006 ffffffffa04a46aa
[ 1196.868665] 00000000000c0000 ffffffffffffffff 000000000000e2d0 000000000000038b
[ 1196.868667] Call Trace:
[ 1196.868705] [] ? arc_change_state.isra.34+0x324/0x400 [zfs]
[ 1196.868737] [] ? arc_evict_state+0x70a/0x870 [zfs]
[ 1196.868770] [] ? arc_flush_state+0x35/0x60 [zfs]
[ 1196.868803] [] ? arc_flush+0x81/0xd0 [zfs]
[ 1196.868848] [] ? dsl_pool_close+0xb9/0x160 [zfs]
[ 1196.868899] [] ? spa_unload+0x158/0x2f0 [zfs]
[ 1196.868948] [] ? spa_export_common+0x2a4/0x480 [zfs]
[ 1196.869000] [] ? zfsdev_ioctl+0x5da/0x680 [zfs]
[ 1196.869007] [] ? do_vfs_ioctl+0x2d5/0x4b0
[ 1196.869014] [] ? vtime_account_user.part.3+0x38/0x50
[ 1196.869017] [] ? SyS_ioctl+0x81/0xa0
[ 1196.869022] [] ? tracesys+0xdd/0xe2
[ 1196.869023] Code: 93 16 00 4d 31 c8 4c 31 c0 48 23 05 d8 93 16 00 48 8d 14 c2 48 8b 0a 48 39 f9 75 0f eb 1a 66 2e 0f 1f 84 00 00 00 00 00 48 89 d1 <48> 8b 51 20 48 39 fa 75 f4 48 8d 51 20 48 8b 4f 20 48 89 0a 48 
[ 1196.869045] RIP [] buf_hash_remove+0x73/0xd0 [zfs]
[ 1196.869077] RSP 
[ 1196.869078] CR2: 0000000000000020
[ 1196.869081] ---[ end trace 52fba4e8df85a463 ]---

4.4.45:


[ 751.378717] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 751.378726] IP: [] buf_hash_remove+0x5f/0xb0 [zfs]
[ 751.378846] PGD 0 
[ 751.378849] Oops: 0000 [#1] SMP 
[ 751.378853] Modules linked in: sha256_generic drbg iscsi_scst(O) scst_vdisk(O) scst(O) target_core_iblock target_core_pscsi iscsi_target_mod target_core_mod bonding ib_iser iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) crc32c_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd vmxnet3(O) acpi_cpufreq ac processor button nls_cp437 nls_iso8859_1 pvscsi(O) sg pata_acpi ata_generic vfat fat aufs
[ 751.378894] CPU: 1 PID: 6635 Comm: arc_reclaim Tainted: P O 4.4.45 #13
[ 751.378898] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
[ 751.378901] task: ffff8807eaf54ec0 ti: ffff8807d1854000 task.ti: ffff8807d1854000
[ 751.378904] RIP: 0010:[] [] buf_hash_remove+0x5f/0xb0 [zfs]
[ 751.378958] RSP: 0018:ffff8807d1857d28 EFLAGS: 00010203
[ 751.378960] RAX: 000000000038785a RBX: 0000000000000000 RCX: 0000000000000000
[ 751.378962] RDX: 0000000000000000 RSI: ffff88078333d660 RDI: ffff88078333d650
[ 751.378965] RBP: ffffffffa0b7ca80 R08: 00eb5952e0ae6803 R09: 00eb5952e0aea201
[ 751.378967] R10: 0000000000000000 R11: ffff88065b3e9e30 R12: ffff88078333d650
[ 751.378969] R13: 0000000000000001 R14: ffffffffa0b7ca00 R15: 0000000000000001
[ 751.378973] FS: 0000000000000000(0000) GS:ffff88080fc80000(0000) knlGS:0000000000000000
[ 751.378975] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 751.378977] CR2: 0000000000000020 CR3: 0000000714418000 CR4: 00000000000006e0
[ 751.379080] Stack:
[ 751.379083] ffffffffa099abaf 0000000000061690 0000000000020000 ffff88078333d650
[ 751.379087] ffff8805ad4bca80 ffff8807ea33b700 0000000000000001 ffffffffa099f17a
[ 751.379091] 0000000000020000 0000000001164e00 0000000000061690 000000000000185a
[ 751.379094] Call Trace:
[ 751.379152] [] ? arc_change_state.isra.34+0x30f/0x3b0 [zfs]
[ 751.379205] [] ? arc_evict_state+0x6ea/0x850 [zfs]
[ 751.379286] [] ? spa_get_random+0x1d/0x40 [zfs]
[ 751.379340] [] ? arc_adjust+0x228/0x470 [zfs]
[ 751.379393] [] ? arc_reclaim_thread+0xa1/0x260 [zfs]
[ 751.379447] [] ? arc_shrink+0xb0/0xb0 [zfs]
[ 751.379467] [] ? thread_generic_wrapper+0x70/0x80 [spl]
[ 751.379475] [] ? __thread_exit+0x10/0x10 [spl]
[ 751.379481] [] ? kthread+0xca/0xe0
[ 751.379485] [] ? kthread_create_on_node+0x170/0x170
[ 751.379489] [] ? ret_from_fork+0x3f/0x70
[ 751.379493] [] ? kthread_create_on_node+0x170/0x170
[ 751.379495] Code: 75 e0 49 c1 e9 08 48 8b 15 e7 3d 16 00 4d 31 c8 4c 31 c0 48 23 05 d2 3d 16 00 48 8d 14 c2 48 8b 0a 48 39 f9 75 05 eb 10 48 89 d1 <48> 8b 51 20 48 39 fa 75 f4 48 8d 51 20 48 8b 4f 20 48 89 0a 48 
[ 751.379529] RIP [] buf_hash_remove+0x5f/0xb0 [zfs]
[ 751.379581] RSP 
[ 751.379583] CR2: 0000000000000020
[ 751.379587] ---[ end trace e5175d85ab15f72c ]---

I think I have a full scenario that leads to this issue, so if any debugging is necessary we can collaborate.

arturpzol commented 7 years ago

As I mentioned, I have a full scenario that leads to this issue (connected with buf_hash_remove), so I enabled debugging and hit the call trace below:

[  725.197505] VERIFY(dr->dt.dl.dr_override_state != DR_IN_DMU_SYNC) failed
[  725.197512] PANIC at dbuf.c:1304:dbuf_unoverride()
[  725.197515] Showing stack for process 30425
[  725.197520] CPU: 2 PID: 30425 Comm: zvol Tainted: P           O    4.4.45-oe64-gb9dcf0a #14
[  725.197522] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
[  725.197525]  0000000000000000 ffffffff813309ff ffffffffa06aa3a1 ffff88077370fcb0
[  725.197530]  ffffffffa01fb5d1 ffff880789826700 0000000000000028 ffff88077370fcc0
[  725.197534]  ffff88077370fc60 6428594649524556 6c642e74643e2d72 7265766f5f72642e
[  725.197537] Call Trace:
[  725.197547]  [<ffffffff813309ff>] ? dump_stack+0x5c/0x7d
[  725.197564]  [<ffffffffa01fb5d1>] ? spl_panic+0xb1/0xf0 [spl]
[  725.197577]  [<ffffffffa01fe20d>] ? __cv_destroy+0x4d/0x230 [spl]
[  725.197688]  [<ffffffffa060fd0d>] ? zio_ready+0x1cd/0x610 [zfs]
[  725.197698]  [<ffffffffa01f3dfc>] ? spl_kmem_alloc+0xac/0x1b0 [spl]
[  725.197803]  [<ffffffffa060a8d2>] ? zio_wait+0x212/0x400 [zfs]
[  725.197904]  [<ffffffffa060a8d2>] ? zio_wait+0x212/0x400 [zfs]
[  725.197969]  [<ffffffffa0522a05>] ? dbuf_unoverride+0x155/0x190 [zfs]
[  725.198033]  [<ffffffffa0522bf3>] ? dbuf_redirty+0x33/0xc0 [zfs]
[  725.198099]  [<ffffffffa052b64e>] ? dmu_buf_will_dirty+0x9e/0x1f0 [zfs]
[  725.198169]  [<ffffffffa05373cf>] ? dmu_write_uio_dnode+0x5f/0x1a0 [zfs]
[  725.198270]  [<ffffffffa061cb98>] ? zvol_write+0x178/0x5f0 [zfs]
[  725.198282]  [<ffffffffa01f8c71>] ? taskq_thread+0x281/0x570 [spl]
[  725.198287]  [<ffffffff8107e880>] ? wake_up_q+0x60/0x60
[  725.198297]  [<ffffffffa01f89f0>] ? taskq_thread_should_stop+0x80/0x80 [spl]
[  725.198302]  [<ffffffff81077b7a>] ? kthread+0xca/0xe0
[  725.198306]  [<ffffffff81077ab0>] ? kthread_create_on_node+0x170/0x170
[  725.198311]  [<ffffffff8170ff8f>] ? ret_from_fork+0x3f/0x70
[  725.198314]  [<ffffffff81077ab0>] ? kthread_create_on_node+0x170/0x170

So it seems to be the same issue as https://github.com/zfsonlinux/zfs/issues/6238.

I tried the fix https://github.com/zfsonlinux/zfs/commit/f763c3d1df569a8d6b60bcb5e95cf07aa7a189e6 and the issue did not occur again.
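For anyone else hitting this: the debug build likely trips here, rather than crashing later in buf_hash_remove(), because in SPL/ZFS VERIFY() is checked in every build while ASSERT() compiles to a no-op unless the modules were built with --enable-debug; in debug builds ASSERT() expands to the VERIFY() machinery, which is why the message above reads "VERIFY(...) failed". A rough sketch of those semantics, not the exact macro definitions from the SPL headers (guard name approximate):

/* Rough sketch only -- see the SPL debug headers for the real macros. */
#define VERIFY(cond)	do {						\
	if (!(cond))							\
		spl_panic(__FILE__, __FUNCTION__, __LINE__,		\
		    "VERIFY(" #cond ") failed");			\
} while (0)				/* checked in every build */

#ifdef DEBUG
#define ASSERT(cond)	VERIFY(cond)	/* --enable-debug builds */
#else
#define ASSERT(cond)	((void)0)	/* compiled out in production */
#endif

With debugging enabled, the inconsistency is caught where it is introduced (dbuf_unoverride) instead of surfacing much later as the NULL dereference in the ARC hash table.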

bprotopopov commented 6 years ago

Hi there,

I am looking at a backtrace very similar to several of those listed above, in a build of zfs-0.7.2 on CentOS 7:

     RELEASE: 3.10.0-327.el7.x86_64
     VERSION: #1 SMP Thu Nov 19 22:10:57 UTC 2015
     MACHINE: x86_64  (2194 Mhz)
      MEMORY: 127.7 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000020"
         PID: 160434
     COMMAND: "arc_reclaim"
        TASK: ffff880ff8725080  [THREAD_INFO: ffff881fd9bac000]
         CPU: 9
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 160434  TASK: ffff880ff8725080  CPU: 9   COMMAND: "arc_reclaim"
 #0 [ffff881fd9baf9a0] machine_kexec at ffffffff81051beb
 #1 [ffff881fd9bafa00] crash_kexec at ffffffff810f2542
 #2 [ffff881fd9bafad0] oops_end at ffffffff8163e1a8
 #3 [ffff881fd9bafaf8] no_context at ffffffff8162e2b8
 #4 [ffff881fd9bafb48] __bad_area_nosemaphore at ffffffff8162e34e
 #5 [ffff881fd9bafb90] bad_area_nosemaphore at ffffffff8162e4b8
 #6 [ffff881fd9bafba0] __do_page_fault at ffffffff81640fce
 #7 [ffff881fd9bafbf8] do_page_fault at ffffffff81641113
 #8 [ffff881fd9bafc20] page_fault at ffffffff8163d408
    [exception RIP: buf_hash_remove+115]
    RIP: ffffffffa0795f63  RSP: ffff881fd9bafcd0  RFLAGS: 00010203
    RAX: ffffc900228c6988  RBX: ffff881df65f7090  RCX: 00000000002b3531
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff881df65f7090
    RBP: ffff881fd9bafcd0   R8: e620f306bee9ad46   R9: ffffffffa0477329
    R10: ffffffffa047752d  R11: ffffea0020bf6900  R12: ffff880ff8f47bc0
    R13: ffff881945f047e0  R14: ffff88201da04480  R15: ffff8820110f3200
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff881fd9bafcd8] arc_hdr_realloc at ffffffffa0797b7c [zfs]
#10 [ffff881fd9bafd18] arc_evict_state at ffffffffa079cc11 [zfs]
#11 [ffff881fd9bafdf0] arc_adjust_impl.constprop.34 at ffffffffa079d007 [zfs]
#12 [ffff881fd9bafe00] arc_adjust at ffffffffa079d25b [zfs]
#13 [ffff881fd9bafe48] arc_reclaim_thread at ffffffffa079e205 [zfs]
#14 [ffff881fd9bafe98] thread_generic_wrapper at ffffffffa0477fa1 [spl]
#15 [ffff881fd9bafec8] kthread at ffffffff810a5aef
#16 [ffff881fd9baff50] ret_from_fork at ffffffff81645858

The pool configuration is as follows:

 state: ONLINE
  scan: none requested
config:

    NAME                        STATE     READ WRITE CKSUM
    upool                       ONLINE       0     0     0
      raidz1-0                  ONLINE       0     0     0
        wwn-0x5002538c407082b3  ONLINE       0     0     0
        wwn-0x82a84070538c5002  ONLINE       0     0     0
        wwn-0x5002538c40708002  ONLINE       0     0     0
        wwn-0x82404070538c5002  ONLINE       0     0     0
    cache
      nvme-S39YNX0J701360       ONLINE       0     0     0

The pool is not full:

NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
pool   6.94T   733G  6.22T         -    11%    10%  1.00x  ONLINE  -
  raidz1  6.94T   733G  6.22T         -    11%    10%
    wwn-0x5002538c407082b3      -      -      -         -      -      -
    wwn-0x82a84070538c5002      -      -      -         -      -      -
    wwn-0x5002538c40708002      -      -      -         -      -      -
    wwn-0x82404070538c5002      -      -      -         -      -      -
cache      -      -      -         -      -      -
  nvme-S39YNX0J701360   745G   272G   473G         -     0%    36%

The pool's backing disks are Samsung MZ7LM1T9HMJP0D3 1.9T SSDs; the L2ARC device is an NVMe Dell Express Flash PM1725a 800GB SFF (nvme-S39YNX0J701360).

I see this fairly regularly with a VM running on KVM/qemu, with two ZVOLs attached as raw disks using the 'none' caching policy:

...
<disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/zvol/pool/vol0'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      ...
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/zvol/pool/vol1'/>
      <backingStore/>
      <target dev='vdc' bus='virtio'/>
      ...
    </disk>
...

The memory is multi-bit ECC. I'll enable debugging and post more info.

Does anyone know when this problem was introduced?

bprotopopov commented 6 years ago

P.S. The fix f763c3d is included in the build (it shipped in 0.7.2).

bprotopopov commented 6 years ago

Actually, with more testing, it looks like f763c3d has addressed the issue.

behlendorf commented 6 years ago

@bprotopopov thanks for confirming this. Shall we close this then?

bprotopopov commented 6 years ago

@behlendorf my stack trace looks very similar to some of the stack traces above and to the top part of the traces at https://www.illumos.org/issues/8648. The reproducer test provided by @loli10K for that issue passes, as does my workload, so I feel it's reasonable to say this is addressed.