openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.69k stars 1.76k forks source link

dmu_buf_get_blkptr unable to handle kernel NULL pointer dereference at 0000000000000038 #4737

Closed JuliaVixen closed 5 years ago

JuliaVixen commented 8 years ago

Well, this just happened, and now all writes to this filesystem block forever. I don't know what happened, I had one process writing to this filesystem over NFS, and I was reading from the same filesystem, copying some files to a local USB memory stick.

Reads still work fine, but all writes, anywhere in the pool, all block. (I have atime turned off on this whole pool, so reads shouldn't be causing writes anyway.)

[1110464.812920] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[1110464.812973] IP: [<ffffffffa0b934ef>] dmu_buf_get_blkptr+0x71/0x8b [zfs]
[1110464.813026] PGD 29ee05067 PUD 7b37ec067 PMD 0 
[1110464.813066] Oops: 0000 [#1] SMP 
[1110464.813100] Modules linked in: hfsplus nfsv3 nfsd ipv6 binfmt_misc x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass pcspkr i2c_i801 i2c_core zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) zlib_deflate joydev input_leds e1000e acpi_cpufreq thermal fan video backlight battery processor button xts gf128mul aes_x86_64 cbc sha512_generic sha256_generic sha1_generic libiscsi scsi_transport_iscsi tg3 libphy e1000 fuse nfs lockd grace sunrpc jfs multipath linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey hid_microsoft hid_logitech hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin
[1110464.813713]  hid_apple hid_a4tech sl811_hcd xhci_plat_hcd ohci_pci ohci_hcd uhci_hcd aic94xx libsas lpfc crc_t10dif crct10dif_common qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc mptspi mptscsih mptbase atp870u dc395x qla1280 imm parport dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_sl82c105 pata_via pata_jmicron pata_marvell pata_sis pata_netcell pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks
[1110464.814351]  pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix led_class usbhid usb_storage xhci_pci ahci libahci ehci_pci xhci_hcd ehci_hcd ptp usbcore pps_core usb_common libata
[1110464.814641] CPU: 1 PID: 31766 Comm: z_wr_int_5 Tainted: P           O    4.4.6-gentoo #1
[1110464.814683] Hardware name: Supermicro X10SLL-F/X10SLL-SF/X10SLL-F/X10SLL-SF, BIOS 1.0a 06/11/2013
[1110464.819179] task: ffff88068e0a30c0 ti: ffff8805184fc000 task.ti: ffff8805184fc000
[1110464.819216] RIP: 0010:[<ffffffffa0b934ef>]  [<ffffffffa0b934ef>] dmu_buf_get_blkptr+0x71/0x8b [zfs]
[1110464.819281] RSP: 0018:ffff8805184ffd18  EFLAGS: 00010246
[1110464.819312] RAX: 0000000000000000 RBX: ffff8801f4f4a048 RCX: 0000000000b580cc
[1110464.819349] RDX: 0000000000000000 RSI: ffff880184c2b570 RDI: ffff88056beee800
[1110464.819384] RBP: ffff8805184ffd28 R08: 0000000000000010 R09: 0000000000000010
[1110464.819420] R10: 00000000005294c3 R11: ffff8802c0c78330 R12: ffff8806b14c08a8
[1110464.819456] R13: 0000000000000100 R14: 0000000000000000 R15: ffff880664fa24d0
[1110464.819492] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
[1110464.819530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1110464.819563] CR2: 0000000000000038 CR3: 000000061c849000 CR4: 00000000001406e0
[1110464.819599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1110464.819635] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1110464.819670] Stack:
[1110464.819695]  ffff8801f9d3b570 ffff8807fb008000 ffff8805184ffd38 ffffffffa0b8a189
[1110464.819745]  ffff8805184ffd68 ffffffffa0c18bc4 00000000005294c3 ffff8801f9d3b570
[1110464.819794]  0000000000080000 ffff88068e0a30c0 ffff8805184ffde8 ffffffffa0c1d7bb
[1110464.819844] Call Trace:
[1110464.819875]  [<ffffffffa0b8a189>] arc_buf_size+0xd3/0x77d [zfs]
[1110464.819915]  [<ffffffffa0c18bc4>] zio_buf_free+0x3d9/0x8cc [zfs]
[1110464.819954]  [<ffffffffa0c1d7bb>] zio_worst_error+0xd66/0xe73 [zfs]
[1110464.819996]  [<ffffffff814d1258>] ? mutex_unlock+0x9/0xb
[1110464.820034]  [<ffffffffa0c1852a>] ? zil_fini+0x2ec/0x553 [zfs]
[1110464.820073]  [<ffffffffa0c19b5b>] zio_execute+0x99/0xcd [zfs]
[1110464.820107]  [<ffffffffa0a2d0d8>] taskq_create+0x54c/0x6a5 [spl]
[1110464.820141]  [<ffffffff8105a3f1>] ? try_to_wake_up+0x22f/0x22f
[1110464.820176]  [<ffffffffa0a2ceac>] ? taskq_create+0x320/0x6a5 [spl]
[1110464.820210]  [<ffffffff81055390>] kthread+0xcd/0xd5
[1110464.820241]  [<ffffffff810552c3>] ? kthread_freezable_should_stop+0x43/0x43
[1110464.820277]  [<ffffffff814d2e9f>] ret_from_fork+0x3f/0x70
[1110464.820310]  [<ffffffff810552c3>] ? kthread_freezable_should_stop+0x43/0x43
[1110464.820345] Code: 41 5c 5d c3 55 48 89 e5 41 54 49 89 d4 53 48 89 fb 48 8b 7a 20 e8 83 ce 00 00 48 89 c7 49 8b 84 24 f0 00 00 00 31 d2 48 8b 4b 50 <8b> 40 38 48 f7 b3 f0 02 00 00 48 89 ca 48 63 f0 e8 7a 78 02 00 
[1110464.820650] RIP  [<ffffffffa0b934ef>] dmu_buf_get_blkptr+0x71/0x8b [zfs]
[1110464.820691]  RSP <ffff8805184ffd18>
[1110464.820718] CR2: 0000000000000038
[1110464.821206] ---[ end trace aa4bddb464bff7e9 ]---

This is the Gentoo ZFS-9999 package, the exact same one as in https://github.com/zfsonlinux/zfs/issues/4582#issuecomment-220190326

localhost ~ # modinfo zfs | head
filename:       /lib/modules/4.4.6-gentoo/extra/zfs/zfs.ko
version:        0.6.5-281_gbc2d809
license:        CDDL
author:         OpenZFS on Linux
description:    ZFS
srcversion:     7B91AC3C57C8823BFB45845
depends:        spl,znvpair,zunicode,zcommon,zavl
vermagic:       4.4.6-gentoo SMP mod_unload modversions 
parm:           zvol_inhibit_dev:Do not create zvol device nodes (uint)
parm:           zvol_major:Major number for zvol device (uint)
localhost ~ # modinfo spl | head
filename:       /lib/modules/4.4.6-gentoo/extra/spl/spl.ko
version:        0.6.5-54_g39cd90e
license:        GPL
author:         OpenZFS on Linux
description:    Solaris Porting Layer
srcversion:     F46FA65506ED2842A5834D8
depends:        zlib_deflate
vermagic:       4.4.6-gentoo SMP mod_unload modversions 
parm:           spl_hostid:The system hostid. (ulong)
parm:           spl_hostid_path:The system hostid file (/etc/hostid) (charp)
JuliaVixen commented 8 years ago

P.S. I do have ECC memory in this computer.

tuxoko commented 8 years ago

@JuliaVixen Please do objdump -d zfs.ko, find dmu_buf_get_blkptr and post it. You need to post what ever covered under the range of dmu_buf_get_blkptr+0x8b

JuliaVixen commented 8 years ago

This exact same bug just occurred again, under similar conditions as the first time. So, I guess it's reproducible. Here's the dump, 0x947E plus 0x71 is 94ef: 8b 40 38 mov 0x38(%rax),%eax which makes sense if RAX is zero.

000000000000947e <dmu_buf_get_blkptr>:
    947e:       55                      push   %rbp
    947f:       48 8b 47 48             mov    0x48(%rdi),%rax
    9483:       48 89 e5                mov    %rsp,%rbp
    9486:       5d                      pop    %rbp
    9487:       c3                      retq   
    9488:       55                      push   %rbp
    9489:       48 89 e5                mov    %rsp,%rbp
    948c:       41 54                   push   %r12
    948e:       49 89 fc                mov    %rdi,%r12
    9491:       49 89 b4 24 a0 00 00    mov    %rsi,0xa0(%r12)
    9498:       00 
    9499:       53                      push   %rbx
    949a:       48 8b 46 48             mov    0x48(%rsi),%rax
    949e:       48 89 f3                mov    %rsi,%rbx
    94a1:       48 89 47 18             mov    %rax,0x18(%rdi)
    94a5:       48 89 f7                mov    %rsi,%rdi
    94a8:       e8 00 00 00 00          callq  94ad <dmu_buf_get_blkptr+0x2f>
    94ad:       85 c0                   test   %eax,%eax
    94af:       75 12                   jne    94c3 <dmu_buf_get_blkptr+0x45>
    94b1:       4c 89 e2                mov    %r12,%rdx
    94b4:       48 c7 c6 00 00 00 00    mov    $0x0,%rsi
    94bb:       48 89 df                mov    %rbx,%rdi
    94be:       e8 00 00 00 00          callq  94c3 <dmu_buf_get_blkptr+0x45>
    94c3:       5b                      pop    %rbx
    94c4:       41 5c                   pop    %r12
    94c6:       5d                      pop    %rbp
    94c7:       c3                      retq   
    94c8:       55                      push   %rbp
    94c9:       48 89 e5                mov    %rsp,%rbp
    94cc:       41 54                   push   %r12
    94ce:       49 89 d4                mov    %rdx,%r12
    94d1:       53                      push   %rbx
    94d2:       48 89 fb                mov    %rdi,%rbx
    94d5:       48 8b 7a 20             mov    0x20(%rdx),%rdi
    94d9:       e8 00 00 00 00          callq  94de <dmu_buf_get_blkptr+0x60>
    94de:       48 89 c7                mov    %rax,%rdi
    94e1:       49 8b 84 24 f0 00 00    mov    0xf0(%r12),%rax
    94e8:       00 
    94e9:       31 d2                   xor    %edx,%edx
    94eb:       48 8b 4b 50             mov    0x50(%rbx),%rcx
    94ef:       8b 40 38                mov    0x38(%rax),%eax
    94f2:       48 f7 b3 f0 02 00 00    divq   0x2f0(%rbx)
    94f9:       48 89 ca                mov    %rcx,%rdx
    94fc:       48 63 f0                movslq %eax,%rsi
    94ff:       e8 00 00 00 00          callq  9504 <dmu_buf_get_blkptr+0x86>
    9504:       5b                      pop    %rbx
    9505:       41 5c                   pop    %r12
    9507:       5d                      pop    %rbp
    9508:       c3                      retq   

Here is the quite nearly identical crash dump from a few minutes ago....

[177048.712082] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[177048.712146] IP: [<ffffffffa10714ef>] dmu_buf_get_blkptr+0x71/0x8b [zfs]
[177048.712192] PGD a441d067 PUD a9097067 PMD 0 
[177048.712229] Oops: 0000 [#1] SMP 
[177048.712261] Modules linked in: nfsd ipv6 binfmt_misc zfs(PO) zunicode(PO) x86_pkg_temp_thermal zcommon(PO) coretemp znvpair(PO) kvm_intel spl(O) zavl(PO) zlib_deflate kvm irqbypass input_leds i2c_i801 joydev i2c_core pcspkr e1000e acpi_cpufreq thermal video fan backlight battery processor button xts gf128mul aes_x86_64 cbc sha512_generic sha256_generic sha1_generic libiscsi scsi_transport_iscsi tg3 libphy e1000 fuse nfs lockd grace sunrpc jfs multipath linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey hid_microsoft hid_logitech hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech
[177048.712835]  sl811_hcd xhci_plat_hcd ohci_pci ohci_hcd uhci_hcd aic94xx libsas lpfc crc_t10dif crct10dif_common qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc mptspi mptscsih mptbase atp870u dc395x qla1280 imm parport dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_sl82c105 pata_via pata_jmicron pata_marvell pata_sis pata_netcell pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_artop pata_it821x
[177048.713380]  pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix led_class usbhid usb_storage xhci_pci ehci_pci xhci_hcd ahci ehci_hcd usbcore ptp libahci pps_core usb_common libata
[177048.713612] CPU: 1 PID: 4766 Comm: z_wr_int_0 Tainted: P           O    4.4.6-gentoo #1
[177048.713649] Hardware name: Supermicro X10SLL-F/X10SLL-SF/X10SLL-F/X10SLL-SF, BIOS 1.0a 06/11/2013
[177048.713706] task: ffff880805d7e800 ti: ffff88080a8cc000 task.ti: ffff88080a8cc000
[177048.713741] RIP: 0010:[<ffffffffa10714ef>]  [<ffffffffa10714ef>] dmu_buf_get_blkptr+0x71/0x8b [zfs]
[177048.713803] RSP: 0018:ffff88080a8cfd18  EFLAGS: 00010246
[177048.713833] RAX: 0000000000000000 RBX: ffff8807084c2d60 RCX: 0000000000b60c0d
[177048.713867] RDX: 0000000000000000 RSI: ffff880787531630 RDI: ffff8800c6c12800
[177048.713902] RBP: ffff88080a8cfd28 R08: 0000000000000010 R09: 0000000000000010
[177048.713936] R10: 00000000005294ac R11: ffff8804c453f030 R12: ffff8807c50362e0
[177048.713970] R13: 0000000000000100 R14: 0000000000000000 R15: ffff880076c48de0
[177048.714005] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
[177048.714041] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[177048.714072] CR2: 0000000000000038 CR3: 000000009ad53000 CR4: 00000000001406e0
[177048.714107] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[177048.714141] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[177048.714175] Stack:
[177048.714200]  ffff8806f8039ab8 ffff880809db8000 ffff88080a8cfd38 ffffffffa1068189
[177048.714246]  ffff88080a8cfd68 ffffffffa10f6bc4 00000000005294ac ffff8806f8039ab8
[177048.714292]  0000000000080000 ffff880805d7e800 ffff88080a8cfde8 ffffffffa10fb7bb
[177048.714338] Call Trace:
[177048.714366]  [<ffffffffa1068189>] arc_buf_size+0xd3/0x77d [zfs]
[177048.714404]  [<ffffffffa10f6bc4>] zio_buf_free+0x3d9/0x8cc [zfs]
[177048.714443]  [<ffffffffa10fb7bb>] zio_worst_error+0xd66/0xe73 [zfs]
[177048.714478]  [<ffffffff814d1258>] ? mutex_unlock+0x9/0xb
[177048.714514]  [<ffffffffa10f652a>] ? zil_fini+0x2ec/0x553 [zfs]
[177048.714552]  [<ffffffffa10f7b5b>] zio_execute+0x99/0xcd [zfs]
[177048.714585]  [<ffffffffa0aec0d8>] taskq_create+0x54c/0x6a5 [spl]
[177048.714619]  [<ffffffff8105a3f1>] ? try_to_wake_up+0x22f/0x22f
[177048.714651]  [<ffffffffa0aebeac>] ? taskq_create+0x320/0x6a5 [spl]
[177048.714685]  [<ffffffff81055390>] kthread+0xcd/0xd5
[177048.714715]  [<ffffffff810552c3>] ? kthread_freezable_should_stop+0x43/0x43
[177048.714749]  [<ffffffff814d2e9f>] ret_from_fork+0x3f/0x70
[177048.714780]  [<ffffffff810552c3>] ? kthread_freezable_should_stop+0x43/0x43
[177048.714814] Code: 41 5c 5d c3 55 48 89 e5 41 54 49 89 d4 53 48 89 fb 48 8b 7a 20 e8 83 ce 00 00 48 89 c7 49 8b 84 24 f0 00 00 00 31 d2 48 8b 4b 50 <8b> 40 38 48 f7 b3 f0 02 00 00 48 89 ca 48 63 f0 e8 7a 78 02 00 
[177048.715099] RIP  [<ffffffffa10714ef>] dmu_buf_get_blkptr+0x71/0x8b [zfs]
[177048.715140]  RSP <ffff88080a8cfd18>
[177048.715165] CR2: 0000000000000038
[177048.715636] ---[ end trace e1f1622cce51eb6d ]---
JuliaVixen commented 8 years ago

I hadn't really looked at that dump before I posted it. The crash isn't in dmu_buf_get_blkptr() after all, there's some symbols missing... This function...

    94c8:       55                      push   rbp
    94c9:       48 89 e5                mov    rbp,rsp
    94cc:       41 54                   push   r12
    94ce:       49 89 d4                mov    r12,rdx
    94d1:       53                      push   rbx
    94d2:       48 89 fb                mov    rbx,rdi
    94d5:       48 8b 7a 20             mov    rdi,QWORD PTR [rdx+0x20]
    94d9:       e8 00 00 00 00          call   94de <dmu_buf_get_blkptr+0x60>   94da: R_X86_64_PC32     dmu_objset_pool-0x4
    94de:       48 89 c7                mov    rdi,rax
    94e1:       49 8b 84 24 f0 00 00 00         mov    rax,QWORD PTR [r12+0xf0]
    94e9:       31 d2                   xor    edx,edx
    94eb:       48 8b 4b 50             mov    rcx,QWORD PTR [rbx+0x50]
    94ef:       8b 40 38                mov    eax,DWORD PTR [rax+0x38]
    94f2:       48 f7 b3 f0 02 00 00    div    QWORD PTR [rbx+0x2f0]
    94f9:       48 89 ca                mov    rdx,rcx
    94fc:       48 63 f0                movsxd rsi,eax
    94ff:       e8 00 00 00 00          call   9504 <dmu_buf_get_blkptr+0x86>   9500: R_X86_64_PC32     dsl_pool_undirty_space-0x4
    9504:       5b                      pop    rbx
    9505:       41 5c                   pop    r12
    9507:       5d                      pop    rbp
    9508:       c3                      ret    

... Which might be in dbuf_undirty(), just guessing. I suppose I could try rebuilding this module with debugging symbols intact this time. Though I did this through Gentoo's weird git emerge process, so it's going to be some work to get the exact same binaries out of it.

JuliaVixen commented 8 years ago

I suppose it might help to see what the last thing on the call stack really was. It's not from arc_buf_size(), that's just another bogus symbol. It looks like a call to a function pointer... whatever structure that 173: 48 8b 48 10 mov rcx,QWORD PTR [rax+0x10] is.

00000000000000b6 <arc_buf_size>:
      b6:       48 8b 07                mov    rax,QWORD PTR [rdi]
      b9:       55                      push   rbp
      ba:       48 89 e5                mov    rbp,rsp
      bd:       48 63 40 2c             movsxd rax,DWORD PTR [rax+0x2c]
      c1:       5d                      pop    rbp
      c2:       c3                      ret    

      c3:       48 8b 0d 00 00 00 00    mov    rcx,QWORD PTR [rip+0x0]        # ca <arc_buf_size+0x14>  c6: R_X86_64_PC32       .bss+0x8098c
      ca:       48 8b 05 00 00 00 00    mov    rax,QWORD PTR [rip+0x0]        # d1 <arc_buf_size+0x1b>  cd: R_X86_64_PC32       .bss+0x80994
      d1:       45 31 c0                xor    r8d,r8d
      d4:       48 8b 35 00 00 00 00    mov    rsi,QWORD PTR [rip+0x0]        # db <arc_buf_size+0x25>  d7: R_X86_64_PC32       .bss+0x80974
      db:       55                      push   rbp
      dc:       48 8b 51 48             mov    rdx,QWORD PTR [rcx+0x48]
      e0:       48 03 51 40             add    rdx,QWORD PTR [rcx+0x40]
      e4:       48 89 e5                mov    rbp,rsp
      e7:       48 8b 0d 00 00 00 00    mov    rcx,QWORD PTR [rip+0x0]        # ee <arc_buf_size+0x38>  ea: R_X86_64_PC32       .bss+0x8097c
      ee:       48 03 50 40             add    rdx,QWORD PTR [rax+0x40]
      f2:       48 03 50 48             add    rdx,QWORD PTR [rax+0x48]
      f6:       48 8b 46 48             mov    rax,QWORD PTR [rsi+0x48]
      fa:       48 03 46 40             add    rax,QWORD PTR [rsi+0x40]
      fe:       48 8b 35 00 00 00 00    mov    rsi,QWORD PTR [rip+0x0]        # 105 <arc_buf_size+0x4f> 101: R_X86_64_PC32      .data+0x60c
     105:       48 03 41 40             add    rax,QWORD PTR [rcx+0x40]
     109:       48 03 41 48             add    rax,QWORD PTR [rcx+0x48]
     10d:       48 8b 0d 00 00 00 00    mov    rcx,QWORD PTR [rip+0x0]        # 114 <arc_buf_size+0x5e> 110: R_X86_64_PC32      .data+0x66c
     114:       48 89 cf                mov    rdi,rcx
     117:       48 29 d7                sub    rdi,rdx
     11a:       49 0f 48 f8             cmovs  rdi,r8
     11e:       48 39 f7                cmp    rdi,rsi
     121:       72 05                   jb     128 <arc_buf_size+0x72>
     123:       48 01 d0                add    rax,rdx
     126:       eb 0a                   jmp    132 <arc_buf_size+0x7c>
     128:       48 29 f1                sub    rcx,rsi
     12b:       49 0f 48 c8             cmovs  rcx,r8
     12f:       48 01 c8                add    rax,rcx
     132:       5d                      pop    rbp
     133:       c3                      ret    

     134:       48 8b 15 00 00 00 00    mov    rdx,QWORD PTR [rip+0x0]        # 13b <arc_buf_size+0x85> 137: R_X86_64_PC32      .data+0x5dc
     13b:       8b 0d 00 00 00 00       mov    ecx,DWORD PTR [rip+0x0]        # 141 <arc_buf_size+0x8b> 13d: R_X86_64_PC32      zfs_arc_overflow_shift-0x4
     141:       55                      push   rbp
     142:       48 89 d0                mov    rax,rdx
     145:       48 89 e5                mov    rbp,rsp
     148:       48 d3 e8                shr    rax,cl
     14b:       b9 00 00 00 01          mov    ecx,0x1000000
     150:       48 3d 00 00 00 01       cmp    rax,0x1000000
     156:       48 0f 42 c1             cmovb  rax,rcx
     15a:       48 01 d0                add    rax,rdx
     15d:       48 39 05 00 00 00 00    cmp    QWORD PTR [rip+0x0],rax        # 164 <arc_buf_size+0xae> 160: R_X86_64_PC32      .data+0x66c
     164:       5d                      pop    rbp
     165:       0f 93 c0                setae  al
     168:       0f b6 c0                movzx  eax,al
     16b:       c3                      ret    

     16c:       48 8b 87 60 01 00 00    mov    rax,QWORD PTR [rdi+0x160]  ; <--- from this other structure
     173:       48 8b 48 10             mov    rcx,QWORD PTR [rax+0x10]   ; <----- that'd be a function pointer
     177:       48 85 c9                test   rcx,rcx
     17a:       74 0e                   je     18a <arc_buf_size+0xd4>
     17c:       55                      push   rbp
     17d:       48 8b 70 20             mov    rsi,QWORD PTR [rax+0x20] ; arg
     181:       48 8b 10                mov    rdx,QWORD PTR [rax] ; arg
     184:       48 89 e5                mov    rbp,rsp
     187:       ff d1                   call   rcx   ; <------- That's the last call on the stack (0xB6+0xD3)
     189:       5d                      pop    rbp
     18a:       c3                      ret    

     18b:       55                      push   rbp
     18c:       ff ce                   dec    esi
     18e:       48 8b 97 80 00 00 00    mov    rdx,QWORD PTR [rdi+0x80]
     195:       b8 0d 00 00 00          mov    eax,0xd
     19a:       48 89 e5                mov    rbp,rsp
etc. etc.
tuxoko commented 8 years ago

The function is dbuf_write_physdone. The NULL pointer is db->db_data_pending

000000000000d470 <dbuf_write_physdone>:
    d470:       e8 00 00 00 00          callq  d475 <dbuf_write_physdone+0x5>
    d475:       55                      push   %rbp
    d476:       48 89 e5                mov    %rsp,%rbp
    d479:       41 54                   push   %r12
    d47b:       53                      push   %rbx
    d47c:       48 89 fb                mov    %rdi,%rbx
    d47f:       48 8b 7a 20             mov    0x20(%rdx),%rdi
    d483:       49 89 d4                mov    %rdx,%r12
    d486:       e8 00 00 00 00          callq  d48b <dbuf_write_physdone+0x1b>
    d48b:       48 89 c7                mov    %rax,%rdi
    d48e:       49 8b 84 24 e8 00 00    mov    0xe8(%r12),%rax
    d495:       00 
    d496:       31 d2                   xor    %edx,%edx
    d498:       48 8b 4b 50             mov    0x50(%rbx),%rcx
    d49c:       8b 40 38                mov    0x38(%rax),%eax
    d49f:       48 f7 b3 f0 02 00 00    divq   0x2f0(%rbx)
    d4a6:       48 89 ca                mov    %rcx,%rdx
    d4a9:       48 63 f0                movslq %eax,%rsi
    d4ac:       e8 00 00 00 00          callq  d4b1 <dbuf_write_physdone+0x41>
    d4b1:       5b                      pop    %rbx
    d4b2:       41 5c                   pop    %r12
    d4b4:       5d                      pop    %rbp
    d4b5:       c3                      retq   
    d4b6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    d4bd:       00 00 00
tuxoko commented 8 years ago

@JuliaVixen Are you still getting this error? If it's easily reproducible, you can try configure zfs with debug enable.