"Me too!". At least for the ida_remove called for id=XX which is not allocated
part, and the 2nd call trace above.
Ignoring some hopefully unrelated debug logging per @3c54650, the trace looks like:
[ 5480.299582] ida_remove called for id=22 which is not allocated.
[ 5480.299598] general protection fault: 0000 [#1] SMP
[ 5480.299672] Modules linked in: zfs(POF) zunicode(POF) zavl(POF) zcommon(POF) znvpair(POF) spl(OF) zlib_deflate(F) vesafb(F) aesni_intel(F) aes_x86_64(F) xts(F) lrw(F) gf128mul(F) ablk_helper(F) cryptd(F) microcode(F) psmouse(F) serio_raw(F) virtio_console i2c_piix4 virtio_balloon(F) mac_hid lp(F) parport(F) ext2(F) raid10(F) raid456(F) async_raid6_recov(F) async_memcpy(F) async_pq(F) async_xor(F) xor(F) async_tx(F) floppy(F) virtio_scsi raid6_pq(F) raid1(F) raid0(F) multipath(F) linear(F) [last unloaded: scsi_debug]
[ 5480.300058] CPU 2
[ 5480.300086] Pid: 15978, comm: zfs Tainted: PF O 3.8.0-35-generic #50-Ubuntu Red Hat RHEV Hypervisor
[ 5480.300161] RIP: 0010:[<ffffffff8113ee58>] [<ffffffff8113ee58>] unregister_shrinker+0x28/0x60
[ 5480.300244] RSP: 0018:ffff880047739d90 EFLAGS: 00010246
[ 5480.300285] RAX: dead000000200200 RBX: ffff880003082798 RCX: 0000000000004e33
[ 5480.300332] RDX: dead000000100100 RSI: 0000000000000082 RDI: ffffffff81c45aa0
[ 5480.300379] RBP: ffff880047739d98 R08: 6f6d65725f616469 R09: 656c6c6163206576
[ 5480.300426] R10: 00000000000002b2 R11: 662064656c6c6163 R12: ffffffffa0383340
[ 5480.300473] R13: 000000000000000a R14: 0000000000000b10 R15: ffff88004baa1d18
[ 5480.300520] FS: 00007f6e92f26b80(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 5480.300586] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5480.300627] CR2: 00007f6d577f2060 CR3: 000000003432c000 CR4: 00000000000006e0
[ 5480.300679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5480.300729] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5480.300777] Process zfs (pid: 15978, threadinfo ffff880047738000, task ffff880078a02e80)
[ 5480.302061] Stack:
[ 5480.302528] ffff880003082400 ffff880047739db8 ffffffff81197123 ffff880003082400
[ 5480.302528] ffff88003185cf80 ffff880047739dd0 ffffffff81197cd6 ffff88001c0bf000
[ 5480.302528] ffff880047739de8 ffffffffa02d2896 ffff880050c0d1a0 ffff880047739e18
[ 5480.302528] Call Trace:
[ 5480.302528] [<ffffffff81197123>] deactivate_locked_super+0x63/0x80
[ 5480.302528] [<ffffffff81197cd6>] deactivate_super+0x46/0x60
[ 5480.302528] [<ffffffffa02d2896>] zfs_sb_rele+0x26/0x50 [zfs]
[ 5480.302528] [<ffffffffa02d608d>] zfs_unmount_snap+0xed/0x120 [zfs]
[ 5480.302528] [<ffffffffa02d6220>] zfs_ioc_destroy_snaps+0x70/0xc0 [zfs]
[ 5480.302528] [<ffffffffa02d65c4>] zfsdev_ioctl+0x184/0x5d0 [zfs]
[ 5480.302528] [<ffffffff816d6340>] ? ftrace_call+0x5/0x2f
[ 5480.302528] [<ffffffff811a6659>] do_vfs_ioctl+0x99/0x570
[ 5480.302528] [<ffffffff811a6bc1>] sys_ioctl+0x91/0xb0
[ 5480.302528] [<ffffffff816d671d>] system_call_fastpath+0x1a/0x1f
[ 5480.302528] Code: 0f 1f 00 e8 9b 74 59 00 55 48 89 e5 53 48 89 fb 48 c7 c7 a0 5a c4 81 e8 37 d2 58 00 48 8b 43 20 48 8b 53 18 48 c7 c7 a0 5a c4 81 <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00 ad de 48 89 43 18
[ 5480.302528] RIP [<ffffffff8113ee58>] unregister_shrinker+0x28/0x60
[ 5480.302528] RSP <ffff880047739d90>
[ 5480.316642] ---[ end trace 09b5fb5027971821 ]---
---------------------------------------------------------------------------------------------------------------
For what it's worth, I think what's happening is: zfs_ioc_destroy_snaps() is iterating through the snaps nvlist over at least two entries, one time hitting the ida_remove problem, and a later time hitting the GPF:
static int
zfs_ioc_destroy_snaps(const char *poolname, nvlist_t *innvl, nvlist_t *outnvl)
{
	nvlist_t *snaps;
	nvpair_t *pair;
	boolean_t defer;

	if (nvlist_lookup_nvlist(innvl, "snaps", &snaps) != 0)
		return (SET_ERROR(EINVAL));
	defer = nvlist_exists(innvl, "defer");

	for (pair = nvlist_next_nvpair(snaps, NULL); pair != NULL;
	    pair = nvlist_next_nvpair(snaps, pair)) {
		(void) zfs_unmount_snap(nvpair_name(pair));
		(void) zvol_remove_minor(nvpair_name(pair));
	}

	return (dsl_destroy_snapshots_nvl(snaps, defer, outnvl));
}
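The loop above unmounts each snapshot while deliberately ignoring errors (the (void) casts), then destroys the whole batch. As a userspace sketch of that pattern — with invented names (toy_unmount_snap, toy_destroy_snaps) and a NULL-terminated name array standing in for the real nvlist iteration:

```c
#include <stddef.h>
#include <string.h>

/* Toy model of the zfs_ioc_destroy_snaps() loop: walk a list of
 * snapshot names, unmount each one ignoring failures, then destroy
 * the batch.  All names here are invented; the real code iterates
 * an nvlist with nvlist_next_nvpair(). */
static int unmounted;

static int toy_unmount_snap(const char *name)
{
	if (strchr(name, '@') == NULL)
		return (-1);	/* not a snapshot name: pool/fs@snap */
	unmounted++;
	return (0);
}

static int toy_destroy_snaps(const char *const *snaps)
{
	for (const char *const *p = snaps; *p != NULL; p++)
		(void) toy_unmount_snap(*p);	/* errors ignored, as above */
	return (0);	/* would then call dsl_destroy_snapshots_nvl() */
}
```

Note that because unmount errors are swallowed, nothing in this path notices if a given snapshot's superblock was already torn down by a concurrent caller.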
The ida_remove looks to be coming from zpl_kill_sb (note: .kill_sb is a struct file_system_type member, not super_operations):
struct file_system_type zpl_fs_type = {
	.kill_sb	= zpl_kill_sb,
};
zfs_sb_rele(zsb, FTAG);
  deactivate_super(zsb->z_sb)
    deactivate_locked_super(zsb->z_sb)
      fs->kill_sb(zsb->z_sb)	/* i.e. zpl_kill_sb */
        kill_anon_super(zsb->z_sb)
          free_anon_bdev(zsb->z_sb->s_dev)
            ida_remove(&unnamed_dev_ida, MINOR(zsb->z_sb->s_dev))
              "ida_remove called for id=24 which is not allocated."
Which would mean our MINOR(zsb->z_sb->s_dev) is bad, or has gone away from underneath us somehow.
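For reference, the kernel packs its internal dev_t as major << 20 | minor (MINORBITS is 20), and free_anon_bdev() hands MINOR(dev) back to the ida. A toy userspace model — with invented names (toy_get_anon_minor, toy_free_anon_minor) and a plain bitmap standing in for the real ida — reproduces the "not allocated" warning when the same minor is freed twice:

```c
#include <stdio.h>

/* MINORBITS and the MAJOR/MINOR macros mirror the kernel's internal
 * dev_t encoding; the allocator below is a deliberately simplified
 * stand-in for the unnamed_dev_ida. */
#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)
#define MINOR(dev)	((unsigned)(dev) & MINORMASK)
#define MAJOR(dev)	((unsigned)(dev) >> MINORBITS)

#define MAX_ANON	64
static unsigned char anon_used[MAX_ANON];

/* Allocate the lowest free anonymous minor; -1 when exhausted. */
static int toy_get_anon_minor(void)
{
	for (int i = 0; i < MAX_ANON; i++) {
		if (!anon_used[i]) {
			anon_used[i] = 1;
			return (i);
		}
	}
	return (-1);
}

/* Release a minor; a second release of the same id reproduces the
 * "ida_remove called for id=%d which is not allocated" warning. */
static int toy_free_anon_minor(int id)
{
	if (id < 0 || id >= MAX_ANON || !anon_used[id]) {
		fprintf(stderr,
		    "ida_remove called for id=%d which is not allocated.\n",
		    id);
		return (-1);
	}
	anon_used[id] = 0;
	return (0);
}
```

So a stale or already-released s_dev reaching free_anon_bdev() twice is enough to produce exactly the warning in the trace, without any memory corruption needed at that point.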
For the GPF, in both cases the Code section is similar. The first one, i.e. @nedbass', disassembles to:
Dump of assembler code for function Code:
0x0000000000601040 <+0>: 00 00 add %al,(%rax)
0x0000000000601042 <+2>: 00 e8 add %ch,%al
0x0000000000601044 <+4>: 2b a3 5a 00 55 48 sub 0x4855005a(%rbx),%esp
0x000000000060104a <+10>: 89 e5 mov %esp,%ebp
0x000000000060104c <+12>: 53 push %rbx
0x000000000060104d <+13>: 48 89 fb mov %rdi,%rbx
0x0000000000601050 <+16>: 48 c7 c7 40 a1 c5 81 mov $0xffffffff81c5a140,%rdi
0x0000000000601057 <+23>: e8 c7 ee 59 00 callq 0xb9ff23
0x000000000060105c <+28>: 48 8b 43 20 mov 0x20(%rbx),%rax
0x0000000000601060 <+32>: 48 8b 53 18 mov 0x18(%rbx),%rdx
0x0000000000601064 <+36>: 48 c7 c7 40 a1 c5 81 mov $0xffffffff81c5a140,%rdi
0x000000000060106b <+43>: 48 89 42 08 mov %rax,0x8(%rdx) <<< w/ RAX: dead000000200200 == Boom!
0x000000000060106f <+47>: 48 89 10 mov %rdx,(%rax)
0x0000000000601072 <+50>: 48 b8 00 01 10 00 00 00 ad de movabs $0xdead000000100100,%rax
0x000000000060107c <+60>: 48 89 43 18 mov %rax,0x18(%rbx)
0x0000000000601080 <+64>: 00 00 add %al,(%rax)
Where the <<< marks the actual faulting instruction. The RAX value of dead000000200200 is LIST_POISON2, and the 0xdead000000100100 just after the faulting instruction is LIST_POISON1, from linux/include/linux/poison.h. I.e. it's inlined, but the code comes from:
static inline void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}
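To see the poisoning concretely, here is a userspace sketch of these list primitives, using the same poison constants as linux/include/linux/poison.h (list_add/__list_del simplified from linux/include/linux/list.h). After one list_del(), entry->next and entry->prev hold the non-canonical poison addresses, so a second list_del() would store through them and fault — matching the RAX/RDX values in the register dump above:

```c
#include <stddef.h>

/* Same poison values as the kernel; non-canonical on x86-64, so any
 * dereference of them raises a general protection fault. */
#define LIST_POISON1	((struct list_head *)0xdead000000100100UL)
#define LIST_POISON2	((struct list_head *)0xdead000000200200UL)

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add(struct list_head *item, struct list_head *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

static void __list_del(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;	/* the store that faults when next is poisoned */
	prev->next = next;
}

static void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;	/* second list_del() now dereferences */
	entry->prev = LIST_POISON2;	/* these and takes a GPF */
}
```

The sketch only asserts on the poisoned pointer values; actually calling list_del() a second time would dereference LIST_POISON1 and crash, which is precisely the oops.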
This means the GPF is from:
deactivate_locked_super
  unregister_shrinker(&s->s_shrink);
    list_del(&shrinker->list);
      __list_del(entry->prev, entry->next);
        next->prev = prev;	<<< next == dead000000100100 (LIST_POISON1), prev == dead000000200200 (LIST_POISON2) == Boom!
      entry->next = LIST_POISON1;
This means that at this point the &shrinker->list had already been subjected to list_del(&shrinker->list). Well, so much for what I think is happening: unfortunately I've no idea _why_ it's happening, nor how to fix it.
I would speculate that zfs destroy -r <dataset> is racing with zfs destroy <snapshot>. zfsstress destroys snapshots and recursively destroys datasets in an uncoordinated fashion. That could explain how the device goes away from underneath us.
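If that race is real, one hypothetical mitigation sketch (not the actual ZFS fix; every name below is invented) would be to make the final superblock teardown idempotent behind a lock, so a stray or racing reference drop cannot run kill_sb — and hence unregister_shrinker() and free_anon_bdev() — a second time:

```c
#include <pthread.h>

/* Hypothetical: a "destroyed" flag checked under a mutex, so that a
 * recursive dataset destroy racing with a snapshot destroy tears the
 * superblock down exactly once.  toy_sb and toy_deactivate_super are
 * invented stand-ins, not real kernel or ZFS interfaces. */
struct toy_sb {
	pthread_mutex_t	lock;
	int		active;		/* refcount, like s_active */
	int		destroyed;	/* set once teardown has run */
	int		teardowns;	/* times kill_sb actually ran */
};

static void toy_kill_sb(struct toy_sb *sb)
{
	/* would unregister_shrinker() and free_anon_bdev() here */
	sb->teardowns++;
}

/* Drop one reference; run teardown only on the final drop, and only
 * once even if callers race or double-drop. */
static void toy_deactivate_super(struct toy_sb *sb)
{
	pthread_mutex_lock(&sb->lock);
	if (sb->active > 0 && --sb->active == 0 && !sb->destroyed) {
		sb->destroyed = 1;
		toy_kill_sb(sb);
	}
	pthread_mutex_unlock(&sb->lock);
}
```

In this model a second drop of the last reference is a no-op instead of a double list_del()/ida_remove(); whether the real bug wants this shape of fix, or proper serialization of the two destroy paths higher up, is exactly the open question.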
Closing as stale, I haven't observed this during the automated test in a long time.
The buildbot hit this WARNING and GPF while destroying a snapshot during the zfsstress test.