"Me too!". At least for the ida_remove called for id=XX which is not allocated
part, and the 2nd call trace above.
Ignoring some hopefully unrelated debug logging per @3c54650, the trace looks like:
[ 5480.299582] ida_remove called for id=22 which is not allocated.
[ 5480.299598] general protection fault: 0000 [#1] SMP
[ 5480.299672] Modules linked in: zfs(POF) zunicode(POF) zavl(POF) zcommon(POF) znvpair(POF) spl(OF) zlib_deflate(F) vesafb(F) aesni_intel(F) aes_x86_64(F) xts(F) lrw(F) gf128mul(F) ablk_helper(F) cryptd(F) microcode(F) psmouse(F) serio_raw(F) virtio_console i2c_piix4 virtio_balloon(F) mac_hid lp(F) parport(F) ext2(F) raid10(F) raid456(F) async_raid6_recov(F) async_memcpy(F) async_pq(F) async_xor(F) xor(F) async_tx(F) floppy(F) virtio_scsi raid6_pq(F) raid1(F) raid0(F) multipath(F) linear(F) [last unloaded: scsi_debug]
[ 5480.300058] CPU 2
[ 5480.300086] Pid: 15978, comm: zfs Tainted: PF O 3.8.0-35-generic #50-Ubuntu Red Hat RHEV Hypervisor
[ 5480.300161] RIP: 0010:[<ffffffff8113ee58>] [<ffffffff8113ee58>] unregister_shrinker+0x28/0x60
[ 5480.300244] RSP: 0018:ffff880047739d90 EFLAGS: 00010246
[ 5480.300285] RAX: dead000000200200 RBX: ffff880003082798 RCX: 0000000000004e33
[ 5480.300332] RDX: dead000000100100 RSI: 0000000000000082 RDI: ffffffff81c45aa0
[ 5480.300379] RBP: ffff880047739d98 R08: 6f6d65725f616469 R09: 656c6c6163206576
[ 5480.300426] R10: 00000000000002b2 R11: 662064656c6c6163 R12: ffffffffa0383340
[ 5480.300473] R13: 000000000000000a R14: 0000000000000b10 R15: ffff88004baa1d18
[ 5480.300520] FS: 00007f6e92f26b80(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 5480.300586] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5480.300627] CR2: 00007f6d577f2060 CR3: 000000003432c000 CR4: 00000000000006e0
[ 5480.300679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5480.300729] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5480.300777] Process zfs (pid: 15978, threadinfo ffff880047738000, task ffff880078a02e80)
[ 5480.302061] Stack:
[ 5480.302528] ffff880003082400 ffff880047739db8 ffffffff81197123 ffff880003082400
[ 5480.302528] ffff88003185cf80 ffff880047739dd0 ffffffff81197cd6 ffff88001c0bf000
[ 5480.302528] ffff880047739de8 ffffffffa02d2896 ffff880050c0d1a0 ffff880047739e18
[ 5480.302528] Call Trace:
[ 5480.302528] [<ffffffff81197123>] deactivate_locked_super+0x63/0x80
[ 5480.302528] [<ffffffff81197cd6>] deactivate_super+0x46/0x60
[ 5480.302528] [<ffffffffa02d2896>] zfs_sb_rele+0x26/0x50 [zfs]
[ 5480.302528] [<ffffffffa02d608d>] zfs_unmount_snap+0xed/0x120 [zfs]
[ 5480.302528] [<ffffffffa02d6220>] zfs_ioc_destroy_snaps+0x70/0xc0 [zfs]
[ 5480.302528] [<ffffffffa02d65c4>] zfsdev_ioctl+0x184/0x5d0 [zfs]
[ 5480.302528] [<ffffffff816d6340>] ? ftrace_call+0x5/0x2f
[ 5480.302528] [<ffffffff811a6659>] do_vfs_ioctl+0x99/0x570
[ 5480.302528] [<ffffffff811a6bc1>] sys_ioctl+0x91/0xb0
[ 5480.302528] [<ffffffff816d671d>] system_call_fastpath+0x1a/0x1f
[ 5480.302528] Code: 0f 1f 00 e8 9b 74 59 00 55 48 89 e5 53 48 89 fb 48 c7 c7 a0 5a c4 81 e8 37 d2 58 00 48 8b 43 20 48 8b 53 18 48 c7 c7 a0 5a c4 81 <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00 ad de 48 89 43 18
[ 5480.302528] RIP [<ffffffff8113ee58>] unregister_shrinker+0x28/0x60
[ 5480.302528] RSP <ffff880047739d90>
[ 5480.316642] ---[ end trace 09b5fb5027971821 ]---
---------------------------------------------------------------------------------------------------------------
For what it's worth, I think what's happening is: zfs_ioc_destroy_snaps() is iterating through the snaps nvlist over at least two entries, one time hitting the ida_remove problem, and a later time hitting the GPF:
static int
zfs_ioc_destroy_snaps(const char *poolname, nvlist_t *innvl, nvlist_t *outnvl)
{
	nvlist_t *snaps;
	nvpair_t *pair;
	boolean_t defer;

	if (nvlist_lookup_nvlist(innvl, "snaps", &snaps) != 0)
		return (SET_ERROR(EINVAL));
	defer = nvlist_exists(innvl, "defer");

	for (pair = nvlist_next_nvpair(snaps, NULL); pair != NULL;
	    pair = nvlist_next_nvpair(snaps, pair)) {
		(void) zfs_unmount_snap(nvpair_name(pair));
		(void) zvol_remove_minor(nvpair_name(pair));
	}

	return (dsl_destroy_snapshots_nvl(snaps, defer, outnvl));
}
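The loop above unmounts each snapshot while deliberately ignoring errors (the (void) casts), then destroys the whole batch. As a userspace sketch of that pattern — with invented names (toy_unmount_snap, toy_destroy_snaps) and a NULL-terminated name array standing in for the real nvlist iteration:

```c
#include <stddef.h>
#include <string.h>

/* Toy model of the zfs_ioc_destroy_snaps() loop: walk a list of
 * snapshot names, unmount each one ignoring failures, then destroy
 * the batch.  All names here are invented; the real code iterates
 * an nvlist with nvlist_next_nvpair(). */
static int unmounted;

static int toy_unmount_snap(const char *name)
{
	if (strchr(name, '@') == NULL)
		return (-1);	/* not a snapshot name: pool/fs@snap */
	unmounted++;
	return (0);
}

static int toy_destroy_snaps(const char *const *snaps)
{
	for (const char *const *p = snaps; *p != NULL; p++)
		(void) toy_unmount_snap(*p);	/* errors ignored, as above */
	return (0);	/* would then call dsl_destroy_snapshots_nvl() */
}
```

Note that because unmount errors are swallowed, nothing in this path notices if a given snapshot's superblock was already torn down by a concurrent caller.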
The ida_remove looks to be coming from zpl_kill_sb (note: .kill_sb is a struct file_system_type member, not super_operations):
struct file_system_type zpl_fs_type = {
	.kill_sb	= zpl_kill_sb,
};
zfs_sb_rele(zsb, FTAG);
  deactivate_super(zsb->z_sb)
    deactivate_locked_super(zsb->z_sb)
      fs->kill_sb(zsb->z_sb)	/* i.e. zpl_kill_sb */
        kill_anon_super(zsb->z_sb)
          free_anon_bdev(zsb->z_sb->s_dev)
            ida_remove(&unnamed_dev_ida, MINOR(zsb->z_sb->s_dev))
              "ida_remove called for id=24 which is not allocated."
Which would mean our MINOR(zsb->z_sb->s_dev) is bad, or has gone away from underneath us somehow.
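For reference, the kernel packs its internal dev_t as major << 20 | minor (MINORBITS is 20), and free_anon_bdev() hands MINOR(dev) back to the ida. A toy userspace model — with invented names (toy_get_anon_minor, toy_free_anon_minor) and a plain bitmap standing in for the real ida — reproduces the "not allocated" warning when the same minor is freed twice:

```c
#include <stdio.h>

/* MINORBITS and the MAJOR/MINOR macros mirror the kernel's internal
 * dev_t encoding; the allocator below is a deliberately simplified
 * stand-in for the unnamed_dev_ida. */
#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)
#define MINOR(dev)	((unsigned)(dev) & MINORMASK)
#define MAJOR(dev)	((unsigned)(dev) >> MINORBITS)

#define MAX_ANON	64
static unsigned char anon_used[MAX_ANON];

/* Allocate the lowest free anonymous minor; -1 when exhausted. */
static int toy_get_anon_minor(void)
{
	for (int i = 0; i < MAX_ANON; i++) {
		if (!anon_used[i]) {
			anon_used[i] = 1;
			return (i);
		}
	}
	return (-1);
}

/* Release a minor; a second release of the same id reproduces the
 * "ida_remove called for id=%d which is not allocated" warning. */
static int toy_free_anon_minor(int id)
{
	if (id < 0 || id >= MAX_ANON || !anon_used[id]) {
		fprintf(stderr,
		    "ida_remove called for id=%d which is not allocated.\n",
		    id);
		return (-1);
	}
	anon_used[id] = 0;
	return (0);
}
```

So a stale or already-released s_dev reaching free_anon_bdev() twice is enough to produce exactly the warning in the trace, without any memory corruption needed at that point.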
For the GPF, in both cases the Code section is similar. The first one, i.e. @nedbass', disassembles to:
Dump of assembler code for function Code:
0x0000000000601040 <+0>: 00 00 add %al,(%rax)
0x0000000000601042 <+2>: 00 e8 add %ch,%al
0x0000000000601044 <+4>: 2b a3 5a 00 55 48 sub 0x4855005a(%rbx),%esp
0x000000000060104a <+10>: 89 e5 mov %esp,%ebp
0x000000000060104c <+12>: 53 push %rbx
0x000000000060104d <+13>: 48 89 fb mov %rdi,%rbx
0x0000000000601050 <+16>: 48 c7 c7 40 a1 c5 81 mov $0xffffffff81c5a140,%rdi
0x0000000000601057 <+23>: e8 c7 ee 59 00 callq 0xb9ff23
0x000000000060105c <+28>: 48 8b 43 20 mov 0x20(%rbx),%rax
0x0000000000601060 <+32>: 48 8b 53 18 mov 0x18(%rbx),%rdx
0x0000000000601064 <+36>: 48 c7 c7 40 a1 c5 81 mov $0xffffffff81c5a140,%rdi
0x000000000060106b <+43>: 48 89 42 08 mov %rax,0x8(%rdx) <<< w/ RAX: dead000000200200 == Boom!
0x000000000060106f <+47>: 48 89 10 mov %rdx,(%rax)
0x0000000000601072 <+50>: 48 b8 00 01 10 00 00 00 ad de movabs $0xdead000000100100,%rax
0x000000000060107c <+60>: 48 89 43 18 mov %rax,0x18(%rbx)
0x0000000000601080 <+64>: 00 00 add %al,(%rax)
Where the <<< marks the actual faulting instruction. The RAX value of dead000000200200 is LIST_POISON2, and the 0xdead000000100100 just after the faulting instruction is LIST_POISON1, from linux/include/linux/poison.h. I.e. it's inlined, but the code comes from:
static inline void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}
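To see the poisoning concretely, here is a userspace sketch of these list primitives, using the same poison constants as linux/include/linux/poison.h (list_add/__list_del simplified from linux/include/linux/list.h). After one list_del(), entry->next and entry->prev hold the non-canonical poison addresses, so a second list_del() would store through them and fault — matching the RAX/RDX values in the register dump above:

```c
#include <stddef.h>

/* Same poison values as the kernel; non-canonical on x86-64, so any
 * dereference of them raises a general protection fault. */
#define LIST_POISON1	((struct list_head *)0xdead000000100100UL)
#define LIST_POISON2	((struct list_head *)0xdead000000200200UL)

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add(struct list_head *item, struct list_head *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

static void __list_del(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;	/* the store that faults when next is poisoned */
	prev->next = next;
}

static void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;	/* second list_del() now dereferences */
	entry->prev = LIST_POISON2;	/* these and takes a GPF */
}
```

The sketch only asserts on the poisoned pointer values; actually calling list_del() a second time would dereference LIST_POISON1 and crash, which is precisely the oops.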
This means the GPF is from:
deactivate_locked_super
  unregister_shrinker(&s->s_shrink);
    list_del(&shrinker->list);
      __list_del(entry->prev, entry->next);
        next->prev = prev;	<<< next == dead000000100100 (LIST_POISON1), prev == dead000000200200 (LIST_POISON2) == Boom!
      entry->next = LIST_POISON1;
This means that at this point the &shrinker->list had already been subjected to list_del(&shrinker->list). Well, so much for what I think is happening: unfortunately I've no idea _why_ it's happening, nor how to fix it.
I would speculate that zfs destroy -r <dataset> is racing with zfs destroy <snapshot>. zfsstress destroys snapshots and recursively destroys datasets in an uncoordinated fashion. That could explain how the device goes away from underneath us.
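If that race is real, one hypothetical mitigation sketch (not the actual ZFS fix; every name below is invented) would be to make the final superblock teardown idempotent behind a lock, so a stray or racing reference drop cannot run kill_sb — and hence unregister_shrinker() and free_anon_bdev() — a second time:

```c
#include <pthread.h>

/* Hypothetical: a "destroyed" flag checked under a mutex, so that a
 * recursive dataset destroy racing with a snapshot destroy tears the
 * superblock down exactly once.  toy_sb and toy_deactivate_super are
 * invented stand-ins, not real kernel or ZFS interfaces. */
struct toy_sb {
	pthread_mutex_t	lock;
	int		active;		/* refcount, like s_active */
	int		destroyed;	/* set once teardown has run */
	int		teardowns;	/* times kill_sb actually ran */
};

static void toy_kill_sb(struct toy_sb *sb)
{
	/* would unregister_shrinker() and free_anon_bdev() here */
	sb->teardowns++;
}

/* Drop one reference; run teardown only on the final drop, and only
 * once even if callers race or double-drop. */
static void toy_deactivate_super(struct toy_sb *sb)
{
	pthread_mutex_lock(&sb->lock);
	if (sb->active > 0 && --sb->active == 0 && !sb->destroyed) {
		sb->destroyed = 1;
		toy_kill_sb(sb);
	}
	pthread_mutex_unlock(&sb->lock);
}
```

In this model a second drop of the last reference is a no-op instead of a double list_del()/ida_remove(); whether the real bug wants this shape of fix, or proper serialization of the two destroy paths higher up, is exactly the open question.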
Closing as stale, I haven't observed this during the automated test in a long time.
The buildbot hit this WARNING and GPF while destroying a snapshot during the zfsstress test.