openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Unmounting ZFS filesystems frequently hangs for me in git tip and recent git versions #3508

Closed siebenmann closed 8 years ago

siebenmann commented 9 years ago

I have a reliably reproducible situation where unmounting some of my ZFS filesystems will hang in the unmount process, although others can successfully be unmounted. This manifests both during system shutdown/reboot and if I just do the unmounts manually as root. SysRq diagnostics say that umount is active on a CPU, apparently hanging inside zfs_sb_teardown():

Jun 18 18:52:40 hawkwind kernel: [<ffffffffa032160b>] taskq_wait_outstanding+0x5b/0x100 [spl]
Jun 18 18:52:40 hawkwind kernel: [<ffffffffa046d6f0>] zfs_sb_teardown+0x60/0x3c0 [zfs]
Jun 18 18:52:40 hawkwind kernel: [<ffffffffa046daa6>] zfs_umount+0x36/0x100 [zfs]
Jun 18 18:52:40 hawkwind kernel: [<ffffffffa048d02c>] zpl_put_super+0x2c/0x40 [zfs]
Jun 18 18:52:40 hawkwind kernel: [<ffffffff8121df26>] generic_shutdown_super+0x76/0x100
Jun 18 18:52:40 hawkwind kernel: [<ffffffff8121e236>] kill_anon_super+0x16/0x30
Jun 18 18:52:40 hawkwind kernel: [<ffffffffa048d19e>] zpl_kill_sb+0x1e/0x30 [zfs]
Jun 18 18:52:40 hawkwind kernel: [<ffffffff8121e629>] deactivate_locked_super+0x49/0x80
Jun 18 18:52:40 hawkwind kernel: [<ffffffff8121ea9c>] deactivate_super+0x6c/0x80
Jun 18 18:52:40 hawkwind kernel: [<ffffffff8123cc23>] cleanup_mnt+0x43/0xa0
Jun 18 18:52:40 hawkwind kernel: [<ffffffff8123ccd2>] __cleanup_mnt+0x12/0x20
Jun 18 18:52:40 hawkwind kernel: [<ffffffff810ba624>] task_work_run+0xd4/0xf0
Jun 18 18:52:40 hawkwind kernel: [<ffffffff81014cdd>] do_notify_resume+0x8d/0xa0
Jun 18 18:52:40 hawkwind kernel: [<ffffffff817831a3>] int_signal+0x12/0x17

Once this happens to umount, commands like sync also start hanging.

I have at least partial traces of all tasks active at the point that umount hung, and there is nothing else with a kernel stack that runs through the ZFS or SPL modules. The next time around I will try to get a task dump from the very end of shutdown, when there should be almost nothing left and I can probably capture a full trace of all (few) remaining tasks.

There's only one call to taskq_wait_outstanding() in zfs_sb_teardown(), so theoretically the only way I can see for this to happen is if zsb->z_nr_znodes somehow never goes to zero. My first thought was that there may be a code bug here: as far as I can see, nothing forces gcc to reload the value of zsb->z_nr_znodes in the while() loop condition check, so the compiler would be entitled to load it into a register and let it sit. Then if you enter with zsb->z_nr_znodes > 0 and you are unmounting, you will never notice that the in-memory value has gone to zero; you will instead loop forever. Actually, my idea above can't be correct, because as far as I can see gcc has to assume that the function calls made in the loop can change zsb->z_nr_znodes (even though they don't do so directly). Sorry about that.

(Unfortunately I don't know enough about disassembling and mapping compiled kernel modules, so I can't look at my compiled zfs.ko to see if this is what the compiler has done.)

My machine is 64-bit Fedora 21 with gcc-4.9.2-6.fc21.x86_64.

Based on reading the kernel's documentation on memory barriers, this seems like a classic example that needs a compiler barrier. I can't tell whether it needs the general barrier() or if ACCESS_ONCE() is sufficient. I'm probably going to patch a barrier() into my copy of zfs_sb_teardown() and build the result to see if this fixes things for me, but I can't do that immediately.
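To make the concern concrete, here is a minimal userspace sketch of the pattern in question. It uses simplified stand-in names rather than the real zfs_sb_teardown() code, and the ACCESS_ONCE() and barrier() definitions mirror the kernel's so it compiles on its own:

```c
/*
 * Minimal userspace sketch of the pattern discussed above, not the actual
 * zfs_sb_teardown() code: a loop that waits for a counter maintained
 * elsewhere to drop to zero. The struct and helper are simplified
 * stand-ins; the ACCESS_ONCE() and barrier() macros mirror the kernel's.
 */
#define barrier()       __asm__ __volatile__("" : : : "memory")
#define ACCESS_ONCE(x)  (*(volatile __typeof__(x) *)&(x))

struct example_sb {
	long nr_znodes;			/* stand-in for zsb->z_nr_znodes */
};

/* Stand-in for the per-round work (waiting on the iput taskq, etc.). */
static void wait_one_round(struct example_sb *sb)
{
	(void)sb;
}

static void teardown_wait(struct example_sb *sb)
{
	/*
	 * The worry was that, without an annotation, the compiler might
	 * cache sb->nr_znodes in a register and never re-read it. As noted
	 * above, the opaque call in the loop body already forces a reload;
	 * ACCESS_ONCE() (or a barrier()) just makes that intent explicit.
	 */
	while (ACCESS_ONCE(sb->nr_znodes) > 0)
		wait_one_round(sb);
}

int main(void)
{
	struct example_sb sb = { .nr_znodes = 0 };

	teardown_wait(&sb);	/* returns immediately: nothing is pinned */
	return 0;
}
```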

chrisrd commented 9 years ago

@siebenmann You should be able to use gdb zfs.ko then disass /m zfs_sb_teardown to get the source lines with disassembled instructions.

Would you be able to put a debug patch into your system? See: chrisrd/zfs@1647956

What version of ZoL are you running?

siebenmann commented 9 years ago

I'm running git tip, currently on commit 57ae840077e6c061bc57fb4860491f50b604d360; however, this has happened on multiple versions. I am slowly working on bisecting it, but that will probably take a good chunk more time. I will put your debug patch on my system on Monday and see if I can get it to trigger.

(I'm going to modify it slightly so that it breaks out of the loop if rounds gets sufficiently large, because otherwise I predict that my logs will get flooded with that message. I assume that this is relatively harmless, since it's more or less reverting to the pre commit f0da4d15082be83547e5370e4479d3c04b56451d state and that state was working for me.)
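For concreteness, a rough sketch of the kind of bounded-retry change I mean, with hypothetical names and a plain userspace loop rather than the actual debug-patched code:

```c
/*
 * Rough sketch of the "give up after N rounds" modification described
 * above, not the actual debug-patched zfs_sb_teardown(). The struct, the
 * helper, and MAX_TEARDOWN_ROUNDS are illustrative stand-ins.
 */
#include <stdio.h>

#define MAX_TEARDOWN_ROUNDS	10

struct example_sb {
	long nr_znodes;			/* stand-in for zsb->z_nr_znodes */
};

/* Stand-in for one round of waiting on the iput taskq and pruning. */
static void wait_one_round(struct example_sb *sb)
{
	(void)sb;
}

static void teardown_wait_bounded(struct example_sb *sb)
{
	int round = 0;

	while (sb->nr_znodes > 0) {
		if (++round > MAX_TEARDOWN_ROUNDS) {
			/*
			 * Log once and stop rather than spinning forever,
			 * roughly the pre-f0da4d1 behaviour of not waiting
			 * indefinitely.
			 */
			fprintf(stderr,
			    "teardown: giving up after %d rounds, %ld znodes still pinned\n",
			    round, sb->nr_znodes);
			break;
		}
		wait_one_round(sb);
	}
}

int main(void)
{
	/* Simulate one znode that never goes away: we give up after 10 rounds. */
	struct example_sb sb = { .nr_znodes = 1 };

	teardown_wait_bounded(&sb);
	return 0;
}
```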

chrisrd commented 9 years ago

I'd try replacing the taskq_wait_outstanding(..., 0) with taskq_wait(...), i.e. undoing the change done in @c5528b9.

Breaking out of the loop after a sufficiently large number of rounds might work, but it also might mean you run into #3281, the problem @f0da4d1 was designed to fix. Note also that "sufficiently large" is actually pretty low: per comments in #3296, I can't see why it requires more than 2 or maybe 3 rounds - although it's definitely been observed to do 4 rounds in some cases.
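To make the distinction concrete, here is a toy single-threaded model of the difference between the two calls as I understand the SPL semantics; it is deliberately simplified, not the real taskq implementation, and every name is a stand-in:

```c
/*
 * Toy model: waiting only for tasks outstanding at the moment the wait
 * starts can return while follow-on work dispatched during the wait is
 * still pending, whereas waiting for the queue to drain completely cannot.
 */
#include <stdio.h>

struct toy_taskq {
	long dispatched;	/* total tasks ever queued */
	long completed;		/* total tasks finished */
	long chained;		/* tasks that queue one more task when run,
				 * e.g. an iput that triggers another iput */
};

/* Complete one queued task; some tasks dispatch a follow-up. */
static void toy_run_one(struct toy_taskq *tq)
{
	tq->completed++;
	if (tq->chained > 0) {
		tq->chained--;
		tq->dispatched++;
	}
}

/* Analogue of taskq_wait_outstanding(tq, 0): wait only for tasks that
 * were already dispatched when the wait began. */
static void toy_wait_outstanding(struct toy_taskq *tq)
{
	long target = tq->dispatched;

	while (tq->completed < target)
		toy_run_one(tq);
}

/* Analogue of taskq_wait(tq): wait until the queue is completely drained. */
static void toy_wait_all(struct toy_taskq *tq)
{
	while (tq->completed < tq->dispatched)
		toy_run_one(tq);
}

int main(void)
{
	struct toy_taskq tq = { .dispatched = 4, .completed = 0, .chained = 2 };

	toy_wait_outstanding(&tq);
	printf("after wait_outstanding: %ld task(s) still pending\n",
	    tq.dispatched - tq.completed);	/* prints 2 */

	toy_wait_all(&tq);
	printf("after wait_all: %ld task(s) still pending\n",
	    tq.dispatched - tq.completed);	/* prints 0 */
	return 0;
}
```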

siebenmann commented 9 years ago

I've now had this reproduce, this time in a situation where I know which filesystem was involved, so I can identify the specific file:

hawkwind kernel: zfs_sb_teardown: unmounting=1, round 10, z_nr_znodes=1
hawkwind kernel: zfs_sb_teardown: z_all_znodes contains i_ino=1018, i_mode=0100755, i_state=0x0, z_xattr_parent=0

Inode 1018 on this particular filesystem is an executable, but not one that was running at the time or before; it was one that ran several times at startup and wound up exec()'ing another process each time. (Specifically it's djb's 'envdir' program.)

The kernel this happened on is Fedora 21's latest, 4.0.5-200.fc21.x86_64.

behlendorf commented 9 years ago

@siebenmann @chrisrd I've seen this a couple of times myself in the latest master source. The ARC lock contention changes 06358ea16e2ff256acc6376c71539f7211851404 appear to have made it more likely but I don't believe they introduced the issue. I've been able to reproduce it fairly reliably by running a dbench 64 workload.

Unfortunately, I haven't yet had a chance to investigate, but this will be a blocker for us, so the sooner we get to the root cause the better. The two most likely causes for this would be:

siebenmann commented 9 years ago

It may be significant that the filesystem I most frequently see this on is one where I have subdirectories that are bind-mounted in two other places (and the executable whose inode was orphaned was run through one of those bind mounts). Unfortunately I can't remember which order I unmounted things in this time around.

I have seen this for filesystems in my pool that are not bind mounted, though. Just less frequently, at least when I've been testing things by hand.

siebenmann commented 9 years ago

I captured another occurrence, this time with two inodes. Each of them was an inode for something that exec'd other programs: one an executable (a different one) and one a shell script that ended in 'exec ...'. Is it at all possible that there is an inode leak in ZFS in this code path, or is it theoretically entirely in the VFS and so should be just like other filesystems?

chrisrd commented 9 years ago

Apologies - I've been intending to look further into this, particularly trying the dbench 64 to see if I can reproduce the problem locally. Unfortunately it looks like I'll be unable to free up time in the next week or so :-(

Bronek commented 9 years ago

@siebenmann try #3718

behlendorf commented 9 years ago

@Bronek I don't think #3718 will help with this, but it is plausible; we'll have to see. For whatever reason this issue appears to have gone into hiding again. I'm no longer seeing it regularly in the automated testing, so I'm removing it as a blocker for 0.6.5.

snajpa commented 8 years ago

We're hitting this one hard and can reproduce it easily within a few umount attempts (albeit at the cost of downtime on production hardware).

Can we work together to hunt this one down please?

behlendorf commented 8 years ago

@snajpa I'd love to see this one resolved, do you have a reproducer which doesn't impact your production hardware?

ppokorny commented 8 years ago

We have a system that seems to hit this frequently as well. Hang in umount of a second filesystem (not the root fs). Tried to get a crash dump this last time.

We've tried upgrading through multiple versions of ZoL. Currently running:

rpm -qa zfs*

zfs-0.6.5.4-1.el6.x86_64
zfs-dkms-0.6.5.4-1.el6.noarch
zfs-release-1-4.el6.noarch

rpm -qa spl*

spl-dkms-0.6.5.4-1.el6.noarch
spl-0.6.5.4-1.el6.x86_64

on

cat /etc/*release

CentOS release 6.6 (Final)

uname -a

Linux iop11-phys 2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

The servers are an HA pair and this happens when we try to do a failover:

Apr 8 11:22:09 iop11-phys kernel: Call Trace:
Apr 8 11:22:09 iop11-phys kernel: [] ? taskq_wait_outstanding_check+0x3c/0x50 [spl]
Apr 8 11:22:09 iop11-phys kernel: [] ? taskq_wait_outstanding+0x2f/0xc0 [spl]
Apr 8 11:22:09 iop11-phys kernel: [] ? dsl_pool_iput_taskq+0x9/0x10 [zfs]
Apr 8 11:22:09 iop11-phys kernel: [] ? zfs_sb_teardown+0x5c/0x350 [zfs]
Apr 8 11:22:09 iop11-phys kernel: [] ? zfs_umount+0x43/0x110 [zfs]
Apr 8 11:22:09 iop11-phys kernel: [] ? zpl_put_super+0x36/0x50 [zfs]
Apr 8 11:22:09 iop11-phys kernel: [] ? generic_shutdown_super+0x5b/0xe0
Apr 8 11:22:09 iop11-phys kernel: [] ? kill_anon_super+0x16/0x60
Apr 8 11:22:09 iop11-phys kernel: [] ? zpl_kill_sb+0x1e/0x30 [zfs]
Apr 8 11:22:09 iop11-phys kernel: [] ? deactivate_super+0x57/0x80
Apr 8 11:22:09 iop11-phys kernel: [] ? mntput_no_expire+0xbf/0x110
Apr 8 11:22:09 iop11-phys kernel: [] ? sys_umount+0x7b/0x3a0
Apr 8 11:22:09 iop11-phys kernel: [] ? system_call_fastpath+0x16/0x1b

siebenmann commented 8 years ago

For the record: I've continued to see this fairly regularly on the development version (git tip or close to it, since I don't update and reboot my machine every day/changeset). I don't have a clear reproduction but it's reasonably common on reboot.

(I haven't been pushing on this issue because I basically only unmount ZFS filesystems on shutdown and there the hack patch of just giving up eventually deals with the problem. But this is obviously not a good solution for people who need to unmount filesystems while the system is running.)

behlendorf commented 8 years ago

This one is definitely on the short list to resolve before the test tag. The good news is that several tests in the ZFS Test Suite are able to hit it fairly easily, so it really just needs someone with some free time to dig into it.

snajpa commented 8 years ago

Here are some stack traces that I've caught recently since we've deployed ABRT.

:[6031506.076417] Pid: 307941, comm: umount veid: 0 Tainted: P           --L------------    2.6.32-042stab113.11 #1 042stab113_11 Supermicro X9DRFR/X9DRFR
:[6031506.076422] RIP: 0010:[<ffffffff81548fdd>]  [<ffffffff81548fdd>] _spin_unlock_irqrestore+0x1d/0x20
:[6031506.076431] RSP: 0018:ffff8809be4e7d20  EFLAGS: 00000202
:[6031506.076433] RAX: 0000000000000202 RBX: ffff8809be4e7d98 RCX: 000000000000119b
:[6031506.076435] RDX: ffff881074885d80 RSI: 0000000000000202 RDI: 0000000000000202
:[6031506.076437] RBP: ffffffff8100bcce R08: ffff881074885e60 R09: 00157004f211c2f1
:[6031506.076440] R10: 001570078c5df561 R11: 0000000000000001 R12: ffff8809be4e7d18
:[6031506.076442] R13: ffffffff8100bcce R14: ffff880069c0f830 R15: 00156d7888eea11c
:[6031506.076444] FS:  00007fb235a50740(0000) GS:ffff880069ec0000(0000) knlGS:0000000000000000
:[6031506.076447] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
:[6031506.076449] CR2: 00007f811159055b CR3: 000000089eb4e000 CR4: 00000000000427e0
:[6031506.076456] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
:[6031506.076458] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
:[6031506.076461] Process umount (pid: 307941, veid: 0, threadinfo ffff8809be4e4000, task ffff880eb35822c0)
:[6031506.076487]  [<ffffffffa021fc73>] ? taskq_wait_outstanding+0x43/0xf0 [spl]
:[6031506.076540]  [<ffffffffa031518c>] ? zfs_sb_teardown+0x5c/0x2c0 [zfs]
:[6031506.076572]  [<ffffffffa0315683>] ? zfs_umount+0x43/0x110 [zfs]
:[6031506.076604]  [<ffffffffa0335586>] ? zpl_put_super+0x36/0x50 [zfs]
:[6031506.076612]  [<ffffffff811bb37b>] ? generic_shutdown_super+0x7b/0x100
:[6031506.076617]  [<ffffffff811bb490>] ? kill_anon_super+0x40/0x80
:[6031506.076646]  [<ffffffffa0334eee>] ? zpl_kill_sb+0x1e/0x30 [zfs]
:[6031506.076649]  [<ffffffff811bb959>] ? deactivate_super+0x79/0xa0
:[6031506.076654]  [<ffffffff811dd64f>] ? mntput_no_expire+0xbf/0x110
:[6031506.076658]  [<ffffffff811de2d2>] ? sys_umount+0x82/0x3d0
:[6031506.076662]  [<ffffffff8100b1a2>] ? system_call_fastpath+0x16/0x1b
:[6031506.076664] Code: e8 89 5d d6 ff c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 89 fa 66 ff 02 66 66 90 48 89 f7 57 
9d 0f 1f 44 00 00 c9 <c3> 66 90 55 48 89 e5 0f 1f 44 00 00 f0 ff 07 48 89 f7 57 9d 0f 
:[6031506.076687] Call Trace:
:[6031506.076696]  [<ffffffffa021fc73>] ? taskq_wait_outstanding+0x43/0xf0 [spl]
:[6031506.076727]  [<ffffffffa031518c>] ? zfs_sb_teardown+0x5c/0x2c0 [zfs]
:[6031506.076756]  [<ffffffffa0315683>] ? zfs_umount+0x43/0x110 [zfs]
:[6031506.076786]  [<ffffffffa0335586>] ? zpl_put_super+0x36/0x50 [zfs]
:[6031506.076791]  [<ffffffff811bb37b>] ? generic_shutdown_super+0x7b/0x100
:[6031506.076797]  [<ffffffff811bb490>] ? kill_anon_super+0x40/0x80
:[6031506.076824]  [<ffffffffa0334eee>] ? zpl_kill_sb+0x1e/0x30 [zfs]
:[6031506.076828]  [<ffffffff811bb959>] ? deactivate_super+0x79/0xa0
:[6031506.076831]  [<ffffffff811dd64f>] ? mntput_no_expire+0xbf/0x110
:[6031506.076835]  [<ffffffff811de2d2>] ? sys_umount+0x82/0x3d0
:[6031506.076838]  [<ffffffff8100b1a2>] ? system_call_fastpath+0x16/0x1b
:[6031506.076840] Tainting kernel with flag 0xe
:[6031506.076843] Pid: 307941, comm: umount veid: 0 Tainted: P           --L------------    2.6.32-042stab113.11 #1
:[6031506.076845] Call Trace:
:[6031506.076846]  <IRQ>  [<ffffffff8107de51>] ? add_taint+0x71/0x80
:[6031506.076854]  [<ffffffff811052d2>] ? watchdog_timer_fn+0x1f2/0x230
:[6031506.076857]  [<ffffffff811050e0>] ? watchdog_timer_fn+0x0/0x230
:[6031506.076861]  [<ffffffff810afc2e>] ? __run_hrtimer+0x8e/0x1d0
:[6031506.076864]  [<ffffffff8108756d>] ? __do_softirq+0x18d/0x250
:[6031506.076868]  [<ffffffff810affea>] ? hrtimer_interrupt+0xea/0x260
:[6031506.076873]  [<ffffffff81036f7d>] ? local_apic_timer_interrupt+0x3d/0x70
:[6031506.076877]  [<ffffffff8103c589>] ? native_apic_msr_eoi_write+0x19/0x20
:[6031506.076882]  [<ffffffff8154fd95>] ? smp_apic_timer_interrupt+0x45/0x60
:[6031506.076885]  [<ffffffff8100bcd3>] ? apic_timer_interrupt+0x13/0x20
:[6031506.076887]  <EOI>  [<ffffffff81548fdd>] ? _spin_unlock_irqrestore+0x1d/0x20
:[6031506.076898]  [<ffffffffa021fc73>] ? taskq_wait_outstanding+0x43/0xf0 [spl]
:[6031506.076927]  [<ffffffffa031518c>] ? zfs_sb_teardown+0x5c/0x2c0 [zfs]
:[6031506.076956]  [<ffffffffa0315683>] ? zfs_umount+0x43/0x110 [zfs]
:[6031506.076984]  [<ffffffffa0335586>] ? zpl_put_super+0x36/0x50 [zfs]
:[6031506.076990]  [<ffffffff811bb37b>] ? generic_shutdown_super+0x7b/0x100
:[6031506.076995]  [<ffffffff811bb490>] ? kill_anon_super+0x40/0x80
:[6031506.077022]  [<ffffffffa0334eee>] ? zpl_kill_sb+0x1e/0x30 [zfs]
:[6031506.077026]  [<ffffffff811bb959>] ? deactivate_super+0x79/0xa0
:[6031506.077029]  [<ffffffff811dd64f>] ? mntput_no_expire+0xbf/0x110
:[6031506.077033]  [<ffffffff811de2d2>] ? sys_umount+0x82/0x3d0
:[6031506.077036]  [<ffffffff8100b1a2>] ? system_call_fastpath+0x16/0x1b

snajpa commented 8 years ago

Another ABRT report:

abrt_version:   2.0.8
cmdline:        ro root=UUID=b94e64aa-9f57-4148-a1a6-990a5f182444 rd_MD_UUID=3c632323:459d9f60:18d40dd3:c57d7a67 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 crashkernel=161M@0M  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
hostname:       node3.brq.vpsfree.cz
kernel:         3.697010]
last_occurrence: 1468356826
not-reportable: A kernel problem occurred, but your kernel has been tainted (flags:P           ). Kernel maintainers are unable to diagnose tainted reports.
time:           Tue 12 Jul 2016 10:53:46 PM CEST
uid:            0

backtrace:
:[7719433.697010] BUG: soft lockup - CPU#29 stuck for 67s! [umount:39013]
:[7719433.697182] Modules linked in: xt_statistic xt_AUDIT xt_NFLOG nfnetlink_log ipt_ULOG xt_time xt_realm xt_NFQUEUE xt_TPROXY nf_tproxy_core xt_CLASSIFY xt_CONNMARK xt_hashlimit xt_connmark xt_owner xt_iprange xt_physdev xt_connlimit xt_set ip_set nfnetlink xt_CHECKSUM ipmi_devintf xt_policy drbg ansi_cprng ctr twofish_x86_64 twofish_common camellia serpent blowfish cast5 des_generic cbc crypto_null aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 aes_generic xcbc rmd160 sha512_generic xt_mark xt_recent xt_comment xt_MARK xt_pkttype ipt_addrtype sch_sfq cls_u32 sch_htb ipt_MASQUERADE iptable_raw vzethdev pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vziolimit vzdquota ip_gre ip_tunnel kvm_intel kvm veth bridge xfrm6_mode_tunnel xfrm4_mode_tunnel esp6 esp4 af_key ip6table_mangle ip6t_REJECT vzrst vzcpt nfs fscache ip6_queue nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_raw ip6t_ah ip6t_frag ip6t_hbh ip6t_ipv6header ip6t_LOG ip6t_
 rt ip6_
:tables ipt_REDIRECT nf_nat_irc nf_nat_ftp iptable_nat nf_nat xt_helper xt_state xt_conntrack nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 xt_length ipt_LOG xt_hl xt_tcpmss xt_TCPMSS ipt_REJECT xt_DSCP xt_dscp xt_multiport xt_limit iptable_mangle iptable_filter ip_tables fuse tun arc4 ecb ppp_mppe ppp_deflate ppp_async ppp_generic slhc crc_ccitt vzevent nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf vznetdev vzmon vzdev 8021q garp stp llc bonding ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) microcode joydev sg iTCO_wdt iTCO_vendor_support sb_edac edac_core i2c_i801 lpc_ich mfd_core igb i2c_algo_bit i2c_core ptp pps_core power_meter acpi_ipmi ipmi_si ipmi_msghandler ioatdma dca shpchp ext4 jbd2 mbcache raid1 sd_mod crc_t10dif xhci_hcd ahci wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
:[7719433.697344] CPU 29 
:[7719433.697345] Modules linked in: xt_statistic xt_AUDIT xt_NFLOG nfnetlink_log ipt_ULOG xt_time xt_realm xt_NFQUEUE xt_TPROXY nf_tproxy_core xt_CLASSIFY xt_CONNMARK xt_hashlimit xt_connmark xt_owner xt_iprange xt_physdev xt_connlimit xt_set ip_set nfnetlink xt_CHECKSUM ipmi_devintf xt_policy drbg ansi_cprng ctr twofish_x86_64 twofish_common camellia serpent blowfish cast5 des_generic cbc crypto_null aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 aes_generic xcbc rmd160 sha512_generic xt_mark xt_recent xt_comment xt_MARK xt_pkttype ipt_addrtype sch_sfq cls_u32 sch_htb ipt_MASQUERADE iptable_raw vzethdev pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vziolimit vzdquota ip_gre ip_tunnel kvm_intel kvm veth bridge xfrm6_mode_tunnel xfrm4_mode_tunnel esp6 esp4 af_key ip6table_mangle ip6t_REJECT vzrst vzcpt nfs fscache ip6_queue nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_raw ip6t_ah ip6t_frag ip6t_hbh ip6t_ipv6header ip6t_LOG ip6t_
 rt ip6_
:tables ipt_REDIRECT nf_nat_irc nf_nat_ftp iptable_nat nf_nat xt_helper xt_state xt_conntrack nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 xt_length ipt_LOG xt_hl xt_tcpmss xt_TCPMSS ipt_REJECT xt_DSCP xt_dscp xt_multiport xt_limit iptable_mangle iptable_filter ip_tables fuse tun arc4 ecb ppp_mppe ppp_deflate ppp_async ppp_generic slhc crc_ccitt vzevent nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf vznetdev vzmon vzdev 8021q garp stp llc bonding ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) microcode joydev sg iTCO_wdt iTCO_vendor_support sb_edac edac_core i2c_i801 lpc_ich mfd_core igb i2c_algo_bit i2c_core ptp pps_core power_meter acpi_ipmi ipmi_si ipmi_msghandler ioatdma dca shpchp ext4 jbd2 mbcache raid1 sd_mod crc_t10dif xhci_hcd ahci wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
:[7719433.697420] 
:[7719433.697423] Pid: 39013, comm: umount veid: 0 Tainted: P           --L------------    2.6.32-042stab113.11 #1 042stab113_11 Supermicro X10DRi/X10DRi
:[7719433.697426] RIP: 0010:[<ffffffffa0247d17>]  [<ffffffffa0247d17>] taskq_wait_outstanding+0xe7/0xf0 [spl]
:[7719433.697438] RSP: 0018:ffff882f65befda0  EFLAGS: 00000292
:[7719433.697440] RAX: 0000000000000206 RBX: ffff882f65befdd8 RCX: 000000000000697a
:[7719433.697442] RDX: ffff884071cf28c0 RSI: 0000000000000206 RDI: 0000000000000206
:[7719433.697443] RBP: ffffffff8100bcce R08: ffff884071cf29a0 R09: 00000000000010ed
:[7719433.697445] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000206
:[7719433.697447] R13: ffffffff8100bf6e R14: ffff882f65befd18 R15: 0000000000000206
:[7719433.697449] FS:  00007fad47015740(0000) GS:ffff882100f40000(0000) knlGS:0000000000000000
:[7719433.697451] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
:[7719433.697452] CR2: 00007ffec8ba3e70 CR3: 0000003b9d3be000 CR4: 00000000001427e0
:[7719433.697454] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
:[7719433.697456] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
:[7719433.697458] Process umount (pid: 39013, veid: 0, threadinfo ffff882f65bec000, task ffff883dcdfba640)
:[7719433.697460] Stack:
:[7719433.697461]  ffffffffa033d18c ffff882f65befdb8 ffff8808eb2eb000 ffff88124d41d400
:[7719433.697463] <d> ffffffff81c1da40 ffff88124d41d400 0000000000000000 ffff882f65befe08
:[7719433.697466] <d> ffffffffa033d683 ffff883dcdfba640 0000000000000000 ffffffff81c1da40
:[7719433.697469] Call Trace:
:[7719433.697514]  [<ffffffffa033d18c>] ? zfs_sb_teardown+0x5c/0x2c0 [zfs]
:[7719433.697540]  [<ffffffffa033d683>] ? zfs_umount+0x43/0x110 [zfs]
:[7719433.697566]  [<ffffffffa035d586>] ? zpl_put_super+0x36/0x50 [zfs]
:[7719433.697574]  [<ffffffff811bb37b>] ? generic_shutdown_super+0x7b/0x100
:[7719433.697579]  [<ffffffff811bb490>] ? kill_anon_super+0x40/0x80
:[7719433.697602]  [<ffffffffa035ceee>] ? zpl_kill_sb+0x1e/0x30 [zfs]
:[7719433.697605]  [<ffffffff811bb959>] ? deactivate_super+0x79/0xa0
:[7719433.697608]  [<ffffffff811dd64f>] ? mntput_no_expire+0xbf/0x110
:[7719433.697611]  [<ffffffff811de2d2>] ? sys_umount+0x82/0x3d0
:[7719433.697615]  [<ffffffff8100b1a2>] ? system_call_fastpath+0x16/0x1b
:[7719433.697616] Code: 48 89 df 48 89 55 98 e8 c8 12 30 e1 4c 3b 7d 98 73 ba 4c 89 ee 4c 89 f7 e8 97 37 e6 e0 48 83 c4 48 5b 41 5c 41 5d 41 5e 41 5f c9 <c3> 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 0f 
:[7719433.697636] Call Trace:
:[7719433.697660]  [<ffffffffa033d18c>] ? zfs_sb_teardown+0x5c/0x2c0 [zfs]
:[7719433.697684]  [<ffffffffa033d683>] ? zfs_umount+0x43/0x110 [zfs]
:[7719433.697707]  [<ffffffffa035d586>] ? zpl_put_super+0x36/0x50 [zfs]
:[7719433.697711]  [<ffffffff811bb37b>] ? generic_shutdown_super+0x7b/0x100
:[7719433.697716]  [<ffffffff811bb490>] ? kill_anon_super+0x40/0x80
:[7719433.697738]  [<ffffffffa035ceee>] ? zpl_kill_sb+0x1e/0x30 [zfs]
:[7719433.697741]  [<ffffffff811bb959>] ? deactivate_super+0x79/0xa0
:[7719433.697744]  [<ffffffff811dd64f>] ? mntput_no_expire+0xbf/0x110
:[7719433.697747]  [<ffffffff811de2d2>] ? sys_umount+0x82/0x3d0
:[7719433.697749]  [<ffffffff8100b1a2>] ? system_call_fastpath+0x16/0x1b
:[7719433.697751] Tainting kernel with flag 0xe
:[7719433.697753] Pid: 39013, comm: umount veid: 0 Tainted: P           --L------------    2.6.32-042stab113.11 #1
:[7719433.697755] Call Trace:
:[7719433.697756]  <IRQ>  [<ffffffff8107de51>] ? add_taint+0x71/0x80
:[7719433.697763]  [<ffffffff811052d2>] ? watchdog_timer_fn+0x1f2/0x230
:[7719433.697765]  [<ffffffff811050e0>] ? watchdog_timer_fn+0x0/0x230
:[7719433.697769]  [<ffffffff810afc2e>] ? __run_hrtimer+0x8e/0x1d0
:[7719433.697772]  [<ffffffff8108756d>] ? __do_softirq+0x18d/0x250
:[7719433.697774]  [<ffffffff810affea>] ? hrtimer_interrupt+0xea/0x260
:[7719433.697778]  [<ffffffff81036f7d>] ? local_apic_timer_interrupt+0x3d/0x70
:[7719433.697783]  [<ffffffff8154fd95>] ? smp_apic_timer_interrupt+0x45/0x60
:[7719433.697785]  [<ffffffff8100bcd3>] ? apic_timer_interrupt+0x13/0x20
:[7719433.697787]  <EOI>  [<ffffffffa0247d17>] ? taskq_wait_outstanding+0xe7/0xf0 [spl]
:[7719433.697838]  [<ffffffffa033d18c>] ? zfs_sb_teardown+0x5c/0x2c0 [zfs]
:[7719433.697860]  [<ffffffffa033d683>] ? zfs_umount+0x43/0x110 [zfs]
:[7719433.697881]  [<ffffffffa035d586>] ? zpl_put_super+0x36/0x50 [zfs]
:[7719433.697886]  [<ffffffff811bb37b>] ? generic_shutdown_super+0x7b/0x100
:[7719433.697890]  [<ffffffff811bb490>] ? kill_anon_super+0x40/0x80
:[7719433.697910]  [<ffffffffa035ceee>] ? zpl_kill_sb+0x1e/0x30 [zfs]
:[7719433.697913]  [<ffffffff811bb959>] ? deactivate_super+0x79/0xa0
:[7719433.697915]  [<ffffffff811dd64f>] ? mntput_no_expire+0xbf/0x110
:[7719433.697918]  [<ffffffff811de2d2>] ? sys_umount+0x82/0x3d0
:[7719433.697920]  [<ffffffff8100b1a2>] ? system_call_fastpath+0x16/0x1b
:[7719517.762565] BUG: soft lockup - CPU#29 stuck for 67s! [umount:39013]
:[7719517.762749] Modules linked in: xt_statistic xt_AUDIT xt_NFLOG nfnetlink_log ipt_ULOG xt_time xt_realm xt_NFQUEUE xt_TPROXY nf_tproxy_core xt_CLASSIFY xt_CONNMARK xt_hashlimit xt_connmark xt_owner xt_iprange xt_physdev xt_connlimit xt_set ip_set nfnetlink xt_CHECKSUM ipmi_devintf xt_policy drbg ansi_cprng ctr twofish_x86_64 twofish_common camellia serpent blowfish cast5 des_generic cbc crypto_null aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 aes_generic xcbc rmd160 sha512_generic xt_mark xt_recent xt_comment xt_MARK xt_pkttype ipt_addrtype sch_sfq cls_u32 sch_htb ipt_MASQUERADE iptable_raw vzethdev pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vziolimit vzdquota ip_gre ip_tunnel kvm_intel kvm veth bridge xfrm6_mode_tunnel xfrm4_mode_tunnel esp6 esp4 af_key ip6table_mangle ip6t_REJECT vzrst vzcpt nfs fscache ip6_queue nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_raw ip6t_ah ip6t_frag ip6t_hbh ip6t_ipv6header ip6t_LOG ip6t_
 rt ip6_
behlendorf commented 8 years ago

This should now be resolved in master by the following commits. We'll backport these changes to 0.6.5.8 once we've verified no additional improvements are needed.

061460d Fix get_zfs_sb race with concurrent umount
31b6111 Kill zp->z_xattr_parent to prevent pinning
ddae16a xattr dir doesn't get purged during iput

behlendorf commented 8 years ago

Closing. The patches listed above have been applied to 0.6.5.8.