Closed zfsfabien closed 12 years ago
Another crash, but much quicker this time. Using rsync over NFS again, with compression=on and dedup=on.
ifconfig eth0
eth0      Link encap:Ethernet  HWaddr
          inet6 addr: fe80::6e62:6dff:fedc:ca64/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:146181441 errors:0 dropped:0 overruns:0 frame:0
          TX packets:138138341 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:120975123804 (120.9 GB)  TX bytes:193813038437 (193.8 GB)
          Interrupt:40 Base address:0x8000
[ 60.903051] SPL: Loaded module v0.6.0.31, using hostid 0x007f0101
[ 60.905405] zunicode: module license 'CDDL' taints kernel.
[ 60.905413] Disabling lock debugging due to kernel taint
[ 68.809078] ZFS: Loaded module v0.6.0.31, ZFS pool version 28, ZFS filesystem version 5
[ 102.361292] nfsd: last server has exited, flushing export cache
[ 103.574043] svc: failed to register lockdv1 RPC service (errno 97).
[ 103.574220] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 103.574295] NFSD: starting 90-second grace period
[ 678.336007] nfsd: last server has exited, flushing export cache
[ 679.491685] svc: failed to register lockdv1 RPC service (errno 97).
[ 679.491854] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 679.491927] NFSD: starting 90-second grace period
[ 679.603501] nfsd: last server has exited, flushing export cache
[ 680.819051] svc: failed to register lockdv1 RPC service (errno 97).
[ 680.819220] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 680.819291] NFSD: starting 90-second grace period
[ 3315.289411] nfsd: non-standard errno: -75
[ 3316.047270] nfsd: non-standard errno: -75
[ 3316.706336] nfsd: non-standard errno: -75
[ 3317.166289] nfsd: non-standard errno: -75
[ 3317.726667] nfsd: non-standard errno: -75
[14804.390094] usb 3-2: new high speed USB device number 2 using ehci_hcd
[14811.507858] usb 3-2: USB disconnect, device number 2
[44725.710001] INFO: rcu_sched_state detected stall on CPU 0 (t=6000 jiffies)
[44882.020104] INFO: task txg_sync:1685 blocked for more than 120 seconds.
[44882.020247] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[44882.020392] txg_sync D ffff8801cbb44890 0 1685 2 0x00000000
[44882.020408] ffff8801c13a1b10 0000000000000046 ffff8801c13a1b6c ffff8801e7b12a80
[44882.020421] 0000000000012a80 ffff8801c13a1fd8 ffff8801c13a0010 0000000000012a80
[44882.020434] ffff8801c13a1fd8 0000000000012a80 ffff8801cbb544d0 ffff8801cbb444d0
[44882.020447] Call Trace:
[44882.020468] [
top - 06:48:16 up 16:53, 2 users, load average: 28.00, 28.01, 27.98
Tasks: 255 total, 19 running, 236 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 50.0%sy, 0.0%ni, 0.0%id, 49.8%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 6379384k total, 4383480k used, 1995904k free, 79964k buffers
Swap: 376828k total, 0k used, 376828k free, 106292k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3068 root 20 0 19404 1432 948 R 1 0.0 0:00.08 top
3 root 20 0 0 0 0 R 0 0.0 2:13.76 ksoftirqd/0
4 root 20 0 0 0 0 R 0 0.0 26:15.00 kworker/0:0
895 root 20 0 0 0 0 D 0 0.0 0:00.03 jbd2/dm-2-8
1529 root 39 19 0 0 0 R 0 0.0 0:44.77 z_null_int/0
1542 root 39 19 0 0 0 R 0 0.0 0:29.95 z_wr_iss_h/0
1544 root 39 19 0 0 0 R 0 0.0 0:30.26 z_wr_iss_h/2
1546 root 39 19 0 0 0 R 0 0.0 0:30.43 z_wr_iss_h/4
1547 root 39 19 0 0 0 R 0 0.0 4:27.04 z_wr_int/0
1549 root 39 19 0 0 0 R 0 0.0 4:26.92 z_wr_int/2
1551 root 39 19 0 0 0 R 0 0.0 4:32.37 z_wr_int/4
1553 root 39 19 0 0 0 R 0 0.0 4:41.01 z_wr_int/6
1555 root 39 19 0 0 0 R 0 0.0 4:20.73 z_wr_int/8
1557 root 39 19 0 0 0 R 0 0.0 4:28.06 z_wr_int/10
1559 root 39 19 0 0 0 R 0 0.0 4:27.25 z_wr_int/12
1561 root 39 19 0 0 0 R 0 0.0 4:32.24 z_wr_int/14
1563 root 39 19 0 0 0 R 0 0.0 0:21.83 z_wr_int_h/0
1565 root 39 19 0 0 0 R 0 0.0 0:21.80 z_wr_int_h/2
1567 root 39 19 0 0 0 R 0 0.0 0:22.45 z_wr_int_h/4
1684 root 0 -20 0 0 0 R 0 0.0 0:26.25 txg_quiesce
1685 root 0 -20 0 0 0 D 0 0.0 10:00.99 txg_sync
1884 root 20 0 0 0 0 D 0 0.0 12:12.77 nfsd
1885 root 20 0 0 0 0 D 0 0.0 12:13.04 nfsd
1886 root 20 0 0 0 0 D 0 0.0 12:09.94 nfsd
1887 root 20 0 0 0 0 D 0 0.0 12:24.22 nfsd
1888 root 20 0 0 0 0 D 0 0.0 13:25.27 nfsd
1889 root 20 0 0 0 0 D 0 0.0 12:47.74 nfsd
1890 root 20 0 0 0 0 D 0 0.0 12:25.28 nfsd
Linux nfs 3.0.0-8-server #11~lucid1-Ubuntu SMP Wed Aug 17 10:27:02 UTC 2011 x86_64 GNU/Linux
root@nfs:/store/memotech/data# free -m
             total       used       free     shared    buffers     cached
Mem:          6229       4281       1948          0         78        103
-/+ buffers/cache:       4099       2130
Swap:          367          0        367
data# zpool iostat -v 3
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
store       3.81T  1.63T     60    338  4.70M  4.31M
  raidz1    3.81T  1.63T     60    338  4.70M  4.31M
    sda         -      -     30     98  1.57M  2.18M
    sdb         -      -     30    109  1.57M  2.18M
    sdd         -      -     30     98  1.56M  2.18M
cache           -      -      -      -      -      -
  sdc3      31.6G  16.8G      8      5   120K   545K
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
store       3.81T  1.63T      0      0      0      0
  raidz1    3.81T  1.63T      0      0      0      0
    sda         -      -      0      0      0      0
    sdb         -      -      0      0      0      0
    sdd         -      -      0      0      0      0
cache           -      -      -      -      -      -
  sdc3      31.6G  16.8G      0      0      0      0
Thanks for the bug report, we'll look into it.
Here is an update: if the rsync speed is limited to 3MB/s, the bug is not triggered, even after 7 days of rsync over NFS with about 1TB written and 1.5TB read, mostly small files (< 5MB).
I guess the low-powered CPU and the large L2ARC SSD help to trigger that "feature" when the rsync speed is not limited.
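For anyone who wants to repeat the throttled run, here is a minimal sketch of how such a limit could be expressed. The post does not say how the 3MB/s cap was applied; using rsync's `--bwlimit` option (which takes a rate in units of 1024 bytes per second) is an assumption, and the source and destination paths are hypothetical placeholders.

```python
import shlex

def rsync_command(src, dst, limit_mb_per_s):
    """Build an rsync argv that caps throughput at roughly limit_mb_per_s."""
    # rsync's --bwlimit takes a rate in units of 1024 bytes per second,
    # so convert from (decimal) megabytes per second and round down.
    kib_per_s = int(limit_mb_per_s * 1_000_000 / 1024)
    return ["rsync", "-a", f"--bwlimit={kib_per_s}", src, dst]

# Hypothetical paths: a local source tree and an NFS-mounted ZFS dataset.
cmd = rsync_command("/data/src/", "/mnt/nfs/store/", 3)

# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(cmd))
```

The command is only printed here; passing `cmd` to `subprocess.run` would start the actual transfer.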
I was able to reproduce the crash without NFS:
[86910.260005] INFO: rcu_sched_state detected stall on CPU 0 (t=6000 jiffies)
[139364.538026] BUG: unable to handle kernel NULL pointer dereference at (null)
[139364.538129] IP: [
This is likely a duplicate of issue #287
I have tested on better hardware: 6 CPUs and 16GB RAM, no L2ARC, with 1 mirror pool and 1 raidz pool. ZFS speed is good; the Gbps network is the bottleneck. The uptime is very bad, though: with 2 or 3 NFS feeds to the 2 ZFS pools it crashes in less than 1 hour:
3.0.0-12-server #20-Ubuntu SMP Fri Oct 7 16:36:30 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
spl-dkms 0.6.0.34
[ 3099.032006] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
[ 3240.964066] INFO: task fsnotify_mark:68 blocked for more than 120 seconds.
[ 3240.964075] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3240.964081] fsnotify_mark D ffffffff81805120 0 68 2 0x00000000
[ 3240.964094] ffff8804124a7c70 0000000000000046 0000000000000000 0000000000000000
[ 3240.964104] ffff8804124a7fd8 ffff8804124a7fd8 ffff8804124a7fd8 0000000000012a40
[ 3240.964114] ffff880417182e40 ffff88041254ae40 0000000000000000 7fffffffffffffff
[ 3240.964123] Call Trace:
[ 3240.964138] [
Let me know how I can help.
Does your system still work after this happens (except for the balance_pgdat panic, of course)? If it does, it would be interesting to see the output of "zpool events -v" once that zio_wait deadlock has occurred.
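One way to notice that the zio_wait deadlock has occurred is to scan the kernel log for hung-task reports whose stack contains `zio_wait`. The following is only a sketch under assumptions: the helper names are made up for illustration, the sample lines are copied from the traces in this issue, and frames marked with `?` are skipped because the kernel prints them as unreliable.

```python
import re

# "INFO: task NAME:PID blocked for more than N seconds."
TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# A stack frame such as "[] zio_wait+0xeb/0x160 [zfs]"; group 1 is the "?" marker.
FRAME = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

def hung_tasks(log):
    """Return one dict per hung-task report, with its reliable stack frames."""
    tasks = []
    for line in log.splitlines():
        m = TASK.search(line)
        if m:
            tasks.append({"name": m.group(1), "pid": int(m.group(2)), "frames": []})
            continue
        f = FRAME.search(line)
        if f and tasks and not f.group(1):
            tasks[-1]["frames"].append(f.group(2))
    return tasks

def waiting_in(tasks, symbol):
    """List "name:pid" for every hung task with `symbol` in its stack."""
    return [f"{t['name']}:{t['pid']}" for t in tasks if symbol in t["frames"]]

sample = """\
[328814.290846] INFO: task txg_sync:1693 blocked for more than 120 seconds.
[328814.291164] Call Trace:
[328814.291176] [] mutex_lock_slowpath+0xdf/0x160
[328814.291213] [] cv_wait_common+0x80/0xe0 [spl]
[328814.291224] [] ? wake_up_bit+0x40/0x40
[328814.291331] [] zio_wait+0xeb/0x160 [zfs]
[328814.291409] [] spa_sync+0x3db/0x9a0 [zfs]
[328814.291704] INFO: task nfsd:4568 blocked for more than 120 seconds.
[328814.292061] [] cv_wait_common+0x78/0xe0 [spl]
[328814.292174] [] txg_wait_open+0x7b/0xa0 [zfs]
[328814.292239] [] dmu_tx_wait+0xed/0xf0 [zfs]
"""

tasks = hung_tasks(sample)
print("blocked in zio_wait:", waiting_in(tasks, "zio_wait"))
print("blocked in txg_wait_open:", waiting_in(tasks, "txg_wait_open"))
```

On a live system the same functions could be fed the text of `dmesg`; once a task shows up under `zio_wait`, that would be the moment to capture "zpool events -v".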
Closing issue, this should be fixed by the following two commits for #287.
zfsonlinux/zfs@6a95d0b74c2951f0dc82361ea279f64a7349f060
zfsonlinux/spl@b8b6e4c453929596b630fa1cca1ee26a532a2ab4
I have been doing many tests with Ubuntu LTS and kernels from backports.
ii zfs      0.6.0.5-0ubuntu3~maverick1   Native ZFS filesystem utilities for Linux
ii zfs-dkms 0.6.0.30-0ubuntu2~maverick1  Native ZFS filesystem kernel modules for Linux
ii zfs-lib  0.6.0.5-0ubuntu3~maverick1   Native ZFS filesystem libraries for Linux
ZFS is exported over NFS and a large 1400GB data set is rsynced to it. I am using a RAIDZ with 3 disks. L2ARC is enabled with 48GB of SSD. RAM is 6GB. The CPU is an AMD E350 dual core.
All traffic is NFS reads and writes.
root@nfs:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr
          inet6 addr: fe80::6e62:6dff:fedc:ca64/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:847730939 errors:0 dropped:0 overruns:0 frame:0
          TX packets:823901907 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:681564940769 (681.5 GB)  TX bytes:1152252590165 (1.1 TB)
root@nfs:~# uname -a
Linux nfs 3.0.0-8-server #11~lucid1-Ubuntu SMP Wed Aug 17 10:27:02 UTC 2011 x86_64 GNU/Linux
[45270.219666] nfsd: non-standard errno: -75
[146363.704947] nfs ffff8801cee45d10 0000000000000046 ffff8801cee45cc0 0000000300000001
[328814.290381] 0000000000012a80 ffff8801cee45fd8 ffff8801cee44010 0000000000012a80
[328814.290394] ffff8801cee45fd8 0000000000012a80 ffff8801d9feade0 ffff8801d9c996f0
[328814.290406] Call Trace:
[328814.290429] [] ? prepare_to_wait_exclusive+0x60/0x90
[328814.290466] [] cv_wait_common+0x78/0xe0 [spl]
[328814.290478] [] ? wake_up_bit+0x40/0x40
[328814.290504] [] cv_wait+0x13/0x20 [spl]
[328814.290612] [] zio_wait+0xeb/0x160 [zfs]
[328814.290666] [] l2arc_feed_thread+0x64d/0x870 [zfs]
[328814.290723] [] ? arc_release_bp+0x20/0x20 [zfs]
[328814.290745] [] ? thread_create+0x160/0x160 [spl]
[328814.290767] [] thread_generic_wrapper+0x78/0x90 [spl]
[328814.290778] [] kthread+0x96/0xa0
[328814.290791] [] kernel_thread_helper+0x4/0x10
[328814.290803] [] ? kthread_worker_fn+0x190/0x190
[328814.290813] [] ? gs_change+0x13/0x13
[328814.290846] INFO: task txg_sync:1693 blocked for more than 120 seconds.
[328814.290970] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[328814.291115] txg_sync D ffff8801ccd8b1a0 0 1693 2 0x00000000
[328814.291127] ffff8801c2c03bf0 0000000000000046 ffff8801cccd5c08 ffff8801e7b12af0
[328814.291140] 0000000000012a80 ffff8801c2c03fd8 ffff8801c2c02010 0000000000012a80
[328814.291152] ffff8801c2c03fd8 0000000000012a80 ffff8801d9feade0 ffff8801ccd8ade0
[328814.291164] Call Trace:
[328814.291176] [] mutex_lock_slowpath+0xdf/0x160
[328814.291189] [] mutex_lock+0x23/0x40
[328814.291213] [] cv_wait_common+0x80/0xe0 [spl]
[328814.291224] [] ? wake_up_bit+0x40/0x40
[328814.291247] [] __cv_wait+0x13/0x20 [spl]
[328814.291331] [] zio_wait+0xeb/0x160 [zfs]
[328814.291409] [] spa_sync+0x3db/0x9a0 [zfs]
[328814.291421] [] ? autoremove_wake_function+0x16/0x40
[328814.291433] [] ? wake_up+0x53/0x70
[328814.291513] [] txg_sync_thread+0x225/0x3b0 [zfs]
[328814.291525] [] ? kfree+0x100/0x130
[328814.291605] [] ? txg_thread_exit+0x40/0x40 [zfs]
[328814.291627] [] ? thread_create+0x160/0x160 [spl]
[328814.291649] [] thread_generic_wrapper+0x78/0x90 [spl]
[328814.291660] [] kthread+0x96/0xa0
[328814.291671] [] kernel_thread_helper+0x4/0x10
[328814.291682] [] ? kthread_worker_fn+0x190/0x190
[328814.291692] [] ? gs_change+0x13/0x13
[328814.291704] INFO: task nfsd:4568 blocked for more than 120 seconds.
[328814.291822] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[328814.291967] nfsd D ffff8801da64c890 0 4568 2 0x00000000
[328814.291979] ffff880127df3780 0000000000000046 ffff880127df3760 ffffffff8105b470
[328814.291991] 0000000000012a80 ffff880127df3fd8 ffff880127df2010 0000000000012a80
[328814.292002] ffff880127df3fd8 0000000000012a80 ffff8801d98a16f0 ffff8801da64c4d0
[328814.292014] Call Trace:
[328814.292026] [] ? try_to_wake_up+0x230/0x2b0
[328814.292036] [] ? prepare_to_wait_exclusive+0x60/0x90
[328814.292061] [] cv_wait_common+0x78/0xe0 [spl]
[328814.292071] [] ? wake_up_bit+0x40/0x40
[328814.292094] [] __cv_wait+0x13/0x20 [spl]
[328814.292174] [] txg_wait_open+0x7b/0xa0 [zfs]
[328814.292239] [] dmu_tx_wait+0xed/0xf0 [zfs]
[328814.292320] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.292386] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.292397] [] ? _raw_spin_lock+0xe/0x20
[328814.292409] [<ffff ? prepare_to_wait_exclusive+0x60/0x90
[328814.293714] [] cv_wait_common+0x78/0xe0 [spl]
[328814.293724] [] ? wake_up_bit+0x40/0x40
[328814.293747] [] cv_wait+0x13/0x20 [spl]
[328814.293828] [] txg_wait_open+0x7b/0xa0 [zfs]
[328814.293892] [] dmu_tx_wait+0xed/0xf0 [zfs]
[328814.293972] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.294037] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.294049] [] ? _raw_spin_lock+0xe/0x20
[328814.294060] [] ? iput+0x2c/0x50
[328814.294072] [] ? find_acceptable_alias+0x2a/0x130
[328814.294152] [] zpl_write_common+0x52/0x70 [zfs]
[328814.294231] [] zpl_write+0x68/0xa0 [zfs]
[328814.294242] [] ? kmalloc+0xe0/0x150
[328814.294320] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.294332] [] do_loop_readv_writev+0x59/0x90
[328814.294343] [] do_readv_writev+0x1ce/0x1e0
[328814.294417] [] ? rrw_exit+0x3e/0x140 [zfs]
[328814.294497] [] ? zfs_open+0x9f/0x140 [zfs]
[328814.294575] [] ? zpl_open+0x71/0x90 [zfs]
[328814.294653] [] ? zpl_release+0x70/0x70 [zfs]
[328814.294666] [] vfs_writev+0x48/0x60
[328814.294687] [] nfsd_vfs_write+0x100/0x3b0 [nfsd]
[328814.294699] [] ? dentry_open+0x3b/0x50
[328814.294720] [] ? nfsd_open+0x10e/0x1a0 [nfsd]
[328814.294743] [] nfsd_write+0xe7/0x100 [nfsd]
[328814.294766] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd]
[328814.294790] [] nfsd3_proc_write+0xaf/0x140 [nfsd]
[328814.294810] [] nfsd_dispatch+0xfe/0x240 [nfsd]
[328814.294844] [] svc_process_common+0x344/0x640 [sunrpc]
[328814.294858] [] ? try_to_wake_up+0x2b0/0x2b0
[328814.294877] [] ? nfsd_svc+0x120/0x120 [nfsd]
[328814.294909] [] svc_process+0x110/0x160 [sunrpc]
[328814.294928] [] nfsd+0xc5/0x170 [nfsd]
[328814.294939] [] kthread+0x96/0xa0
[328814.294950] [] kernel_thread_helper+0x4/0x10
[328814.294960] [] ? kthread_worker_fn+0x190/0x190
[328814.294971] [] ? gs_change+0x13/0x13
[328814.294979] INFO: task nfsd:4570 blocked for more than 120 seconds.
[328814.295096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[328814.295241] nfsd D ffff8801ccd89ab0 0 4570 2 0x00000000
[328814.295252] ffff8801260a5780 0000000000000046 ffff8801260a5700 ffffffff00000000
[328814.295264] 0000000000012a80 ffff8801260a5fd8 ffff8801260a4010 0000000000012a80
[328814.295275] ffff8801260a5fd8 0000000000012a80 ffff8801dd720000 ffff8801ccd896f0
[328814.295287] Call Trace:
[328814.295311] [] cv_wait_common+0x78/0xe0 [spl]
[328814.295322] [] ? wake_up_bit+0x40/0x40
[328814.295345] [] __cv_wait+0x13/0x20 [spl]
[328814.295425] [] txg_wait_open+0x7b/0xa0 [zfs]
[328814.295490] [] dmu_tx_wait+0xed/0xf0 [zfs]
[328814.295570] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.295634] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.295646] [] ? _raw_spin_lock+0xe/0x20
[328814.295657] [] ? iput+0x2c/0x50
[328814.295669] [] ? find_acceptable_alias+0x2a/0x130
[328814.295748] [] zpl_write_common+0x52/0x70 [zfs]
[328814.295826] [] zpl_write+0x68/0xa0 [zfs]
[328814.295837] [] ? kmalloc+0xe0/0x150
[328814.295914] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.295927] [] do_loop_readv_writev+0x59/0x90
[328814.295938] [] do_readv_writev+0x1ce/0x1e0
[3a057eccd>] dmu_tx_wait+0xed/0xf0 [zfs]
[328814.297182] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.297247] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.297258] [] ? _raw_spin_lock+0xe/0x20
[328814.297269] [] ? iput+0x2c/0x50
[328814.297281] [] ? find_acceptable_alias+0x2a/0x130
[328814.297360] [] zpl_write_common+0x52/0x70 [zfs]
[328814.297440] [] zpl_write+0x68/0xa0 [zfs]
[328814.297451] [] ? kmalloc+0x39/0x150
[328814.297528] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.297540] [] do_loop_readv_writev+0x59/0x90
[328814.297552] [] do_readv_writev+0x1ce/0x1e0
[328814.297626] [] ? rrw_exit+0x3e/0x140 [zfs]
[328814.297705] [] ? zfs_open+0x9f/0x140 [zfs]
[328814.297783] [] ? zpl_open+0x71/0x90 [zfs]
[328814.297860] [] ? zpl_release+0x70/0x70 [zfs]
[328814.297873] [] vfs_writev+0x48/0x60
[328814.297894] [] nfsd_vfs_write+0x100/0x3b0 [nfsd]
[328814.297906] [] ? dentry_open+0x3b/0x50
[328814.297927] [] ? nfsd_open+0x10e/0x1a0 [nfsd]
[328814.297949] [] nfsd_write+0xe7/0x100 [nfsd]
[328814.297972] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd]
[328814.297996] [] nfsd3_proc_write+0xaf/0x140 [nfsd]
[328814.298016] [] nfsd_dispatch+0xfe/0x240 [nfsd]
[328814.298049] [] svc_process_common+0x344/0x640 [sunrpc]
[328814.298063] [] ? try_to_wake_up+0x2b0/0x2b0
[328814.298082] [] ? nfsd_svc+0x120/0x120 [nfsd]
[328814.298114] [] svc_process+0x110/0x160 [sunrpc]
[328814.298133] [] nfsd+0xc5/0x170 [nfsd]
[328814.298144] [] kthread+0x96/0xa0
[328814.298155] [] kernel_thread_helper+0x4/0x10
[328814.298165] [] ? kthread_worker_fn+0x190/0x190
[328814.298176] [] ? gs_change+0x13/0x13
[328814.298184] INFO: task nfsd:4572 blocked for more than 120 seconds.
[328814.298301] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[328814.298446] nfsd D ffff8801ccd8df80 0 4572 2 0x00000000
[328814.298458] ffff880119edb780 0000000000000046 ffff880119edb700 ffff8801c2c01d90
[328814.298469] 0000000000012a80 ffff880119edbfd8 ffff880119eda010 0000000000012a80
[328814.298480] ffff880119edbfd8 0000000000012a80 ffff8801ccd8c4d0 ffff8801ccd8dbc0
[328814.298492] Call Trace:
[328814.298502] [] ? prepare_to_wait_exclusive+0x60/0x90
[328814.298527] [] cv_wait_common+0x78/0xe0 [spl]
[328814.298537] [] ? wake_up_bit+0x40/0x40
[328814.298559] [] cv_wait+0x13/0x20 [spl]
[328814.298639] [] txg_wait_open+0x7b/0xa0 [zfs]
[328814.298703] [] dmu_tx_wait+0xed/0xf0 [zfs]
[328814.298783] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.298798] [] ? find_acceptable_alias+0x2a/0x130
[328814.298877] [] zpl_write_common+0x52/0x70 [zfs]
[328814.298956] [] zpl_write+0x68/0xa0 [zfs]
[328814.298966] [] ? kmalloc+0x39/0x150
[328814.299044] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.299056] [] do_loop_readv_writev+0x59/0x90
[328814.299067] [] do_readv_writev+0x1ce/0x1e0
[328814.299142] [] ? rrw_exit+0x3e/0x140 [zfs]
[328814.299221] [] ? zfs_open+0x9f/0x140 [zfs]
[328814.299300] [] ? zpl_open+0x71/0x90 [zfs]
[328814.299377] [] ? zpl_release+0x70/0x70 [zfs]
[328814.299390] [] vfs_writev+0x48/0x60
[328814.299411] [] nfsd_vfs_write+0x100/0x3b0 [nfsd]
[328814.299423] [] ? dentrt+0xed/0xf0 [zfs]
[328814.300352] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.300417] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.300429] [] ? _raw_spin_lock+0xe/0x20
[328814.300439] [] ? iput+0x2c/0x50
[328814.300452] [] ? find_acceptable_alias+0x2a/0x130
[328814.300531] [] zpl_write_common+0x52/0x70 [zfs]
[328814.300610] [] zpl_write+0x68/0xa0 [zfs]
[328814.300621] [] ? kmalloc+0xe0/0x150
[328814.300699] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.300711] [] do_loop_readv_writev+0x59/0x90
[328814.300722] [] do_readv_writev+0x1ce/0x1e0
[328814.300796] [] ? rrw_exit+0x3e/0x140 [zfs]
[328814.300875] [] ? zfs_open+0x9f/0x140 [zfs]
[328814.300953] [] ? zpl_open+0x71/0x90 [zfs]
[328814.301031] [] ? zpl_release+0x70/0x70 [zfs]
[328814.301043] [] vfs_writev+0x48/0x60
[328814.301064] [] nfsd_vfs_write+0x100/0x3b0 [nfsd]
[328814.301076] [] ? dentry_open+0x3b/0x50
[328814.301096] [] ? nfsd_open+0x10e/0x1a0 [nfsd]
[328814.301119] [] nfsd_write+0xe7/0x100 [nfsd]
[328814.301142] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd]
[328814.301166] [] nfsd3_proc_write+0xaf/0x140 [nfsd]
[328814.301186] [] nfsd_dispatch+0xfe/0x240 [nfsd]
[328814.301219] [] svc_process_common+0x344/0x640 [sunrpc]
[328814.301232] [] ? try_to_wake_up+0x2b0/0x2b0
[328814.301251] [] ? nfsd_svc+0x120/0x120 [nfsd]
[328814.301283] [] svc_process+0x110/0x160 [sunrpc]
[328814.301303] [] nfsd+0xc5/0x170 [nfsd]
[328814.301313] [] kthread+0x96/0xa0
[328814.301324] [] kernel_thread_helper+0x4/0x10
[328814.301334] [] ? kthread_worker_fn+0x190/0x190
[328814.301344] [] ? gs_change+0x13/0x13
[328814.301353] INFO: task nfsd:4574 blocked for more than 120 seconds.
[328814.301475] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[328814.301620] nfsd D ffff8801d98a4890 0 4574 2 0x00000000
[328814.301631] ffff880172409780 0000000000000046 ffff880172409760 ffffffff8105b470
[328814.301643] 0000000000012a80 ffff880172409fd8 ffff880172408010 0000000000012a80
[328814.301655] ffff880172409fd8 0000000000012a80 ffff8801da64ade0 ffff8801d98a44d0
[328814.301666] Call Trace:
[328814.301677] [] ? try_to_wake_up+0x230/0x2b0
[328814.301688] [] ? prepare_to_wait_exclusive+0x60/0x90
[328814.301713] [] cv_wait_common+0x78/0xe0 [spl]
[328814.301723] [] ? wake_up_bit+0x40/0x40
[328814.301745] [] cv_wait+0x13/0x20 [spl]
[328814.301825] [] txg_wait_open+0x7b/0xa0 [zfs]
[328814.301889] [] dmu_tx_wait+0xed/0xf0 [zfs]
[328814.301970] [] zfs_write+0x3b6/0xc90 [zfs]
[328814.302035] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.302046] [] ? _raw_spin_lock+0xe/0x20
[328814.302057] [] ? iput+0x2c/0x50
[328814.302069] [] ? find_acceptable_alias+0x2a/0x130
[328814.302149] [] zpl_write_common+0x52/0x70 [zfs]
[328814.302228] [] zpl_write+0x68/0xa0 [zfs]
[328814.302239] [] ? __kmalloc+0x39/0x150
[328814.302316] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.302328] [] do_loop_readv_writev+0x59/0x90
[328814.302340] [] do_readv_writev+0x1ce/0x1e0
[328814.302414] [] ? rrw_exit+0x3e/0x140 [zfs]
[328814.302493] [] ? zfs_open+0x9f/0x140 [zfs]
[328814.302570] [] ? zpl_open+0x71/0x90 [zfs]
[328814.3026/0xc90 [zfs]
[328814.315141] [] ? dnode_rele+0x54/0x90 [zfs]
[328814.315153] [] ? _raw_spin_lock+0xe/0x20
[328814.315164] [] ? iput+0x2c/0x50
[328814.315177] [] ? find_acceptable_alias+0x2a/0x130
[328814.315256] [] zpl_write_common+0x52/0x70 [zfs]
[328814.315335] [] zpl_write+0x68/0xa0 [zfs]
[328814.315346] [] ? kmalloc+0xe0/0x150
[328814.315424] [] ? zpl_write_common+0x70/0x70 [zfs]
[328814.315436] [] do_loop_readv_writev+0x59/0x90
[328814.315448] [] do_readv_writev+0x1ce/0x1e0
[328814.315522] [] ? rrw_exit+0x3e/0x140 [zfs]
[328814.315602] [] ? zfs_open+0x9f/0x140 [zfs]
[328814.315679] [] ? zpl_open+0x71/0x90 [zfs]
[328814.315757] [] ? zpl_release+0x70/0x70 [zfs]
[328814.315769] [] vfs_writev+0x48/0x60
[328814.315792] [] nfsd_vfs_write+0x100/0x3b0 [nfsd]
[328814.315803] [] ? dentry_open+0x3b/0x50
[328814.315824] [] ? nfsd_open+0x10e/0x1a0 [nfsd]
[328814.315848] [] nfsd_write+0xe7/0x100 [nfsd]
[328814.315870] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd]
[328814.315895] [] nfsd3_proc_write+0xaf/0x140 [nfsd]
[328814.315915] [] nfsd_dispatch+0xfe/0x240 [nfsd]
[328814.315949] [] svc_process_common+0x344/0x640 [sunrpc]
[328814.315964] [] ? try_to_wake_up+0x2b0/0x2b0
[328814.315982] [] ? nfsd_svc+0x120/0x120 [nfsd]
[328814.316014] [] svc_process+0x110/0x160 [sunrpc]
[328814.316034] [] nfsd+0xc5/0x170 [nfsd]
[328814.316044] [] kthread+0x96/0xa0
[328814.316055] [] kernel_thread_helper+0x4/0x10
[328814.316065] [] ? kthread_worker_fn+0x190/0x190
[328814.316076] [] ? gs_change+0x13/0x13
[328869.430002] INFO: rcu_sched_state detected stall on CPU 0 (t=24030 jiffies)
[329049.730002] INFO: rcu_sched_state detected stall on CPU 0 (t=42060 jiffies)
[329230.030002] INFO: rcu_sched_state detected stall on CPU 0 (t=60090 jiffies)
[329410.330002] INFO: rcu_sched_state detected stall on CPU 0 (t=78120 jiffies)
[329590.630002] INFO: rcu_sched_state detected stall on CPU 0 (t=96150 jiffies)
[329770.930002] INFO: rcu_sched_state detected stall on CPU 0 (t=114180 jiffies)
[329951.230002] INFO: rcu_sched_state detected stall on CPU 0 (t=132210 jiffies)
[330131.530001] INFO: rcu_sched_state detected stall on CPU 0 (t=150240 jiffies)
[330311.830002] INFO: rcu_sched_state detected stall on CPU 0 (t=168270 jiffies)
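The `t=... jiffies` values can be turned into wall-clock time using the log itself: consecutive stall reports above are 180.3 seconds apart in timestamp and 18030 jiffies apart in `t`, which implies 100 jiffies per second on this kernel. A small sketch of that arithmetic follows; the function names are illustrative, and the three sample lines are copied from the log above.

```python
import re

STALL = re.compile(
    r"\[\s*(\d+\.\d+)\] INFO: rcu_sched_state detected stall on CPU \d+ \(t=(\d+) jiffies"
)

def stall_samples(log):
    """Return (timestamp_seconds, stall_jiffies) for each RCU stall report."""
    return [(float(ts), int(j)) for ts, j in STALL.findall(log)]

def estimate_hz(samples):
    """Estimate jiffies per second from the first and last stall reports."""
    (t0, j0), (t1, j1) = samples[0], samples[-1]
    return (j1 - j0) / (t1 - t0)

log = """\
[328869.430002] INFO: rcu_sched_state detected stall on CPU 0 (t=24030 jiffies)
[329049.730002] INFO: rcu_sched_state detected stall on CPU 0 (t=42060 jiffies)
[330131.530001] INFO: rcu_sched_state detected stall on CPU 0 (t=150240 jiffies)
"""

samples = stall_samples(log)
hz = estimate_hz(samples)
# Duration of the stall at the time of the last report, in seconds.
stall_seconds = samples[-1][1] / hz
print(f"HZ ~ {hz:.0f}, stalled for ~ {stall_seconds / 60:.0f} minutes")
```

For these three lines the estimate is 100 jiffies per second, which puts the t=150240 report at about 25 minutes into the stall on CPU 0.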
After reboot:
zpool iostat -v 3
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
store       3.66T  1.77T    171     29  11.6M   510K
  raidz1    3.66T  1.77T    171     29  11.6M   510K
    sda         -      -    103      9  3.90M   257K
    sdb         -      -     98      9  3.89M   257K
    sdd         -      -    104      9  3.90M   257K
cache           -      -      -      -      -      -
  sdc3      4.90G  43.5G      1     16   141K  1.86M