openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.48k stars 1.74k forks source link

Soft lock, duplicate #287 #394

Closed zfsfabien closed 12 years ago

zfsfabien commented 13 years ago

I have been doing many tests with ubuntu LTS and kernels from backport.

ii zfs 0.6.0.5-0ubuntu3~maverick1 Native ZFS filesystem utilities for Linux ii zfs-dkms 0.6.0.30-0ubuntu2~maverick1 Native ZFS filesystem kernel modules for Linux ii zfs-lib 0.6.0.5-0ubuntu3~maverick1 Native ZFS filesystem libraries for Linux

ZFS is exported with NFS and a large data 1400GB set is rsync to it. I am using a RAIDZ with 3 disks. L2ARC is enabled with 48GB of SSD. RAM is 6GB. CPU is an AMD E350 dual core.

All traffic is from NFS R&W.

root@nfs:~# ifconfig eth0 eth0 Link encap:Ethernet HWaddr inet6 addr: fe80::6e62:6dff:fedc:ca64/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:847730939 errors:0 dropped:0 overruns:0 frame:0 TX packets:823901907 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:681564940769 (681.5 GB) TX bytes:1152252590165 (1.1 TB)

root@nfs:~# uname -a Linux nfs 3.0.0-8-server #11~lucid1-Ubuntu SMP Wed Aug 17 10:27:02 UTC 2011 x86_64 GNU/Linux

[45270.219666] nfsd: non-standard errno: -75 [146363.704947] nfs ffff8801cee45d10 0000000000000046 ffff8801cee45cc0 0000000300000001 [328814.290381] 0000000000012a80 ffff8801cee45fd8 ffff8801cee44010 0000000000012a80 [328814.290394] ffff8801cee45fd8 0000000000012a80 ffff8801d9feade0 ffff8801d9c996f0 [328814.290406] Call Trace: [328814.290429] [] ? prepare_to_wait_exclusive+0x60/0x90 [328814.290466] [] cv_wait_common+0x78/0xe0 [spl] [328814.290478] [] ? wake_up_bit+0x40/0x40 [328814.290504] [] cv_wait+0x13/0x20 [spl] [328814.290612] [] zio_wait+0xeb/0x160 [zfs] [328814.290666] [] l2arc_feed_thread+0x64d/0x870 [zfs] [328814.290723] [] ? arc_release_bp+0x20/0x20 [zfs] [328814.290745] [] ? thread_create+0x160/0x160 [spl] [328814.290767] [] thread_generic_wrapper+0x78/0x90 [spl] [328814.290778] [] kthread+0x96/0xa0 [328814.290791] [] kernel_thread_helper+0x4/0x10 [328814.290803] [] ? kthread_worker_fn+0x190/0x190 [328814.290813] [] ? gs_change+0x13/0x13 [328814.290846] INFO: task txg_sync:1693 blocked for more than 120 seconds. [328814.290970] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [328814.291115] txg_sync D ffff8801ccd8b1a0 0 1693 2 0x00000000 [328814.291127] ffff8801c2c03bf0 0000000000000046 ffff8801cccd5c08 ffff8801e7b12af0 [328814.291140] 0000000000012a80 ffff8801c2c03fd8 ffff8801c2c02010 0000000000012a80 [328814.291152] ffff8801c2c03fd8 0000000000012a80 ffff8801d9feade0 ffff8801ccd8ade0 [328814.291164] Call Trace: [328814.291176] [] mutex_lock_slowpath+0xdf/0x160 [328814.291189] [] mutex_lock+0x23/0x40 [328814.291213] [] cv_wait_common+0x80/0xe0 [spl] [328814.291224] [] ? wake_up_bit+0x40/0x40 [328814.291247] [] __cv_wait+0x13/0x20 [spl] [328814.291331] [] zio_wait+0xeb/0x160 [zfs] [328814.291409] [] spa_sync+0x3db/0x9a0 [zfs] [328814.291421] [] ? autoremove_wake_function+0x16/0x40 [328814.291433] [] ? wake_up+0x53/0x70 [328814.291513] [] txg_sync_thread+0x225/0x3b0 [zfs] [328814.291525] [] ? kfree+0x100/0x130 [328814.291605] [] ? txg_thread_exit+0x40/0x40 [zfs] [328814.291627] [] ? thread_create+0x160/0x160 [spl] [328814.291649] [] thread_generic_wrapper+0x78/0x90 [spl] [328814.291660] [] kthread+0x96/0xa0 [328814.291671] [] kernel_thread_helper+0x4/0x10 [328814.291682] [] ? kthread_worker_fn+0x190/0x190 [328814.291692] [] ? gs_change+0x13/0x13 [328814.291704] INFO: task nfsd:4568 blocked for more than 120 seconds. [328814.291822] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [328814.291967] nfsd D ffff8801da64c890 0 4568 2 0x00000000 [328814.291979] ffff880127df3780 0000000000000046 ffff880127df3760 ffffffff8105b470 [328814.291991] 0000000000012a80 ffff880127df3fd8 ffff880127df2010 0000000000012a80 [328814.292002] ffff880127df3fd8 0000000000012a80 ffff8801d98a16f0 ffff8801da64c4d0 [328814.292014] Call Trace: [328814.292026] [] ? try_to_wake_up+0x230/0x2b0 [328814.292036] [] ? prepare_to_wait_exclusive+0x60/0x90 [328814.292061] [] cv_wait_common+0x78/0xe0 [spl] [328814.292071] [] ? wake_up_bit+0x40/0x40 [328814.292094] [] __cv_wait+0x13/0x20 [spl] [328814.292174] [] txg_wait_open+0x7b/0xa0 [zfs] [328814.292239] [] dmu_tx_wait+0xed/0xf0 [zfs] [328814.292320] [] zfs_write+0x3b6/0xc90 [zfs] [328814.292386] [] ? dnode_rele+0x54/0x90 [zfs] [328814.292397] [] ? _raw_spin_lock+0xe/0x20 [328814.292409] [<ffff ? prepare_to_wait_exclusive+0x60/0x90 [328814.293714] [] cv_wait_common+0x78/0xe0 [spl] [328814.293724] [] ? wake_up_bit+0x40/0x40 [328814.293747] [] cv_wait+0x13/0x20 [spl] [328814.293828] [] txg_wait_open+0x7b/0xa0 [zfs] [328814.293892] [] dmu_tx_wait+0xed/0xf0 [zfs] [328814.293972] [] zfs_write+0x3b6/0xc90 [zfs] [328814.294037] [] ? dnode_rele+0x54/0x90 [zfs] [328814.294049] [] ? _raw_spin_lock+0xe/0x20 [328814.294060] [] ? iput+0x2c/0x50 [328814.294072] [] ? find_acceptable_alias+0x2a/0x130 [328814.294152] [] zpl_write_common+0x52/0x70 [zfs] [328814.294231] [] zpl_write+0x68/0xa0 [zfs] [328814.294242] [] ? kmalloc+0xe0/0x150 [328814.294320] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.294332] [] do_loop_readv_writev+0x59/0x90 [328814.294343] [] do_readv_writev+0x1ce/0x1e0 [328814.294417] [] ? rrw_exit+0x3e/0x140 [zfs] [328814.294497] [] ? zfs_open+0x9f/0x140 [zfs] [328814.294575] [] ? zpl_open+0x71/0x90 [zfs] [328814.294653] [] ? zpl_release+0x70/0x70 [zfs] [328814.294666] [] vfs_writev+0x48/0x60 [328814.294687] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [328814.294699] [] ? dentry_open+0x3b/0x50 [328814.294720] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [328814.294743] [] nfsd_write+0xe7/0x100 [nfsd] [328814.294766] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [328814.294790] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [328814.294810] [] nfsd_dispatch+0xfe/0x240 [nfsd] [328814.294844] [] svc_process_common+0x344/0x640 [sunrpc] [328814.294858] [] ? try_to_wake_up+0x2b0/0x2b0 [328814.294877] [] ? nfsd_svc+0x120/0x120 [nfsd] [328814.294909] [] svc_process+0x110/0x160 [sunrpc] [328814.294928] [] nfsd+0xc5/0x170 [nfsd] [328814.294939] [] kthread+0x96/0xa0 [328814.294950] [] kernel_thread_helper+0x4/0x10 [328814.294960] [] ? kthread_worker_fn+0x190/0x190 [328814.294971] [] ? gs_change+0x13/0x13 [328814.294979] INFO: task nfsd:4570 blocked for more than 120 seconds. [328814.295096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [328814.295241] nfsd D ffff8801ccd89ab0 0 4570 2 0x00000000 [328814.295252] ffff8801260a5780 0000000000000046 ffff8801260a5700 ffffffff00000000 [328814.295264] 0000000000012a80 ffff8801260a5fd8 ffff8801260a4010 0000000000012a80 [328814.295275] ffff8801260a5fd8 0000000000012a80 ffff8801dd720000 ffff8801ccd896f0 [328814.295287] Call Trace: [328814.295311] [] cv_wait_common+0x78/0xe0 [spl] [328814.295322] [] ? wake_up_bit+0x40/0x40 [328814.295345] [] __cv_wait+0x13/0x20 [spl] [328814.295425] [] txg_wait_open+0x7b/0xa0 [zfs] [328814.295490] [] dmu_tx_wait+0xed/0xf0 [zfs] [328814.295570] [] zfs_write+0x3b6/0xc90 [zfs] [328814.295634] [] ? dnode_rele+0x54/0x90 [zfs] [328814.295646] [] ? _raw_spin_lock+0xe/0x20 [328814.295657] [] ? iput+0x2c/0x50 [328814.295669] [] ? find_acceptable_alias+0x2a/0x130 [328814.295748] [] zpl_write_common+0x52/0x70 [zfs] [328814.295826] [] zpl_write+0x68/0xa0 [zfs] [328814.295837] [] ? kmalloc+0xe0/0x150 [328814.295914] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.295927] [] do_loop_readv_writev+0x59/0x90 [328814.295938] [] do_readv_writev+0x1ce/0x1e0 [3a057eccd>] dmu_tx_wait+0xed/0xf0 [zfs] [328814.297182] [] zfs_write+0x3b6/0xc90 [zfs] [328814.297247] [] ? dnode_rele+0x54/0x90 [zfs] [328814.297258] [] ? _raw_spin_lock+0xe/0x20 [328814.297269] [] ? iput+0x2c/0x50 [328814.297281] [] ? find_acceptable_alias+0x2a/0x130 [328814.297360] [] zpl_write_common+0x52/0x70 [zfs] [328814.297440] [] zpl_write+0x68/0xa0 [zfs] [328814.297451] [] ? kmalloc+0x39/0x150 [328814.297528] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.297540] [] do_loop_readv_writev+0x59/0x90 [328814.297552] [] do_readv_writev+0x1ce/0x1e0 [328814.297626] [] ? rrw_exit+0x3e/0x140 [zfs] [328814.297705] [] ? zfs_open+0x9f/0x140 [zfs] [328814.297783] [] ? zpl_open+0x71/0x90 [zfs] [328814.297860] [] ? zpl_release+0x70/0x70 [zfs] [328814.297873] [] vfs_writev+0x48/0x60 [328814.297894] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [328814.297906] [] ? dentry_open+0x3b/0x50 [328814.297927] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [328814.297949] [] nfsd_write+0xe7/0x100 [nfsd] [328814.297972] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [328814.297996] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [328814.298016] [] nfsd_dispatch+0xfe/0x240 [nfsd] [328814.298049] [] svc_process_common+0x344/0x640 [sunrpc] [328814.298063] [] ? try_to_wake_up+0x2b0/0x2b0 [328814.298082] [] ? nfsd_svc+0x120/0x120 [nfsd] [328814.298114] [] svc_process+0x110/0x160 [sunrpc] [328814.298133] [] nfsd+0xc5/0x170 [nfsd] [328814.298144] [] kthread+0x96/0xa0 [328814.298155] [] kernel_thread_helper+0x4/0x10 [328814.298165] [] ? kthread_worker_fn+0x190/0x190 [328814.298176] [] ? gs_change+0x13/0x13 [328814.298184] INFO: task nfsd:4572 blocked for more than 120 seconds. [328814.298301] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [328814.298446] nfsd D ffff8801ccd8df80 0 4572 2 0x00000000 [328814.298458] ffff880119edb780 0000000000000046 ffff880119edb700 ffff8801c2c01d90 [328814.298469] 0000000000012a80 ffff880119edbfd8 ffff880119eda010 0000000000012a80 [328814.298480] ffff880119edbfd8 0000000000012a80 ffff8801ccd8c4d0 ffff8801ccd8dbc0 [328814.298492] Call Trace: [328814.298502] [] ? prepare_to_wait_exclusive+0x60/0x90 [328814.298527] [] cv_wait_common+0x78/0xe0 [spl] [328814.298537] [] ? wake_up_bit+0x40/0x40 [328814.298559] [] cv_wait+0x13/0x20 [spl] [328814.298639] [] txg_wait_open+0x7b/0xa0 [zfs] [328814.298703] [] dmu_tx_wait+0xed/0xf0 [zfs] [328814.298783] [] zfs_write+0x3b6/0xc90 [zfs] [328814.298798] [] ? find_acceptable_alias+0x2a/0x130 [328814.298877] [] zpl_write_common+0x52/0x70 [zfs] [328814.298956] [] zpl_write+0x68/0xa0 [zfs] [328814.298966] [] ? kmalloc+0x39/0x150 [328814.299044] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.299056] [] do_loop_readv_writev+0x59/0x90 [328814.299067] [] do_readv_writev+0x1ce/0x1e0 [328814.299142] [] ? rrw_exit+0x3e/0x140 [zfs] [328814.299221] [] ? zfs_open+0x9f/0x140 [zfs] [328814.299300] [] ? zpl_open+0x71/0x90 [zfs] [328814.299377] [] ? zpl_release+0x70/0x70 [zfs] [328814.299390] [] vfs_writev+0x48/0x60 [328814.299411] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [328814.299423] [] ? dentrt+0xed/0xf0 [zfs] [328814.300352] [] zfs_write+0x3b6/0xc90 [zfs] [328814.300417] [] ? dnode_rele+0x54/0x90 [zfs] [328814.300429] [] ? _raw_spin_lock+0xe/0x20 [328814.300439] [] ? iput+0x2c/0x50 [328814.300452] [] ? find_acceptable_alias+0x2a/0x130 [328814.300531] [] zpl_write_common+0x52/0x70 [zfs] [328814.300610] [] zpl_write+0x68/0xa0 [zfs] [328814.300621] [] ? kmalloc+0xe0/0x150 [328814.300699] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.300711] [] do_loop_readv_writev+0x59/0x90 [328814.300722] [] do_readv_writev+0x1ce/0x1e0 [328814.300796] [] ? rrw_exit+0x3e/0x140 [zfs] [328814.300875] [] ? zfs_open+0x9f/0x140 [zfs] [328814.300953] [] ? zpl_open+0x71/0x90 [zfs] [328814.301031] [] ? zpl_release+0x70/0x70 [zfs] [328814.301043] [] vfs_writev+0x48/0x60 [328814.301064] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [328814.301076] [] ? dentry_open+0x3b/0x50 [328814.301096] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [328814.301119] [] nfsd_write+0xe7/0x100 [nfsd] [328814.301142] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [328814.301166] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [328814.301186] [] nfsd_dispatch+0xfe/0x240 [nfsd] [328814.301219] [] svc_process_common+0x344/0x640 [sunrpc] [328814.301232] [] ? try_to_wake_up+0x2b0/0x2b0 [328814.301251] [] ? nfsd_svc+0x120/0x120 [nfsd] [328814.301283] [] svc_process+0x110/0x160 [sunrpc] [328814.301303] [] nfsd+0xc5/0x170 [nfsd] [328814.301313] [] kthread+0x96/0xa0 [328814.301324] [] kernel_thread_helper+0x4/0x10 [328814.301334] [] ? kthread_worker_fn+0x190/0x190 [328814.301344] [] ? gs_change+0x13/0x13 [328814.301353] INFO: task nfsd:4574 blocked for more than 120 seconds. [328814.301475] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [328814.301620] nfsd D ffff8801d98a4890 0 4574 2 0x00000000 [328814.301631] ffff880172409780 0000000000000046 ffff880172409760 ffffffff8105b470 [328814.301643] 0000000000012a80 ffff880172409fd8 ffff880172408010 0000000000012a80 [328814.301655] ffff880172409fd8 0000000000012a80 ffff8801da64ade0 ffff8801d98a44d0 [328814.301666] Call Trace: [328814.301677] [] ? try_to_wake_up+0x230/0x2b0 [328814.301688] [] ? prepare_to_wait_exclusive+0x60/0x90 [328814.301713] [] cv_wait_common+0x78/0xe0 [spl] [328814.301723] [] ? wake_up_bit+0x40/0x40 [328814.301745] [] cv_wait+0x13/0x20 [spl] [328814.301825] [] txg_wait_open+0x7b/0xa0 [zfs] [328814.301889] [] dmu_tx_wait+0xed/0xf0 [zfs] [328814.301970] [] zfs_write+0x3b6/0xc90 [zfs] [328814.302035] [] ? dnode_rele+0x54/0x90 [zfs] [328814.302046] [] ? _raw_spin_lock+0xe/0x20 [328814.302057] [] ? iput+0x2c/0x50 [328814.302069] [] ? find_acceptable_alias+0x2a/0x130 [328814.302149] [] zpl_write_common+0x52/0x70 [zfs] [328814.302228] [] zpl_write+0x68/0xa0 [zfs] [328814.302239] [] ? __kmalloc+0x39/0x150 [328814.302316] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.302328] [] do_loop_readv_writev+0x59/0x90 [328814.302340] [] do_readv_writev+0x1ce/0x1e0 [328814.302414] [] ? rrw_exit+0x3e/0x140 [zfs] [328814.302493] [] ? zfs_open+0x9f/0x140 [zfs] [328814.302570] [] ? zpl_open+0x71/0x90 [zfs] [328814.3026/0xc90 [zfs] [328814.315141] [] ? dnode_rele+0x54/0x90 [zfs] [328814.315153] [] ? _raw_spin_lock+0xe/0x20 [328814.315164] [] ? iput+0x2c/0x50 [328814.315177] [] ? find_acceptable_alias+0x2a/0x130 [328814.315256] [] zpl_write_common+0x52/0x70 [zfs] [328814.315335] [] zpl_write+0x68/0xa0 [zfs] [328814.315346] [] ? kmalloc+0xe0/0x150 [328814.315424] [] ? zpl_write_common+0x70/0x70 [zfs] [328814.315436] [] do_loop_readv_writev+0x59/0x90 [328814.315448] [] do_readv_writev+0x1ce/0x1e0 [328814.315522] [] ? rrw_exit+0x3e/0x140 [zfs] [328814.315602] [] ? zfs_open+0x9f/0x140 [zfs] [328814.315679] [] ? zpl_open+0x71/0x90 [zfs] [328814.315757] [] ? zpl_release+0x70/0x70 [zfs] [328814.315769] [] vfs_writev+0x48/0x60 [328814.315792] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [328814.315803] [] ? dentry_open+0x3b/0x50 [328814.315824] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [328814.315848] [] nfsd_write+0xe7/0x100 [nfsd] [328814.315870] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [328814.315895] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [328814.315915] [] nfsd_dispatch+0xfe/0x240 [nfsd] [328814.315949] [] svc_process_common+0x344/0x640 [sunrpc] [328814.315964] [] ? try_to_wake_up+0x2b0/0x2b0 [328814.315982] [] ? nfsd_svc+0x120/0x120 [nfsd] [328814.316014] [] svc_process+0x110/0x160 [sunrpc] [328814.316034] [] nfsd+0xc5/0x170 [nfsd] [328814.316044] [] kthread+0x96/0xa0 [328814.316055] [] kernel_thread_helper+0x4/0x10 [328814.316065] [] ? kthread_worker_fn+0x190/0x190 [328814.316076] [] ? gs_change+0x13/0x13 [328869.430002] INFO: rcu_sched_state detected stall on CPU 0 (t=24030 jiffies) [329049.730002] INFO: rcu_sched_state detected stall on CPU 0 (t=42060 jiffies) [329230.030002] INFO: rcu_sched_state detected stall on CPU 0 (t=60090 jiffies) [329410.330002] INFO: rcu_sched_state detected stall on CPU 0 (t=78120 jiffies) [329590.630002] INFO: rcu_sched_state detected stall on CPU 0 (t=96150 jiffies) [329770.930002] INFO: rcu_sched_state detected stall on CPU 0 (t=114180 jiffies) [329951.230002] INFO: rcu_sched_state detected stall on CPU 0 (t=132210 jiffies) [330131.530001] INFO: rcu_sched_state detected stall on CPU 0 (t=150240 jiffies) [330311.830002] INFO: rcu_sched_state detected stall on CPU 0 (t=168270 jiffies

After reboot: zpool iostat -v 3 capacity operations bandwidth pool alloc free read write read write


store 3.66T 1.77T 171 29 11.6M 510K raidz1 3.66T 1.77T 171 29 11.6M 510K sda - - 103 9 3.90M 257K sdb - - 98 9 3.89M 257K sdd - - 104 9 3.90M 257K cache - - - - - - sdc3 4.90G 43.5G 1 16 141K 1.86M


zfsfabien commented 13 years ago

Another crash but much quicker this time. Using rsync over nfs again. Compress=on and dedup=on

ifconfig eth0 eth0 Link encap:Ethernet HWaddr inet6 addr: fe80::6e62:6dff:fedc:ca64/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:146181441 errors:0 dropped:0 overruns:0 frame:0 TX packets:138138341 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:120975123804 (120.9 GB) TX bytes:193813038437 (193.8 GB) Interrupt:40 Base address:0x8000

[ 60.903051] SPL: Loaded module v0.6.0.31, using hostid 0x007f0101 [ 60.905405] zunicode: module license 'CDDL' taints kernel. [ 60.905413] Disabling lock debugging due to kernel taint [ 68.809078] ZFS: Loaded module v0.6.0.31, ZFS pool version 28, ZFS filesystem version 5 [ 102.361292] nfsd: last server has exited, flushing export cache [ 103.574043] svc: failed to register lockdv1 RPC service (errno 97). [ 103.574220] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 103.574295] NFSD: starting 90-second grace period [ 678.336007] nfsd: last server has exited, flushing export cache [ 679.491685] svc: failed to register lockdv1 RPC service (errno 97). [ 679.491854] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 679.491927] NFSD: starting 90-second grace period [ 679.603501] nfsd: last server has exited, flushing export cache [ 680.819051] svc: failed to register lockdv1 RPC service (errno 97). [ 680.819220] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 680.819291] NFSD: starting 90-second grace period [ 3315.289411] nfsd: non-standard errno: -75 [ 3316.047270] nfsd: non-standard errno: -75 [ 3316.706336] nfsd: non-standard errno: -75 [ 3317.166289] nfsd: non-standard errno: -75 [ 3317.726667] nfsd: non-standard errno: -75 [14804.390094] usb 3-2: new high speed USB device number 2 using ehci_hcd [14811.507858] usb 3-2: USB disconnect, device number 2 [44725.710001] INFO: rcu_sched_state detected stall on CPU 0 (t=6000 jiffies) [44882.020104] INFO: task txg_sync:1685 blocked for more than 120 seconds. [44882.020247] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [44882.020392] txg_sync D ffff8801cbb44890 0 1685 2 0x00000000 [44882.020408] ffff8801c13a1b10 0000000000000046 ffff8801c13a1b6c ffff8801e7b12a80 [44882.020421] 0000000000012a80 ffff8801c13a1fd8 ffff8801c13a0010 0000000000012a80 [44882.020434] ffff8801c13a1fd8 0000000000012a80 ffff8801cbb544d0 ffff8801cbb444d0 [44882.020447] Call Trace: [44882.020468] [] mutex_lock_slowpath+0xdf/0x160 [44882.020482] [] mutex_lock+0x23/0x40 [44882.020516] [] cv_wait_common+0x80/0xe0 [spl] [44882.020529] [] ? wake_up_bit+0x40/0x40 [44882.020555] [] cv_wait+0x13/0x20 [spl] [44882.020661] [] zio_wait+0xeb/0x160 [zfs] [44882.020743] [] vdev_uberblock_sync_list+0x89/0x100 [zfs] [44882.020824] [] vdev_config_sync+0x13a/0x160 [zfs] [44882.020903] [] spa_sync+0x726/0x9a0 [zfs] [44882.020915] [] ? autoremove_wake_function+0x16/0x40 [44882.020927] [] ? wake_up+0x53/0x70 [44882.021006] [] txg_sync_thread+0x225/0x3b0 [zfs] [44882.021019] [] ? kfree+0x35/0x130 [44882.021097] [] ? txg_thread_exit+0x40/0x40 [zfs] [44882.021120] [] ? thread_create+0x160/0x160 [spl] [44882.021142] [] thread_generic_wrapper+0x78/0x90 [spl] [44882.021153] [] kthread+0x96/0xa0 [44882.021165] [] kernel_thread_helper+0x4/0x10 [44882.021176] [] ? kthread_worker_fn+0x190/0x190 [44882.021186] [] ? gs_change+0x13/0x13 [44882.021197] INFO: task nfsd:1885 blocked for more than 120 seconds. [44882.021314] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [44882.021457] nfsd D ffff8801cbb1df80 0 1885 2 0x00000000 [44882.021469] ffff8801a75d9780 0000000000000046 ffff8801a75d9700 ffff8801c139fd90 [44882.021481] 0000000000012a80 ffff8801a75d9fd8 ffff8801a75d8010 0000000000012a80 [44882.021493] ffff8801a75d9fd8 0000000000012a80 ffff8801d9a016f0 ffff8801cbb1dbc0 [44882.021505] Call Trace: [44882.021515] [] ? prepare_to_wait_exclusive+0x60/0x90 [44882.021540] [] cv_wait_common+0x78/0xe0 [spl] [44882.021550] [] ? wake_up_bit+0x40/0x40 [44882.021573] [] cv_wait+0x13/0x20 [spl] [44882.021654] [] txg_wait_open+0x7b/0xa0 [zfs] [44882.021718] [] dmu_tx_wait+0xed/0xf0 [zfs] [44882.021799] [] zfs_write+0x3b6/0xc90 [zfs] [44882.021816] [] ? find_acceptable_alias+0x2a/0x130 [44882.021895] [] zpl_write_common+0x52/0x70 [zfs] [44882.021974] [] zpl_write+0x68/0xa0 [zfs] [44882.022052] [] ? zpl_write_common+0x70/0x70 [zfs] [44882.022065] [] do_loop_readv_writev+0x59/0x90 [44882.022077] [] do_readv_writev+0x1ce/0x1e0 [44882.022155] [] ? zpl_open+0x71/0x90 [zfs] [44882.022233] [] ? zpl_release+0x70/0x70 [zfs] [44882.022246] [] vfs_writev+0x48/0x60 [44882.022272] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [44882.022284] [] ? dentry_open+0x3b/0x50 [44882.022306] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [44882.022332] [] nfsd_write+0xe7/0x100 [nfsd] [44882.022356] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [44882.022380] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [44882.022401] [] nfsd_dispatch+0xfe/0x240 [nfsd] [44882.022443] [] svc_process_common+0x344/0x640 [sunrpc] [44882.022458] [] ? try_to_wake_up+0x2b0/0x2b0 [44882.022477] [] ? nfsd_svc+0x120/0x120 [nfsd] [44882.022509] [] svc_process+0x110/0x160 [sunrpc] [44882.022529] [] nfsd+0xc5/0x170 [nfsd] [44882.022539] [] kthread+0x96/0xa0 [44882.022551] [] kernel_thread_helper+0x4/0x10 [44882.022561] [] ? kthread_worker_fn+0x190/0x190 [44882.022572] [] ? gs_change+0x13/0x13 [44882.022580] INFO: task nfsd:1887 blocked for more than 120 seconds. [44882.022697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [44882.022839] nfsd D ffff8801d988b1a0 0 1887 2 0x00000000 [44882.022852] ffff880129edd780 0000000000000046 ffff880129edd700 ffff880100000000 [44882.022863] 0000000000012a80 ffff880129eddfd8 ffff880129edc010 0000000000012a80 [44882.022875] ffff880129eddfd8 0000000000012a80 ffff8801dd720000 ffff8801d988ade0 [44882.022886] Call Trace: [44882.022911] [] cv_wait_common+0x78/0xe0 [spl] [44882.022921] [] ? wake_up_bit+0x40/0x40 [44882.022944] [] cv_wait+0x13/0x20 [spl] [44882.023024] [] txg_wait_open+0x7b/0xa0 [zfs] [44882.023088] [] dmu_tx_wait+0xed/0xf0 [zfs] [44882.023169] [] zfs_write+0x3b6/0xc90 [zfs] [44882.023185] [] ? find_acceptable_alias+0x2a/0x130 [44882.023264] [] zpl_write_common+0x52/0x70 [zfs] [44882.023343] [] zpl_write+0x68/0xa0 [zfs] [44882.023354] [] ? kmalloc+0x39/0x150 [44882.023432] [] ? zpl_write_common+0x70/0x70 [zfs] [44882.023444] [] do_loop_readv_writev+0x59/0x90 [44882.023456] [] do_readv_writev+0x1ce/0x1e0 [44882.023530] [] ? rrw_exit+0x3e/0x140 [zfs] [44882.023611] [] ? zfs_open+0x9f/0x140 [zfs] [44882.023689] [] ? zpl_open+0x71/0x90 [zfs] [44882.023766] [] ? zpl_release+0x70/0x70 [zfs] [44882.023779] [] vfs_writev+0x48/0x60 [44882.023801] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [44882.023812] [] ? dentry_open+0x3b/0x50 [44882.023833] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [44882.023856] [] nfsd_write+0xe7/0x100 [nfsd] [44882.023879] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [44882.023903] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [44882.023923] [] nfsd_dispatch+0xfe/0x240 [nfsd] [44882.023956] [] svc_process_common+0x344/0x640 [sunrpc] [44882.023970] [] ? try_to_wake_up+0x2b0/0x2b0 [44882.023989] [] ? nfsd_svc+0x120/0x120 [nfsd] [44882.024021] [] svc_process+0x110/0x160 [sunrpc] [44882.024040] [] nfsd+0xc5/0x170 [nfsd] [44882.024051] [] kthread+0x96/0xa0 [44882.024062] [] kernel_thread_helper+0x4/0x10 [44882.024072] [] ? kthread_worker_fn+0x190/0x190 [44882.024083] [] ? gs_change+0x13/0x13 [44882.024091] INFO: task nfsd:1888 blocked for more than 120 seconds. [44882.024207] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [44882.024349] nfsd D ffff8801d9a04890 0 1888 2 0x00000000 [44882.024361] ffff8801a7179780 0000000000000046 ffff8801a7179700 ffffffffa04d6366 [44882.024373] 0000000000012a80 ffff8801a7179fd8 ffff8801a7178010 0000000000012a80 [44882.024384] ffff8801a7179fd8 0000000000012a80 ffff8801cbb544d0 ffff8801d9a044d0 [44882.024395] Call Trace: [44882.024417] [] ? kmem_free_debug+0x16/0x20 [spl] [44882.024428] [] ? prepare_to_wait_exclusive+0x60/0x90 [44882.024452] [] cv_wait_common+0x78/0xe0 [spl] [44882.024462] [] ? wake_up_bit+0x40/0x40 [44882.024484] [] cv_wait+0x13/0x20 [spl] [44882.024565] [] txg_wait_open+0x7b/0xa0 [zfs] [44882.024629] [] dmu_tx_wait+0xed/0xf0 [zfs] [44882.024709] [] zfs_write+0x3b6/0xc90 [zfs] [44882.024774] [] ? dnode_rele+0x54/0x90 [zfs] [44882.024786] [] ? _raw_spin_lock+0xe/0x20 [44882.024798] [] ? iput+0x2c/0x50 [44882.024810] [] ? find_acceptable_alias+0x2a/0x130 [44882.024889] [] zpl_write_common+0x52/0x70 [zfs] [44882.024968] [] zpl_write+0x68/0xa0 [zfs] [44882.025046] [] ? zpl_write_common+0x70/0x70 [zfs] [44882.025058] [] do_loop_readv_writev+0x59/0x90 [44882.025070] [] do_readv_writev+0x1ce/0x1e0 [44882.025147] [] ? zpl_open+0x71/0x90 [zfs] [44882.025224] [] ? zpl_release+0x70/0x70 [zfs] [44882.025237] [] vfs_writev+0x48/0x60 [44882.025258] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [44882.025269] [] ? dentry_open+0x3b/0x50 [44882.025290] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [44882.025313] [] nfsd_write+0xe7/0x100 [nfsd] [44882.025336] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [44882.025359] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [44882.025380] [] nfsd_dispatch+0xfe/0x240 [nfsd] [44882.025413] [] svc_process_common+0x344/0x640 [sunrpc] [44882.025428] [] ? try_to_wake_up+0x2b0/0x2b0 [44882.025446] [] ? nfsd_svc+0x120/0x120 [nfsd] [44882.025478] [] svc_process+0x110/0x160 [sunrpc] [44882.025498] [] nfsd+0xc5/0x170 [nfsd] [44882.025508] [] kthread+0x96/0xa0 [44882.025519] [] kernel_thread_helper+0x4/0x10 [44882.025529] [] ? kthread_worker_fn+0x190/0x190 [44882.025539] [] ? gs_change+0x13/0x13 [44882.025548] INFO: task nfsd:1890 blocked for more than 120 seconds. [44882.025665] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [44882.025807] nfsd D ffff8801d9a01ab0 0 1890 2 0x00000000 [44882.025819] ffff8801a8ef9780 0000000000000046 ffff8801a8ef9700 ffffffffa04d6366 [44882.025830] 0000000000012a80 ffff8801a8ef9fd8 ffff8801a8ef8010 0000000000012a80 [44882.025841] ffff8801a8ef9fd8 0000000000012a80 ffff8801d9a044d0 ffff8801d9a016f0 [44882.025853] Call Trace: [44882.025874] [] ? kmem_free_debug+0x16/0x20 [spl] [44882.025885] [] ? prepare_to_wait_exclusive+0x60/0x90 [44882.025909] [] cv_wait_common+0x78/0xe0 [spl] [44882.025919] [] ? wake_up_bit+0x40/0x40 [44882.025942] [] cv_wait+0x13/0x20 [spl] [44882.026021] [] txg_wait_open+0x7b/0xa0 [zfs] [44882.026085] [] dmu_tx_wait+0xed/0xf0 [zfs] [44882.026166] [] zfs_write+0x3b6/0xc90 [zfs] [44882.026230] [] ? dnode_rele+0x54/0x90 [zfs] [44882.026241] [] ? _raw_spin_lock+0xe/0x20 [44882.026252] [] ? iput+0x2c/0x50 [44882.026264] [] ? find_acceptable_alias+0x2a/0x130 [44882.026344] [] zpl_write_common+0x52/0x70 [zfs] [44882.026423] [] zpl_write+0x68/0xa0 [zfs] [44882.026501] [] ? zpl_write_common+0x70/0x70 [zfs] [44882.026513] [] do_loop_readv_writev+0x59/0x90 [44882.026524] [] do_readv_writev+0x1ce/0x1e0 [44882.026602] [] ? zpl_open+0x71/0x90 [zfs] [44882.026679] [] ? zpl_release+0x70/0x70 [zfs] [44882.026692] [] vfs_writev+0x48/0x60 [44882.026713] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [44882.026725] [] ? dentry_open+0x3b/0x50 [44882.026746] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [44882.026768] [] nfsd_write+0xe7/0x100 [nfsd] [44882.026791] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [44882.026814] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [44882.026835] [] nfsd_dispatch+0xfe/0x240 [nfsd] [44882.026868] [] svc_process_common+0x344/0x640 [sunrpc] [44882.026882] [] ? try_to_wake_up+0x2b0/0x2b0 [44882.026901] [] ? nfsd_svc+0x120/0x120 [nfsd] [44882.026933] [] svc_process+0x110/0x160 [sunrpc] [44882.026952] [] nfsd+0xc5/0x170 [nfsd] [44882.026962] [] kthread+0x96/0xa0 [44882.026973] [] kernel_thread_helper+0x4/0x10 [44882.026984] [] ? kthread_worker_fn+0x190/0x190 [44882.026994] [] ? gs_change+0x13/0x13 [44906.010002] INFO: rcu_sched_state detected stall on CPU 0 (t=24030 jiffies) [45002.020104] INFO: task txg_sync:1685 blocked for more than 120 seconds. [45002.020248] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [45002.020392] txg_sync D ffff8801cbb44890 0 1685 2 0x00000000 [45002.020407] ffff8801c13a1b10 0000000000000046 ffff8801c13a1b6c ffff8801e7b12a80 [45002.020421] 0000000000012a80 ffff8801c13a1fd8 ffff8801c13a0010 0000000000012a80 [45002.020433] ffff8801c13a1fd8 0000000000012a80 ffff8801cbb544d0 ffff8801cbb444d0 [45002.020445] Call Trace: [45002.020467] [] mutex_lock_slowpath+0xdf/0x160 [45002.020480] [] mutex_lock+0x23/0x40 [45002.020515] [] cv_wait_common+0x80/0xe0 [spl] [45002.020527] [] ? wake_up_bit+0x40/0x40 [45002.020552] [] cv_wait+0x13/0x20 [spl] [45002.020658] [] zio_wait+0xeb/0x160 [zfs] [45002.020739] [] vdev_uberblock_sync_list+0x89/0x100 [zfs] [45002.020819] [] vdev_config_sync+0x13a/0x160 [zfs] [45002.020898] [] spa_sync+0x726/0x9a0 [zfs] [45002.020909] [] ? autoremove_wake_function+0x16/0x40 [45002.020921] [] ? wake_up+0x53/0x70 [45002.020999] [] txg_sync_thread+0x225/0x3b0 [zfs] [45002.021012] [] ? kfree+0x35/0x130 [45002.021090] [] ? txg_thread_exit+0x40/0x40 [zfs] [45002.021112] [] ? thread_create+0x160/0x160 [spl] [45002.021134] [] thread_generic_wrapper+0x78/0x90 [spl] [45002.021144] [] kthread+0x96/0xa0 [45002.021157] [] kernel_thread_helper+0x4/0x10 [45002.021167] [] ? kthread_worker_fn+0x190/0x190 [45002.021177] [] ? gs_change+0x13/0x13 [45002.021189] INFO: task nfsd:1885 blocked for more than 120 seconds. [45002.021305] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [45002.021448] nfsd D ffff8801cbb1df80 0 1885 2 0x00000000 [45002.021461] ffff8801a75d9780 0000000000000046 ffff8801a75d9700 ffff8801c139fd90 [45002.021473] 0000000000012a80 ffff8801a75d9fd8 ffff8801a75d8010 0000000000012a80 [45002.021484] ffff8801a75d9fd8 0000000000012a80 ffff8801d9a016f0 ffff8801cbb1dbc0 [45002.021496] Call Trace: [45002.021506] [] ? prepare_to_wait_exclusive+0x60/0x90 [45002.021531] [] cv_wait_common+0x78/0xe0 [spl] [45002.021541] [] ? wake_up_bit+0x40/0x40 [45002.021563] [] cv_wait+0x13/0x20 [spl] [45002.021645] [] txg_wait_open+0x7b/0xa0 [zfs] [45002.021709] [] dmu_tx_wait+0xed/0xf0 [zfs] [45002.021790] [] zfs_write+0x3b6/0xc90 [zfs] [45002.021807] [] ? find_acceptable_alias+0x2a/0x130 [45002.021887] [] zpl_write_common+0x52/0x70 [zfs] [45002.021965] [] zpl_write+0x68/0xa0 [zfs] [45002.022044] [] ? zpl_write_common+0x70/0x70 [zfs] [45002.022057] [] do_loop_readv_writev+0x59/0x90 [45002.022069] [] do_readv_writev+0x1ce/0x1e0 [45002.022147] [] ? zpl_open+0x71/0x90 [zfs] [45002.022224] [] ? zpl_release+0x70/0x70 [zfs] [45002.022237] [] vfs_writev+0x48/0x60 [45002.022264] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [45002.022275] [] ? dentry_open+0x3b/0x50 [45002.022297] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [45002.022322] [] nfsd_write+0xe7/0x100 [nfsd] [45002.022345] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [45002.022370] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [45002.022391] [] nfsd_dispatch+0xfe/0x240 [nfsd] [45002.022433] [] svc_process_common+0x344/0x640 [sunrpc] [45002.022448] [] ? try_to_wake_up+0x2b0/0x2b0 [45002.022467] [] ? nfsd_svc+0x120/0x120 [nfsd] [45002.022499] [] svc_process+0x110/0x160 [sunrpc] [45002.022518] [] nfsd+0xc5/0x170 [nfsd] [45002.022529] [] kthread+0x96/0xa0 [45002.022540] [] kernel_thread_helper+0x4/0x10 [45002.022551] [] ? kthread_worker_fn+0x190/0x190 [45002.022561] [] ? gs_change+0x13/0x13 [45002.022570] INFO: task nfsd:1886 blocked for more than 120 seconds. [45002.022686] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [45002.022829] nfsd D ffff8801d9889ab0 0 1886 2 0x00000000 [45002.022841] ffff8801a3505780 0000000000000046 ffff8801a3505700 ffffffff00000000 [45002.022852] 0000000000012a80 ffff8801a3505fd8 ffff8801a3504010 0000000000012a80 [45002.022864] ffff8801a3505fd8 0000000000012a80 ffff8801dd720000 ffff8801d98896f0 [45002.022875] Call Trace: [45002.022900] [] cv_wait_common+0x78/0xe0 [spl] [45002.022910] [] ? wake_up_bit+0x40/0x40 [45002.022933] [] cv_wait+0x13/0x20 [spl] [45002.023013] [] txg_wait_open+0x7b/0xa0 [zfs] [45002.023077] [] dmu_tx_wait+0xed/0xf0 [zfs] [45002.023157] [] zfs_write+0x3b6/0xc90 [zfs] [45002.023222] [] ? dnode_rele+0x54/0x90 [zfs] [45002.023234] [] ? _raw_spin_lock+0xe/0x20 [45002.023245] [] ? iput+0x2c/0x50 [45002.023258] [] ? find_acceptable_alias+0x2a/0x130 [45002.023337] [] zpl_write_common+0x52/0x70 [zfs] [45002.023415] [] zpl_write+0x68/0xa0 [zfs] [45002.023427] [] ? __kmalloc+0x39/0x150 [45002.023504] [] ? zpl_write_common+0x70/0x70 [zfs] [45002.023516] [] do_loop_readv_writev+0x59/0x90 [45002.023528] [] do_readv_writev+0x1ce/0x1e0 [45002.023603] [] ? rrw_exit+0x3e/0x140 [zfs] [45002.023682] [] ? zfs_open+0x9f/0x140 [zfs] [45002.023760] [] ? zpl_open+0x71/0x90 [zfs] [45002.023838] [] ? zpl_release+0x70/0x70 [zfs] [45002.023850] [] vfs_writev+0x48/0x60 [45002.023872] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [45002.023883] [] ? dentry_open+0x3b/0x50 [45002.023904] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [45002.023926] [] nfsd_write+0xe7/0x100 [nfsd] [45002.023949] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [45002.023973] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [45002.023993] [] nfsd_dispatch+0xfe/0x240 [nfsd] [45002.024027] [] svc_process_common+0x344/0x640 [sunrpc] [45002.024041] [] ? try_to_wake_up+0x2b0/0x2b0 [45002.024060] [] ? nfsd_svc+0x120/0x120 [nfsd] [45002.024092] [] svc_process+0x110/0x160 [sunrpc] [45002.024111] [] nfsd+0xc5/0x170 [nfsd] [45002.024122] [] kthread+0x96/0xa0 [45002.024133] [] kernel_thread_helper+0x4/0x10 [45002.024143] [] ? kthread_worker_fn+0x190/0x190 [45002.024154] [] ? gs_change+0x13/0x13 [45002.024162] INFO: task nfsd:1887 blocked for more than 120 seconds. [45002.024278] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [45002.024420] nfsd D ffff8801d988b1a0 0 1887 2 0x00000000 [45002.024432] ffff880129edd780 0000000000000046 ffff880129edd700 ffff880100000000 [45002.024443] 0000000000012a80 ffff880129eddfd8 ffff880129edc010 0000000000012a80 [45002.024454] ffff880129eddfd8 0000000000012a80 ffff8801dd720000 ffff8801d988ade0 [45002.024466] Call Trace: [45002.024490] [] cv_wait_common+0x78/0xe0 [spl] [45002.024500] [] ? wake_up_bit+0x40/0x40 [45002.024523] [] cv_wait+0x13/0x20 [spl] [45002.024603] [] txg_wait_open+0x7b/0xa0 [zfs] [45002.024667] [] dmu_tx_wait+0xed/0xf0 [zfs] [45002.024747] [] zfs_write+0x3b6/0xc90 [zfs] [45002.024762] [] ? find_acceptable_alias+0x2a/0x130 [45002.024840] [] zpl_write_common+0x52/0x70 [zfs] [45002.024919] [] zpl_write+0x68/0xa0 [zfs] [45002.024929] [] ? kmalloc+0x39/0x150 [45002.025007] [] ? zpl_write_common+0x70/0x70 [zfs] [45002.025019] [] do_loop_readv_writev+0x59/0x90 [45002.025030] [] do_readv_writev+0x1ce/0x1e0 [45002.025105] [] ? rrw_exit+0x3e/0x140 [zfs] [45002.025184] [] ? zfs_open+0x9f/0x140 [zfs] [45002.025262] [] ? zpl_open+0x71/0x90 [zfs] [45002.025339] [] ? zpl_release+0x70/0x70 [zfs] [45002.025352] [] vfs_writev+0x48/0x60 [45002.025373] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [45002.025385] [] ? dentry_open+0x3b/0x50 [45002.025406] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [45002.025428] [] nfsd_write+0xe7/0x100 [nfsd] [45002.025451] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [45002.025475] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [45002.025495] [] nfsd_dispatch+0xfe/0x240 [nfsd] [45002.025528] [] svc_process_common+0x344/0x640 [sunrpc] [45002.025542] [] ? try_to_wake_up+0x2b0/0x2b0 [45002.025560] [] ? nfsd_svc+0x120/0x120 [nfsd] [45002.025592] [] svc_process+0x110/0x160 [sunrpc] [45002.025612] [] nfsd+0xc5/0x170 [nfsd] [45002.025622] [] kthread+0x96/0xa0 [45002.025633] [] kernel_thread_helper+0x4/0x10 [45002.025643] [] ? kthread_worker_fn+0x190/0x190 [45002.025654] [] ? gs_change+0x13/0x13 [45002.025662] INFO: task nfsd:1888 blocked for more than 120 seconds. [45002.025778] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [45002.025920] nfsd D ffff8801d9a04890 0 1888 2 0x00000000 [45002.025932] ffff8801a7179780 0000000000000046 ffff8801a7179700 ffffffffa04d6366 [45002.025943] 0000000000012a80 ffff8801a7179fd8 ffff8801a7178010 0000000000012a80 [45002.025954] ffff8801a7179fd8 0000000000012a80 ffff8801cbb544d0 ffff8801d9a044d0 [45002.025965] Call Trace: [45002.025986] [] ? kmem_free_debug+0x16/0x20 [spl] [45002.025998] [] ? prepare_to_wait_exclusive+0x60/0x90 [45002.026021] [] cv_wait_common+0x78/0xe0 [spl] [45002.026031] [] ? wake_up_bit+0x40/0x40 [45002.026054] [] cv_wait+0x13/0x20 [spl] [45002.026134] [] txg_wait_open+0x7b/0xa0 [zfs] [45002.026198] [] dmu_tx_wait+0xed/0xf0 [zfs] [45002.026279] [] zfs_write+0x3b6/0xc90 [zfs] [45002.026343] [] ? dnode_rele+0x54/0x90 [zfs] [45002.026355] [] ? _raw_spin_lock+0xe/0x20 [45002.026365] [] ? iput+0x2c/0x50 [45002.026378] [] ? find_acceptable_alias+0x2a/0x130 [45002.026457] [] zpl_write_common+0x52/0x70 [zfs] [45002.026536] [] zpl_write+0x68/0xa0 [zfs] [45002.026614] [] ? zpl_write_common+0x70/0x70 [zfs] [45002.026626] [] do_loop_readv_writev+0x59/0x90 [45002.026638] [] do_readv_writev+0x1ce/0x1e0 [45002.026716] [] ? zpl_open+0x71/0x90 [zfs] [45002.026793] [] ? zpl_release+0x70/0x70 [zfs] [45002.026805] [] vfs_writev+0x48/0x60 [45002.026827] [] nfsd_vfs_write+0x100/0x3b0 [nfsd] [45002.026838] [] ? dentry_open+0x3b/0x50 [45002.026859] [] ? nfsd_open+0x10e/0x1a0 [nfsd] [45002.026881] [] nfsd_write+0xe7/0x100 [nfsd] [45002.026904] [] ? nfsd_cache_lookup+0x34c/0x410 [nfsd] [45002.026928] [] nfsd3_proc_write+0xaf/0x140 [nfsd] [45002.026948] [] nfsd_dispatch+0xfe/0x240 [nfsd] [45002.026981] [] svc_process_common+0x344/0x640 [sunrpc] [45002.026995] [] ? try_to_wake_up+0x2b0/0x2b0 [45002.027014] [] ? nfsd_svc+0x120/0x120 [nfsd] [45002.027046] [] svc_process+0x110/0x160 [sunrpc] [45002.027065] [] nfsd+0xc5/0x170 [nfsd] [45002.027075] [] kthread+0x96/0xa0 [45002.027086] [] kernel_thread_helper+0x4/0x10 [45002.027097] [] ? kthread_worker_fn+0x190/0x190 [45002.027107] [] ? gs_change+0x13/0x13 [45086.310002] INFO: rcu_sched_state detected stall on CPU 0 (t=42060 jiffies) [45266.610001] INFO: rcu_sched_state detected stall on CPU 0 (t=60090 jiffies) [45446.910002] INFO: rcu_sched_state detected stall on CPU 0 (t=78120 jiffies) [45627.210002] INFO: rcu_sched_state detected stall on CPU 0 (t=96150 jiffies) [45807.510002] INFO: rcu_sched_state detected stall on CPU 0 (t=114180 jiffies) [45987.810001] INFO: rcu_sched_state detected stall on CPU 0 (t=132210 jiffies) [46168.110001] INFO: rcu_sched_state detected stall on CPU 0 (t=150240 jiffies) [46348.410001] INFO: rcu_sched_state detected stall on CPU 0 (t=168270 jiffies) [46528.710001] INFO: rcu_sched_state detected stall on CPU 0 (t=186300 jiffies) [46709.010001] INFO: rcu_sched_state detected stall on CPU 0 (t=204330 jiffies) [46889.310002] INFO: rcu_sched_state detected stall on CPU 0 (t=222360 jiffies) [47069.610001] INFO: rcu_sched_state detected stall on CPU 0 (t=240390 jiffies) [47249.910002] INFO: rcu_sched_state detected stall on CPU 0 (t=258420 jiffies) [47430.210002] INFO: rcu_sched_state detected stall on CPU 0 (t=276450 jiffies) [47610.510002] INFO: rcu_sched_state detected stall on CPU 0 (t=294480 jiffies) [47790.810002] INFO: rcu_sched_state detected stall on CPU 0 (t=312510 jiffies) [47971.110002] INFO: rcu_sched_state detected stall on CPU 0 (t=330540 jiffies) [48151.410002] INFO: rcu_sched_state detected stall on CPU 0 (t=348570 jiffies) [48331.710001] INFO: rcu_sched_state detected stall on CPU 0 (t=366600 jiffies) [48512.010001] INFO: rcu_sched_state detected stall on CPU 0 (t=384630 jiffies) [48692.310002] INFO: rcu_sched_state detected stall on CPU 0 (t=402660 jiffies) [48872.610002] INFO: rcu_sched_state detected stall on CPU 0 (t=420690 jiffies) [49052.910002] INFO: rcu_sched_state detected stall on CPU 0 (t=438720 jiffies) [49233.210001] INFO: rcu_sched_state detected stall on CPU 0 (t=456750 jiffies) [49413.510001] INFO: rcu_sched_state detected stall on CPU 0 (t=474780 jiffies) [49593.810002] INFO: rcu_sched_state detected stall on CPU 0 (t=492810 jiffies) [49774.110002] INFO: rcu_sched_state detected stall on CPU 0 (t=510840 jiffies) [49954.410002] INFO: rcu_sched_state detected stall on CPU 0 (t=528870 jiffies) [50134.710001] INFO: rcu_sched_state detected stall on CPU 0 (t=546900 jiffies) [50315.010001] INFO: rcu_sched_state detected stall on CPU 0 (t=564930 jiffies) [50495.310002] INFO: rcu_sched_state detected stall on CPU 0 (t=582960 jiffies) [50675.610001] INFO: rcu_sched_state detected stall on CPU 0 (t=600990 jiffies) [50855.910001] INFO: rcu_sched_state detected stall on CPU 0 (t=619020 jiffies) [51036.210002] INFO: rcu_sched_state detected stall on CPU 0 (t=637050 jiffies) [51216.510001] INFO: rcu_sched_state detected stall on CPU 0 (t=655080 jiffies) [51396.810001] INFO: rcu_sched_state detected stall on CPU 0 (t=673110 jiffies) [51577.110001] INFO: rcu_sched_state detected stall on CPU 0 (t=691140 jiffies) [51757.410002] INFO: rcu_sched_state detected stall on CPU 0 (t=709170 jiffies) [51937.710001] INFO: rcu_sched_state detected stall on CPU 0 (t=727200 jiffies) [52118.010002] INFO: rcu_sched_state detected stall on CPU 0 (t=745230 jiffies) [52298.310001] INFO: rcu_sched_state detected stall on CPU 0 (t=763260 jiffies) [52478.610002] INFO: rcu_sched_state detected stall on CPU 0 (t=781290 jiffies) [52658.910002] INFO: rcu_sched_state detected stall on CPU 0 (t=799320 jiffies) [52839.210002] INFO: rcu_sched_state detected stall on CPU 0 (t=817350 jiffies) [53019.510002] INFO: rcu_sched_state detected stall on CPU 0 (t=835380 jiffies) [53199.810001] INFO: rcu_sched_state detected stall on CPU 0 (t=853410 jiffies) [53380.110002] INFO: rcu_sched_state detected stall on CPU 0 (t=871440 jiffies) [53560.410001] INFO: rcu_sched_state detected stall on CPU 0 (t=889470 jiffies) [53740.710002] INFO: rcu_sched_state detected stall on CPU 0 (t=907500 jiffies) [53921.010002] INFO: rcu_schedstate detected stall on CPU 0 (t=925530 jiffies) [54101.310001] INFO: rcu

top - 06:48:16 up 16:53, 2 users, load average: 28.00, 28.01, 27.98 Tasks: 255 total, 19 running, 236 sleeping, 0 stopped, 0 zombie Cpu(s): 0.2%us, 50.0%sy, 0.0%ni, 0.0%id, 49.8%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 6379384k total, 4383480k used, 1995904k free, 79964k buffers Swap: 376828k total, 0k used, 376828k free, 106292k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3068 root 20 0 19404 1432 948 R 1 0.0 0:00.08 top
3 root 20 0 0 0 0 R 0 0.0 2:13.76 ksoftirqd/0
4 root 20 0 0 0 0 R 0 0.0 26:15.00 kworker/0:0
895 root 20 0 0 0 0 D 0 0.0 0:00.03 jbd2/dm-2-8
1529 root 39 19 0 0 0 R 0 0.0 0:44.77 z_null_int/0
1542 root 39 19 0 0 0 R 0 0.0 0:29.95 z_wr_iss_h/0
1544 root 39 19 0 0 0 R 0 0.0 0:30.26 z_wr_iss_h/2
1546 root 39 19 0 0 0 R 0 0.0 0:30.43 z_wr_iss_h/4
1547 root 39 19 0 0 0 R 0 0.0 4:27.04 z_wr_int/0
1549 root 39 19 0 0 0 R 0 0.0 4:26.92 z_wr_int/2
1551 root 39 19 0 0 0 R 0 0.0 4:32.37 z_wr_int/4
1553 root 39 19 0 0 0 R 0 0.0 4:41.01 z_wr_int/6
1555 root 39 19 0 0 0 R 0 0.0 4:20.73 z_wr_int/8
1557 root 39 19 0 0 0 R 0 0.0 4:28.06 z_wr_int/10
1559 root 39 19 0 0 0 R 0 0.0 4:27.25 z_wr_int/12
1561 root 39 19 0 0 0 R 0 0.0 4:32.24 z_wr_int/14
1563 root 39 19 0 0 0 R 0 0.0 0:21.83 z_wr_int_h/0
1565 root 39 19 0 0 0 R 0 0.0 0:21.80 z_wr_int_h/2
1567 root 39 19 0 0 0 R 0 0.0 0:22.45 z_wr_int_h/4
1684 root 0 -20 0 0 0 R 0 0.0 0:26.25 txg_quiesce
1685 root 0 -20 0 0 0 D 0 0.0 10:00.99 txg_sync
1884 root 20 0 0 0 0 D 0 0.0 12:12.77 nfsd
1885 root 20 0 0 0 0 D 0 0.0 12:13.04 nfsd
1886 root 20 0 0 0 0 D 0 0.0 12:09.94 nfsd
1887 root 20 0 0 0 0 D 0 0.0 12:24.22 nfsd
1888 root 20 0 0 0 0 D 0 0.0 13:25.27 nfsd
1889 root 20 0 0 0 0 D 0 0.0 12:47.74 nfsd
1890 root 20 0 0 0 0 D 0 0.0 12:25.28 nfsd

Linux nfs 3.0.0-8-server #11~lucid1-Ubuntu SMP Wed Aug 17 10:27:02 UTC 2011 x86_64 GNU/Linux

root@nfs:/store/memotech/data# free -m total used free shared buffers cached Mem: 6229 4281 1948 0 78 103 -/+ buffers/cache: 4099 2130 Swap: 367 0 367

root@nfs:/store/memotech/data# free -m total used free shared buffers cached Mem: 6229 4281 1948 0 78 103 -/+ buffers/cache: 4099 2130 Swap: 367 0 367

data# zpool iostat -v 3 capacity operations bandwidth pool alloc free read write read write


store 3.81T 1.63T 60 338 4.70M 4.31M raidz1 3.81T 1.63T 60 338 4.70M 4.31M sda - - 30 98 1.57M 2.18M sdb - - 30 109 1.57M 2.18M sdd - - 30 98 1.56M 2.18M cache - - - - - - sdc3 31.6G 16.8G 8 5 120K 545K


           capacity     operations    bandwidth

pool alloc free read write read write


store 3.81T 1.63T 0 0 0 0 raidz1 3.81T 1.63T 0 0 0 0 sda - - 0 0 0 0 sdb - - 0 0 0 0 sdd - - 0 0 0 0 cache - - - - - - sdc3 31.6G 16.8G 0 0 0 0


behlendorf commented 13 years ago

Thanks for the bug report, we'll look in to it.

zfsfabien commented 13 years ago

Here is an update: If the rsync speed is limited to 3MB/s the bug is not triggered after 7 days of rsync over NFS and about 1TB written and 1.5TB read of mostly small files < 5MB.

I guess the low power of the cpu and using a large L2ARC ssd help to trigger that "feature" when rsync speed is not limited.

zfsfabien commented 13 years ago

I was able to reproduce the crash without NFS:

[86910.260005] INFO: rcu_sched_state detected stall on CPU 0 (t=6000 jiffies) [139364.538026] BUG: unable to handle kernel NULL pointer dereference at (null) [139364.538129] IP: [] balance_pgdat+0x6f4/0x8f0 [139364.538200] PGD 0 [139364.538226] Oops: 0002 [#1] SMP [139364.538266] CPU 0 [139364.538289] Modules linked in: zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc snd_hda_codec_realtek radeon snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm ttm drm_kms_helper sp5100_tco snd_timer drm snd soundcore snd_page_alloc k10temp psmouse i2c_piix4 i2c_algo_bit shpchp serio_raw lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov hid_apple usbhid hid ahci mptsas mptscsih mptbase scsi_transport_sas pata_atiixp r8169 libahci raid6_pq async_tx raid1 raid0 multipath linear [139364.538963] [139364.538985] Pid: 24, comm: kswapd0 Tainted: P 3.0.0-8-server #11~lucid1-Ubuntu MSI MS-7698/E350IS-E45 (MS-7698) [139364.539097] RIP: 0010:[] [] balance_pgdat+0x6f4/0x8f0 [139364.539183] RSP: 0018:ffff8801da60dd60 EFLAGS: 00010286 [139364.539234] RAX: 00000000000001c8 RBX: 0000000000000000 RCX: 00000000000252fb [139364.539302] RDX: 0000000000000000 RSI: 00000000000001c8 RDI: ffff8801da60de30 [139364.539370] RBP: ffff8801da60de80 R08: 0000000000000001 R09: 0000000000000000 [139364.539438] R10: ffff8801e7df95a8 R11: ffff8801da5e16f0 R12: ffff8801e7df9000 [139364.540009] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000002 [139364.540009] FS: 00007fd15b322720(0000) GS:ffff8801e7a00000(0000) knlGS:0000000000000000 [139364.540009] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [139364.550206] CR2: 0000000000000000 CR3: 00000001ad27a000 CR4: 00000000000006f0 [139364.550206] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [139364.550206] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [139364.550206] Process kswapd0 (pid: 24, threadinfo ffff8801da60c000, task ffff8801da5e16f0) [139364.550206] Stack: [139364.550206] ffff8801da60de00 ffff8801da60dde0 0000000000000000 0000000000000000 [139364.550206] 0000000000000000 ffff8801da60df00 ffff8801da5e16f0 0000000000000000 [139364.550206] ffff8801da60deac 000000000009b71a 0000000000000000 000000008112a9e8 [139364.550206] Call Trace: [139364.550206] [] kswapd+0x1c6/0x210 [139364.550206] [] ? mem_cgroup_shrink_node_zone+0x120/0x120 [139364.550206] [] kthread+0x96/0xa0 [139364.550206] [] kernel_thread_helper+0x4/0x10 [139364.550206] [] ? kthread_worker_fn+0x190/0x190 [139364.550206] [] ? gs_change+0x13/0x13 [139364.550206] Code: 1f 84 00 00 00 00 00 48 8b b5 58 ff ff ff 8b bd 54 ff ff ff 48 8d 95 60 ff ff ff e8 97 ec ff ff 48 8b 95 18 ff ff ff 48 8d 7d b0 [139364.550206] c7 02 00 00 00 00 48 8b 95 28 ff ff ff 48 8b b5 60 ff ff ff [139364.550206] RIP [] balance_pgdat+0x6f4/0x8f0 [139364.550206] RSP [139364.550206] CR2: 0000000000000000 [139364.566771] ---[ end trace 3e9719d44e7b4dcc ]---

behlendorf commented 13 years ago

This is likely a duplicate of issue #287

zfsfabien commented 12 years ago

I have tested on better hardware: 6 CPU and 16GB RAM, no L2ARC. 1 mirror pool, 1 raidz pool. ZFS speed is good, the GBps network is the bottleneck. The uptime is very bad, with 2 or 3 nfs feed to 2 ZFS pools it crashes in less than 1 hour:

3.0.0-12-server #20-Ubuntu SMP Fri Oct 7 16:36:30 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux spl-dkms 0.6.0.34 [ 3099.032006] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies) [ 3240.964066] INFO: task fsnotify_mark:68 blocked for more than 120 seconds. [ 3240.964075] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3240.964081] fsnotify_mark D ffffffff81805120 0 68 2 0x00000000 [ 3240.964094] ffff8804124a7c70 0000000000000046 0000000000000000 0000000000000000 [ 3240.964104] ffff8804124a7fd8 ffff8804124a7fd8 ffff8804124a7fd8 0000000000012a40 [ 3240.964114] ffff880417182e40 ffff88041254ae40 0000000000000000 7fffffffffffffff [ 3240.964123] Call Trace: [ 3240.964138] [] schedule_timeout+0x2a5/0x320 [ 3240.964148] [] wait_for_common+0xdf/0x180 [ 3240.964158] [] ? try_to_wake_up+0x200/0x200 [ 3240.964168] [] ? call_rcu_sched+0x20/0x20 [ 3240.964176] [] wait_for_completion+0x1d/0x20 [ 3240.964185] [] synchronize_sched+0x5b/0x60 [ 3240.964193] [] ? find_ge_pid+0x50/0x50 [ 3240.964200] [] synchronize_srcu+0x63/0xc0 [ 3240.964208] [] synchronize_srcu+0x15/0x20 [ 3240.964217] [] fsnotify_mark_destroy+0x95/0x170 [ 3240.964225] [] ? add_wait_queue+0x60/0x60 [ 3240.964233] [] ? fsnotify_put_mark+0x30/0x30 [ 3240.964241] [] kthread+0x8c/0xa0 [ 3240.964250] [] kernel_thread_helper+0x4/0x10 [ 3240.964259] [] ? flush_kthread_worker+0xa0/0xa0 [ 3240.964267] [] ? gs_change+0x13/0x13 [ 3240.964318] INFO: task txg_sync:1739 blocked for more than 120 seconds. [ 3240.964323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3240.964328] txg_sync D 0000000000000003 0 1739 2 0x00000000 [ 3240.964338] ffff8803f1a09ad0 0000000000000046 ffff88042fcd2a40 0000000000000001 [ 3240.964347] ffff8803f1a09fd8 ffff8803f1a09fd8 ffff8803f1a09fd8 0000000000012a40 [ 3240.964356] ffff8804125a9720 ffff8803f7854560 ffff880413a08000 ffffc90012e55228 [ 3240.964365] Call Trace: [ 3240.964373] [] mutex_lock_slowpath+0xd7/0x150 [ 3240.964381] [] ? schedule+0x3d4/0x770 [ 3240.964388] [] mutex_lock+0x22/0x40 [ 3240.964419] [] cv_wait_common+0x7f/0xd0 [spl] [ 3240.964427] [] ? add_wait_queue+0x60/0x60 [ 3240.964448] [] cv_wait+0x13/0x20 [spl] [ 3240.964527] [] zio_wait+0xfb/0x170 [zfs] [ 3240.964576] [] ddt_sync+0x422/0x890 [zfs] [ 3240.964597] [] ? cv_wait_common+0xb7/0xd0 [spl] [ 3240.964660] [] ? zio_destroy+0xa6/0xe0 [zfs] [ 3240.964725] [] spa_sync+0x3e5/0xa00 [zfs] [ 3240.964732] [] ? default_wake_function+0x12/0x20 [ 3240.964740] [] ? autoremove_wake_function+0x16/0x40 [ 3240.964748] [] ? wake_up+0x53/0x70 [ 3240.964756] [] ? _raw_spin_lock+0xe/0x20 [ 3240.964822] [] txg_sync_thread+0x216/0x390 [zfs] [ 3240.964888] [] ? txg_init+0x260/0x260 [zfs] [ 3240.964953] [] ? txg_init+0x260/0x260 [zfs] [ 3240.964972] [] thread_generic_wrapper+0x78/0x90 [spl] [ 3240.964989] [] ? thread_create+0x160/0x160 [spl] [ 3240.964997] [] kthread+0x8c/0xa0 [ 3240.965006] [] kernel_thread_helper+0x4/0x10 [ 3240.965014] [] ? flush_kthread_worker+0xa0/0xa0 [ 3240.965022] [] ? gs_change+0x13/0x13 [ 3360.964067] INFO: task fsnotify_mark:68 blocked for more than 120 seconds. [ 3360.964076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3360.964083] fsnotify_mark D ffffffff81805120 0 68 2 0x00000000 [ 3360.964095] ffff8804124a7c70 0000000000000046 0000000000000000 0000000000000000 [ 3360.964105] ffff8804124a7fd8 ffff8804124a7fd8 ffff8804124a7fd8 0000000000012a40 [ 3360.964114] ffff880417182e40 ffff88041254ae40 0000000000000000 7fffffffffffffff [ 3360.964123] Call Trace: [ 3360.964138] [] schedule_timeout+0x2a5/0x320 [ 3360.964148] [] wait_for_common+0xdf/0x180 [ 3360.964158] [] ? try_to_wake_up+0x200/0x200 [ 3360.964168] [] ? call_rcu_sched+0x20/0x20 [ 3360.964176] [] wait_for_completion+0x1d/0x20 [ 3360.964185] [] synchronize_sched+0x5b/0x60 [ 3360.964192] [] ? find_ge_pid+0x50/0x50 [ 3360.964200] [] synchronize_srcu+0x63/0xc0 [ 3360.964208] [] synchronize_srcu+0x15/0x20 [ 3360.964217] [] fsnotify_mark_destroy+0x95/0x170 [ 3360.964226] [] ? add_wait_queue+0x60/0x60 [ 3360.964234] [] ? fsnotify_put_mark+0x30/0x30 [ 3360.964241] [] kthread+0x8c/0xa0 [ 3360.964251] [] kernel_thread_helper+0x4/0x10 [ 3360.964260] [] ? flush_kthread_worker+0xa0/0xa0 [ 3360.964268] [] ? gs_change+0x13/0x13 [ 3360.964318] INFO: task txg_sync:1739 blocked for more than 120 seconds. [ 3360.964323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3360.964328] txg_sync D 0000000000000003 0 1739 2 0x00000000 [ 3360.964337] ffff8803f1a09ad0 0000000000000046 ffff88042fcd2a40 0000000000000001 [ 3360.964347] ffff8803f1a09fd8 ffff8803f1a09fd8 ffff8803f1a09fd8 0000000000012a40 [ 3360.964356] ffff8804125a9720 ffff8803f7854560 ffff880413a08000 ffffc90012e55228 [ 3360.964364] Call Trace: [ 3360.964373] [] mutex_lock_slowpath+0xd7/0x150 [ 3360.964380] [] ? schedule+0x3d4/0x770 [ 3360.964387] [] mutex_lock+0x22/0x40 [ 3360.964419] [] cv_wait_common+0x7f/0xd0 [spl] [ 3360.964427] [] ? add_wait_queue+0x60/0x60 [ 3360.964447] [] cv_wait+0x13/0x20 [spl] [ 3360.964526] [] zio_wait+0xfb/0x170 [zfs] [ 3360.964575] [] ddt_sync+0x422/0x890 [zfs] [ 3360.964595] [] ? cv_wait_common+0xb7/0xd0 [spl] [ 3360.964658] [] ? zio_destroy+0xa6/0xe0 [zfs] [ 3360.964723] [] spa_sync+0x3e5/0xa00 [zfs] [ 3360.964730] [] ? default_wake_function+0x12/0x20 [ 3360.964738] [] ? autoremove_wake_function+0x16/0x40 [ 3360.964746] [] ? wake_up+0x53/0x70 [ 3360.964754] [] ? _raw_spin_lock+0xe/0x20 [ 3360.964820] [] txg_sync_thread+0x216/0x390 [zfs] [ 3360.964886] [] ? txg_init+0x260/0x260 [zfs] [ 3360.964951] [] ? txg_init+0x260/0x260 [zfs] [ 3360.964970] [] thread_generic_wrapper+0x78/0x90 [spl] [ 3360.964987] [] ? thread_create+0x160/0x160 [spl] [ 3360.964995] [] kthread+0x8c/0xa0 [ 3360.965004] [] kernel_thread_helper+0x4/0x10 [ 3360.965012] [] ? flush_kthread_worker+0xa0/0xa0 [ 3360.965020] [] ? gs_change+0x13/0x13 [ 3360.965065] INFO: task txg_sync:2445 blocked for more than 120 seconds. [ 3360.965070] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3360.965075] txg_sync D 0000000000000001 0 2445 2 0x00000000 [ 3360.965084] ffff8803eff95c80 0000000000000046 ffff8803eff95c20 ffffffff810329a9 [ 3360.965093] ffff8803eff95fd8 ffff8803eff95fd8 ffff8803eff95fd8 0000000000012a40 [ 3360.965102] ffff8803f0414560 ffff8803f0570000 ffff8803f06c8040 ffffc9001c0478b0 [ 3360.965110] Call Trace: [ 3360.965118] [] ? default_spin_lock_flags+0x9/0x10 [ 3360.965138] [] cv_wait_common+0x77/0xd0 [spl] [ 3360.965146] [] ? add_wait_queue+0x60/0x60 [ 3360.965165] [] cv_wait+0x13/0x20 [spl] [ 3360.965228] [] zio_wait+0xfb/0x170 [zfs] [ 3360.965291] [] spa_sync+0x252/0xa00 [zfs] [ 3360.965299] [] ? default_wake_function+0x12/0x20 [ 3360.965307] [] ? autoremove_wake_function+0x16/0x40 [ 3360.965316] [] ? wake_up_common+0x58/0x90 [ 3360.965323] [] ? wake_up+0x53/0x70 [ 3360.965389] [] txg_sync_thread+0x216/0x390 [zfs] [ 3360.965455] [] ? txg_init+0x260/0x260 [zfs] [ 3360.965520] [] ? txg_init+0x260/0x260 [zfs] [ 3360.965538] [] thread_generic_wrapper+0x78/0x90 [spl] [ 3360.965555] [] ? thread_create+0x160/0x160 [spl] [ 3360.965563] [] kthread+0x8c/0xa0 [ 3360.965572] [] kernel_thread_helper+0x4/0x10 [ 3360.965580] [] ? flush_kthread_worker+0xa0/0xa0 [ 3360.965588] [] ? gs_change+0x13/0x13 [ 3360.965597] INFO: task rsync:3013 blocked for more than 120 seconds. [ 3360.965602] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3360.965607] rsync D ffffffff81805120 0 3013 3012 0x00000000 [ 3360.965616] ffff8803f35afb98 0000000000000086 ffff8803f35afb38 ffffffff810329a9 [ 3360.965624] ffff8803f35affd8 ffff8803f35affd8 ffff8803f35affd8 0000000000012a40 [ 3360.965633] ffffffff81c0b020 ffff8804171a4560 ffff8803f35afbb8 ffff8803f8b03b30 [ 3360.965641] Call Trace: [ 3360.965648] [] ? default_spin_lock_flags+0x9/0x10 [ 3360.965668] [] cv_wait_common+0x77/0xd0 [spl] [ 3360.965676] [] ? add_wait_queue+0x60/0x60 [ 3360.965694] [] __cv_wait+0x13/0x20 [spl] [ 3360.965760] [] txg_wait_open+0x73/0xa0 [zfs] [ 3360.965813] [] dmu_tx_wait+0xed/0xf0 [zfs] [ 3360.965877] [] zfs_write+0x377/0xc50 [zfs] [ 3360.965942] [] zpl_write_common+0x52/0x80 [zfs] [ 3360.966004] [] zpl_write+0x68/0xa0 [zfs] [ 3360.966014] [] vfs_write+0xb3/0x180 [ 3360.966023] [] sys_write+0x4a/0x90 [ 3360.966031] [] system_call_fastpath+0x16/0x1b [ 3360.966038] INFO: task rsync:3112 blocked for more than 120 seconds. [ 3360.966042] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3360.966047] rsync D ffffffff81805120 0 3112 1 0x00000004 [ 3360.966056] ffff880125eb9b98 0000000000000082 ffff880125eb9b38 ffffffff810329a9 [ 3360.966064] ffff880125eb9fd8 ffff880125eb9fd8 ffff880125eb9fd8 0000000000012a40 [ 3360.966073] ffff880417128000 ffff8803f0414560 ffff880125eb9bb8 ffff8804127ab330 [ 3360.966081] Call Trace: [ 3360.966088] [] ? default_spin_lock_flags+0x9/0x10 [ 3360.966108] [] cv_wait_common+0x77/0xd0 [spl] [ 3360.966116] [] ? add_wait_queue+0x60/0x60 [ 3360.966135] [] __cv_wait+0x13/0x20 [spl] [ 3360.966200] [] txg_wait_open+0x73/0xa0 [zfs] [ 3360.966253] [] dmu_tx_wait+0xed/0xf0 [zfs] [ 3360.966317] [] zfs_write+0x377/0xc50 [zfs] [ 3360.966327] [] ? sock_update_classid+0x30/0x50 [ 3360.966336] [] ? core_sys_select+0x2db/0x330 [ 3360.966345] [] ? sock_aio_read+0x2d/0x40 [ 3360.966353] [] ? do_sync_read+0xd2/0x110 [ 3360.966415] [] zpl_write_common+0x52/0x80 [zfs] [ 3360.966478] [] zpl_write+0x68/0xa0 [zfs] [ 3360.966487] [] vfs_write+0xb3/0x180 [ 3360.966495] [] sys_write+0x4a/0x90 [ 3360.966503] [] system_call_fastpath+0x16/0x1b [ 3459.272006] INFO: rcu_sched_state detected stall on CPU 0 (t=105060 jiffies) [ 3639.392006] INFO: rcu_sched_state detected stall on CPU 0 (t=150090 jiffies)

Let me know how I can help.

gunnarbeutner commented 12 years ago

Does your system still work after this happens (except for the balance_pgdat panic of course)? If it does it would be interesting to see the output of "zpool events -v" once that zio_wait deadlock has occured.

behlendorf commented 12 years ago

Closing issue, this should be fixed by the following two commits for #287.

zfsonlinux/zfs@6a95d0b74c2951f0dc82361ea279f64a7349f060 zfsonlinux/spl@b8b6e4c453929596b630fa1cca1ee26a532a2ab4