Closed chjohnst closed 7 years ago
IMO we can close this issue because it was already discussed in the nfs-ganesha project: https://github.com/nfs-ganesha/nfs-ganesha/issues/148. The reporter has not replied since my last comment on March 23rd, so I am closing this issue as works-for-me.
I posted a ticket on the ganesha GitHub portal. I have a 6-node distributed replica (sharded) volume in a 3x2 configuration, and I have tried ganesha 2.3.3, 2.4.1 and 2.4.3 with little luck. Essentially, when my clients do heavily threaded reads, the client hangs and eventually the kernel logs stack traces for the processes stuck in D state. Has anyone seen traces like these before? My client OS is CentOS 7.3 with the latest kernels.
https://github.com/nfs-ganesha/nfs-ganesha/issues/148
Mar 1 12:55:55 dev-gc01-vm507 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 1 12:55:55 dev-gc01-vm507 kernel: qbuckets2 D ffffffffa0476de8 0 2950 2920 0x00000080
Mar 1 12:55:55 dev-gc01-vm507 kernel: ffff8800b9697600 0000000000000086 ffff880fdd9e3ec0 ffff8800b9697fd8
Mar 1 12:55:55 dev-gc01-vm507 kernel: ffff8800b9697fd8 ffff8800b9697fd8 ffff880fdd9e3ec0 ffffffffa0476de0
Mar 1 12:55:55 dev-gc01-vm507 kernel: ffffffffa0476de4 ffff880fdd9e3ec0 00000000ffffffff ffffffffa0476de8
Mar 1 12:55:55 dev-gc01-vm507 kernel: Call Trace:
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] schedule_preempt_disabled+0x29/0x70
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] mutex_lock_slowpath+0xc5/0x1c0
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? _raw_spin_unlock_bh+0x1b/0x40
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] mutex_lock+0x1f/0x2f
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs4_discover_server_trunking+0x48/0x2e0 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs4_init_client+0x124/0x2f0 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? kmem_cache_alloc+0x193/0x1e0
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? fscache_acquire_cookie+0x66/0x180 [fscache]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? fscache_acquire_cookie+0x66/0x180 [fscache]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? nfs4_alloc_client+0x199/0x1f0 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs_get_client+0x22a/0x390 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs4_set_ds_client+0xfa/0x130 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? nfs_readhdr_alloc+0x1a/0x20 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs4_pnfs_ds_connect+0x1d8/0x410 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs4_fl_prepare_ds+0xa4/0xc8 [nfs_layout_nfsv41_files]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] filelayout_read_pagelist+0x56/0x1a0 [nfs_layout_nfsv41_files]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] pnfs_generic_pg_readpages+0xa4/0x1d0 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs_pageio_doio+0x27/0x60 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs_pageio_add_request+0xb7/0x450 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? nfs_get_lock_context+0x4f/0x120 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs_pageio_add_request+0xc2/0x2a0 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] readpage_async_filler+0xeb/0x1b0 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] read_cache_pages+0x9d/0xe0
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? nfs_return_empty_page+0x70/0x70 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs_readpages+0x155/0x1f0 [nfs]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] __do_page_cache_readahead+0x1cc/0x250
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ra_submit+0x21/0x30
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] filemap_fault+0x105/0x410
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] do_fault+0x4c/0xc0
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? alloc_pages_current+0xaa/0x170
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] do_read_fault.isra.42+0x43/0x130
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] handle_mm_fault+0x6b1/0xfe0
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] __do_page_fault+0x154/0x450
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] do_page_fault+0x35/0x90
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] page_fault+0x28/0x30
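For anyone comparing a trace like the one above against their own logs, here is a minimal Python sketch that pulls the symbol and module names out of hung-task stack lines. The names FRAME_RE and parse_frames are hypothetical, the regular expression assumes the syslog layout shown in this report, and frames prefixed with "?" are treated as unreliable guesses by the kernel's stack unwinder. It is only a reading aid, not a diagnosis tool.

```python
import re

# Matches the symbol part of a kernel stack-trace line, for example:
#   "... kernel: [] ? nfs4_init_client+0x124/0x2f0 [nfsv4]"
# Assumption: the log uses the syslog layout shown in this report.
FRAME_RE = re.compile(
    r"kernel:\s+\[[^\]]*\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module, reliable) tuples, one per stack frame line.

    module is None for frames that carry no "[module]" suffix.
    reliable is False for frames the kernel printed with a leading "?".
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (
                    match.group("symbol"),
                    match.group("module"),
                    match.group("unreliable") is None,
                )
            )
    return frames


# Three lines copied from the trace in this report.
sample = """\
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] mutex_lock+0x1f/0x2f
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] nfs4_discover_server_trunking+0x48/0x2e0 [nfsv4]
Mar 1 12:55:55 dev-gc01-vm507 kernel: [] ? kmem_cache_alloc+0x193/0x1e0
"""

# Print only the reliable frames, innermost first.
for symbol, module, reliable in parse_frames(sample):
    if reliable:
        print(f"{symbol} ({module or 'no module tag'})")
```

Run against the full trace, the reliable frames read from the page fault at the bottom up through the pNFS file-layout read path to mutex_lock inside nfs4_discover_server_trunking, which is where the task is blocked.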