Closed chucklever closed 2 months ago
Jeff Layton says:
That may be, but I don't think we can rely on that from the server-side. I know that Neil and Chuck were discussing spinning up more nfsd threads dynamically as required. That may be the best fix for this sort of problem.
Until then, the guidance should probably be to start a lot of threads if you're running a pNFS SCSI enabled nfsd server.
But what can knfsd do to defend itself?
If nfsd threads are waiting indefinitely, that's a potential DoS vector. Ideally the thread should preserve the waiting request somehow (or return NFS4ERR_DELAY, maybe?). At some later point when the lease conflict is resolved, the requests can be reprocessed.
That's my naive 800,000 ft view.
Jeff Layton says: One thing that might help would be make the nfsd threadpool more dynamic. If it could just spin up another thread, we'd be OK.
Maybe the server could keep an emergency thread around that is just for processing LAYOUTRETURN/DELEGRETURN (maybe also CLOSE, etc.) calls? Sometimes these ops are mixed into compounds with other sorts of calls though, so we'd need to deal with that somehow.
If nfsd threads are waiting indefinitely, that's a potential DoS vector. Ideally the thread should preserve the waiting request somehow (or return NFS4ERR_DELAY, maybe?). At some later point when the lease conflict is resolved, the requests can be reprocessed.
That's my naive 800,000 ft view.
Yeah, that would probably be ideal. xfs_break_leased_layouts is pretty heavy handed right now. It currently does this:
while ((error = break_layout(inode, false)) == -EWOULDBLOCK) {
xfs_iunlock(ip, *iolock);
*did_unlock = true;
error = break_layout(inode, true);
*iolock &= ~XFS_IOLOCK_SHARED;
*iolock |= XFS_IOLOCK_EXCL;
xfs_ilock(ip, *iolock);
}
So you can always get stuck in that inner break_layout call. We could try to use IOCB_NOWAIT for I/Os coming from nfsd, which would prevent that, but that seems like it could change the I/O behavior in other ways we don't want. It's not clear to me how that would work alongside IOCB_SYNC anyway.
It's not immediately clear to me how we'd do this with the existing IOCB_* flags, so we might need a new one (IOCB_NFSD?). Then, we could just make sure that break_layout call is always nonblocking if that flag is set.
Christoph Hellwig says:
On Thu, Nov 16, 2023 at 03:32:34PM -0500, Jeff Layton wrote: One thing that might help would be make the nfsd threadpool more dynamic. If it could just spin up another thread, we'd be OK.
Maybe the server could keep an emergency thread around that is just for processing LAYOUTRETURN/DELEGRETURN (maybe also CLOSE, etc.) calls? Sometimes these ops are mixed into compounds with other sorts of calls though, so we'd need to deal with that somehow.
I think having an extra thread just for LAYOUTRETURN/DELEGRETURN is fundamentally the right thing to do, as they need to be processed to allow everyone else to progress.
If nfsd threads are waiting indefinitely, that's a potential DoS vector. Ideally the thread should preserve the waiting request somehow (or return NFS4ERR_DELAY, maybe?). At some later point when the lease conflict is resolved, the requests can be reprocessed.
That's my naive 800,000 ft view.
That is probably a good idea on top of the above.
So you can always get stuck in that inner break_layout call. We could try to use IOCB_NOWAIT for I/Os coming from nfsd, which would prevent that, but that seems like it could change the I/O behavior in other ways we don't want. It's not clear to me how that would work alongside IOCB_SYNC anyway.
So I definitively think using IOCB_NOWAIT from nfsd and not block the thread for trivial I/O is a good thing. This might even enable not offloading I/O to threads until the IOCB_NOWAIT failed. But any IOCB_NOWAIT that returned -EAGAIN needs to eventually fall back to doing blocking I/O as there is no progress guarantee for IOCB_NOWAIT, and some I/O simply is entirely impossible with IOCB_NOWAIT.
It's not immediately clear to me how we'd do this with the existing IOCB_* flags, so we might need a new one (IOCB_NFSD?). Then, we could just make sure that break_layout call is always nonblocking if that flag is set.
I'd name it about the behavior that it controls and not the callers, but otherwise this too seems like a good idea.
Another thought from Jeff: When we get down to one unblocked nfsd (or past a high water mark), only allow compound operations that release state. Everything else gets NFS4ERR_DELAY.
This is a duplicate of #71, which is more specific, so close this one.
Ben Coddington reports: