linux-nfs / nfsd

NFS server monitoring #20

Open chucklever opened 8 months ago

chucklever commented 8 months ago

This was bugzilla.linux-nfs.org 366

Description

[Chuck Lever 2021-08-09 23:12:23 UTC] Many modern NFS clients rely on the server to drop the connection when the server knows it won't be able to reply at all to an RPC, thanks to a compliance requirement in RFC 5661. Unfortunately, it's difficult to guarantee that all possible ways a server can lose a request have been properly identified.

Toward that end, we'd like to introduce a mechanism for watching for RPC Calls that are received but never replied to. The mechanism can either log the XID, or it could even proactively drop the connection.

Comment 1

[Chuck Lever 2021-08-11 20:06:28 UTC] Another valuable piece of observability would be a /proc or /sys interface that shows a snapshot of the RPCs currently being processed and their status (e.g., XID, client network address, an instruction pointer or stack dump, maybe any open files, whether an upcall or callback is being waited on, and so on). Successive snapshots that show the same XID tuple with no change in status information would be a clear indication of a lost or dropped RPC.

A similar interface might be made available to observe the DRC, though that would be interesting only for NFS versions that do not support NFSv4 sessions.
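The snapshot comparison described in this comment can be sketched in userspace C. This is only an illustration under stated assumptions: `struct rpc_snap_entry`, its fields, and `find_stalled()` are hypothetical names invented here, not kernel structures or an existing interface. The sketch treats an entry as "stalled" when the same XID, client address, and status text appear unchanged in two consecutive snapshots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot record; not a kernel structure. */
struct rpc_snap_entry {
    uint32_t xid;        /* RPC transaction ID */
    char client[48];     /* client network address as text */
    char status[32];     /* free-form status string */
};

/* Two entries match when the XID tuple and the status are identical. */
static bool same_entry(const struct rpc_snap_entry *a,
                       const struct rpc_snap_entry *b)
{
    return a->xid == b->xid &&
           strcmp(a->client, b->client) == 0 &&
           strcmp(a->status, b->status) == 0;
}

/*
 * Copy into 'out' every entry of 'cur' that also appears, unchanged,
 * in 'prev'. Returns the number of entries copied (at most max_out).
 */
static size_t find_stalled(const struct rpc_snap_entry *prev, size_t nprev,
                           const struct rpc_snap_entry *cur, size_t ncur,
                           struct rpc_snap_entry *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i < ncur && n < max_out; i++) {
        for (size_t j = 0; j < nprev; j++) {
            if (same_entry(&cur[i], &prev[j])) {
                out[n++] = cur[i];
                break;
            }
        }
    }
    return n;
}
```

A single unchanged pair of snapshots is only a hint; a monitoring tool would likely want several consecutive matches, or an age threshold, before flagging a request as lost.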

Comment 2

[Jeff Layton 2023-03-09 15:27:00 UTC] One idea:

Keep all of the incoming svc_rqsts on a list and then just dequeue them when sending the reply. Then add a /proc/fs/nfsd file that shows a list of the ones in flight.

Comment 3

[Jeff Layton 2023-04-07 13:53:34 UTC] Changing the title of this bug to better reflect what we intend to build.

Comment 4

[Jeff Layton 2023-04-07 14:03:25 UTC] To be clear:

Each nfsd thread has an associated svc_rqst structure. The idea here would be to queue each svc_rqst to a list in the nfsd_net structure. When a new RPC comes in, we'd queue it to the list, and dequeue it after sending the reply.

When someone reads the seqfile for this, we'd walk the list and print info about what each svc_rqst is doing. That info can start out with pretty basic info readily available in the svc_rqst:

rq_addr, rq_daddr, deferral info, xid, rq_prog/vers/proc, rq_flags

...etc.

Maybe we can eventually expand it to do stuff like print info on a v4 compound being processed too, but just building the basic interface would be a good start.
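The queue-on-receive, dequeue-on-reply idea in this comment can be sketched as a small userspace mock. The names `mock_rqst`, `mock_net`, `rqst_received()`, `rqst_replied()`, and `show_inflight()` are hypothetical stand-ins for svc_rqst, nfsd_net, and a seqfile show routine; the sketch has no locking, which a real implementation would need.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for svc_rqst: only the XID and list linkage. */
struct mock_rqst {
    uint32_t xid;
    struct mock_rqst *prev, *next;
};

/* Hypothetical stand-in for nfsd_net: a circular list with a sentinel. */
struct mock_net {
    struct mock_rqst head;
    size_t inflight;
};

static void mock_net_init(struct mock_net *nn)
{
    nn->head.prev = nn->head.next = &nn->head;
    nn->inflight = 0;
}

/* Called when a new RPC arrives: append to the tail of the list. */
static void rqst_received(struct mock_net *nn, struct mock_rqst *rq)
{
    rq->prev = nn->head.prev;
    rq->next = &nn->head;
    nn->head.prev->next = rq;
    nn->head.prev = rq;
    nn->inflight++;
}

/* Called after the reply is sent: unlink from the list. */
static void rqst_replied(struct mock_net *nn, struct mock_rqst *rq)
{
    rq->prev->next = rq->next;
    rq->next->prev = rq->prev;
    rq->prev = rq->next = rq;
    nn->inflight--;
}

/* Print one XID per line into buf; returns the number of bytes written. */
static size_t show_inflight(const struct mock_net *nn, char *buf, size_t len)
{
    size_t off = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (const struct mock_rqst *rq = nn->head.next; rq != &nn->head;
         rq = rq->next) {
        int n = snprintf(buf + off, len - off, "0x%08x\n", (unsigned)rq->xid);

        if (n < 0 || (size_t)n >= len - off) {
            buf[off] = '\0';    /* drop the truncated line */
            break;
        }
        off += (size_t)n;
    }
    return off;
}
```

The cost of this design is an insert and a removal on every RPC, which is the overhead the later comments in this thread avoid by walking the existing thread list instead.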

Comment 5

[Chuck Lever 2023-04-09 17:02:14 UTC]

> Each nfsd thread has an associated svc_rqst structure. The idea here would be to queue each svc_rqst to a list in the nfsd_net structure. When a new RPC comes in, we'd queue it to the list, and dequeue it after sending the reply.
>
> When someone reads the seqfile for this, we'd walk the list and print info about what each svc_rqst is doing. That info can start out with pretty basic info readily available in the svc_rqst:

Instead of adding and removing from a list, why not walk through all of the nfsd threads and display only the ones marked with RQ_BUSY?

> rq_addr, rq_daddr, deferral info, xid, rq_prog/vers/proc, rq_flags
>
> ...etc.

rq_stime is another candidate.
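Filtering on a busy flag and reporting the age derived from the start time might look like the following userspace sketch. `MOCK_RQ_BUSY`, `struct mock_busy_rqst`, `rqst_age_us()`, and `report_busy()` are hypothetical; they stand in for RQ_BUSY, svc_rqst, and rq_stime, and the kernel's actual types and flag values are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's RQ_BUSY bit; the value here is arbitrary. */
#define MOCK_RQ_BUSY (1UL << 0)

/* Hypothetical per-thread request state. */
struct mock_busy_rqst {
    unsigned long flags;
    uint32_t xid;
    uint64_t stime_ns;   /* time the request started, like rq_stime */
};

/* Age of a request in microseconds; 0 if the clock appears to go backwards. */
static uint64_t rqst_age_us(const struct mock_busy_rqst *rq, uint64_t now_ns)
{
    return now_ns > rq->stime_ns ? (now_ns - rq->stime_ns) / 1000 : 0;
}

/* Print only the busy requests; returns how many were printed. */
static size_t report_busy(const struct mock_busy_rqst *threads, size_t n,
                          uint64_t now_ns, FILE *out)
{
    size_t busy = 0;

    for (size_t i = 0; i < n; i++) {
        if (!(threads[i].flags & MOCK_RQ_BUSY))
            continue;   /* idle thread: nothing in flight */
        fprintf(out, "xid=0x%08x age_us=%llu\n", (unsigned)threads[i].xid,
                (unsigned long long)rqst_age_us(&threads[i], now_ns));
        busy++;
    }
    return busy;
}
```

Reporting an age next to each XID means a reader does not have to diff snapshots to spot a request that has been in flight for an unusually long time.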

Comment 6

[Jeff Layton 2023-04-14 14:09:45 UTC] Ahh yeah. No need for a new list. Some notes:

The basic idea here is to create a new seqfile in nfsdfs (maybe named "status" or "monitor").

The NFS server is "namespaced", such that there is a different server instance in each network namespace. The nfsd_net structure is the top-level tracking struct for this. Each nfsd_net has its own nfsdfs instance, and a "nfsd_serv" pointer which points to the top-level svc_serv struct. From there you'll need to drill down into the sv_pools array, where each pool should have an sp_all_threads list.

Walk that list (taking appropriate locks) and then emit some information to the new seqfile.
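The drill-down described above (nfsd_net, then svc_serv, then each pool, then each thread) can be sketched with mock structures. Everything prefixed `mock_`, along with `xid_log`, `record_xid()`, and `walk_busy_threads()`, is hypothetical and exists only for this illustration; the real structures carry many more fields, and the locking the comment calls for is only marked with a comment, not implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_thread {
    uint32_t xid;
    bool busy;
    struct mock_thread *next;        /* linkage, like sp_all_threads */
};
struct mock_pool { struct mock_thread *all_threads; };
struct mock_serv { unsigned int nrpools; struct mock_pool *pools; };
struct mock_nfsd_net { struct mock_serv *serv; };  /* like nfsd_serv */

typedef void (*emit_fn)(const struct mock_thread *t, unsigned int pool,
                        void *arg);

/* Example emit callback: remember the XIDs that were seen. */
struct xid_log { uint32_t xids[8]; size_t n; };

static void record_xid(const struct mock_thread *t, unsigned int pool,
                       void *arg)
{
    struct xid_log *log = arg;

    (void)pool;
    if (log->n < sizeof(log->xids) / sizeof(log->xids[0]))
        log->xids[log->n++] = t->xid;
}

/* Walk every pool and call 'emit' for each busy thread. */
static size_t walk_busy_threads(const struct mock_nfsd_net *nn,
                                emit_fn emit, void *arg)
{
    size_t emitted = 0;

    if (!nn->serv)
        return 0;                    /* no server in this namespace */
    for (unsigned int i = 0; i < nn->serv->nrpools; i++) {
        /* A real implementation would take the appropriate lock here. */
        for (const struct mock_thread *t = nn->serv->pools[i].all_threads;
             t; t = t->next) {
            if (!t->busy)
                continue;
            emit(t, i, arg);
            emitted++;
        }
    }
    return emitted;
}
```

Passing the output step in as a callback keeps the traversal separate from the formatting, so the same walk could feed a seqfile or a test harness.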

Comment 7

[Lorenzo Bianconi 2023-05-19 23:02:32 UTC]

> Ahh yeah. No need for a new list. Some notes:
>
> The basic idea here is to create a new seqfile in nfsdfs (maybe named "status" or "monitor").
>
> The NFS server is "namespaced", such that there is a different server instance in each network namespace. The nfsd_net structure is the top-level tracking struct for this. Each nfsd_net has its own nfsdfs instance, and a "nfsd_serv" pointer which points to the top-level svc_serv struct. From there you'll need to drill down into the sv_pools array, where each pool should have an sp_all_threads list.
>
> Walk that list (taking appropriate locks) and then emit some information to the new seqfile.

Hi Jeff and Chuck,

After discussing with Jeff, I started working on this bugzilla. So far I have added an rpc_status entry to the nfsd debugfs that dumps some svc_rqst info if the RPC request is busy (the code is available here [0]). I think the next step is to decide which information we are interested in. What do you think?

[0]

Comment 8

[Jeff Layton 2023-06-05 15:27:06 UTC] That looks like a great start, Lorenzo. That will tell us something about the clients and the low-level RPC info. It'd probably be a good idea to go ahead and post this to the linux-nfs mailing list as an RFC patch to get more feedback as to what else would be good to add.

As a second step, if it's a version 4 COMPOUND request, it would be nice to have it walk over the compound and show all of the operations in it. See nfsd4_proc_compound() for info. This will probably mean that each entry will need to be more than one line, but that's OK.

It might also be nice to have the rq_proc and NFSv4 operations displayed symbolically rather than as numbers.
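Showing the operations of a COMPOUND symbolically could be sketched as below. The operation numbers are the ones assigned in the NFSv4 and NFSv4.1 protocol specifications, but the table is deliberately partial, and `op_name()` and `format_compound()` are hypothetical helpers written for this illustration rather than existing nfsd functions. Unknown operations fall back to a numeric form.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Partial map from NFSv4 operation number to name; NULL if not listed. */
static const char *op_name(uint32_t op)
{
    switch (op) {
    case 9:  return "GETATTR";
    case 15: return "LOOKUP";
    case 18: return "OPEN";
    case 22: return "PUTFH";
    case 25: return "READ";
    case 38: return "WRITE";
    case 53: return "SEQUENCE";
    default: return NULL;
    }
}

/*
 * Write the operations as a space-separated list into buf.
 * Operations missing from the table are printed as "op<number>".
 * Returns the number of bytes written; stops early if buf is too small.
 */
static size_t format_compound(const uint32_t *ops, size_t nops,
                              char *buf, size_t len)
{
    size_t off = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (size_t i = 0; i < nops; i++) {
        const char *name = op_name(ops[i]);
        const char *sep = i ? " " : "";
        int n = name
            ? snprintf(buf + off, len - off, "%s%s", sep, name)
            : snprintf(buf + off, len - off, "%sop%u", sep, (unsigned)ops[i]);

        if (n < 0 || (size_t)n >= len - off) {
            buf[off] = '\0';    /* drop the truncated entry */
            break;
        }
        off += (size_t)n;
    }
    return off;
}
```

Keeping all operations of one request on a single line is just one layout choice; as noted above, an entry spanning more than one line would also be acceptable.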

Comment 9

[Lorenzo Bianconi 2023-06-19 14:25:10 UTC]

> That looks like a great start, Lorenzo. That will tell us something about the clients and the low-level RPC info. It'd probably be a good idea to go ahead and post this to the linux-nfs mailing list as an RFC patch to get more feedback as to what else would be good to add.

Hi Jeff,

Thanks for the review. Ack, I will post it.

> As a second step, if it's a version 4 COMPOUND request, it would be nice to have it walk over the compound and show all of the operations in it. See nfsd4_proc_compound() for info. This will probably mean that each entry will need to be more than one line, but that's OK.
>
> It might also be nice to have the rq_proc and NFSv4 operations displayed symbolically rather than as numbers.

Ack, I addressed your comments here: https://github.com/LorenzoBianconi/nfsd-next/commit/5df679730c3c35ba70521806c6024205a96207d7
