Open DmitryLitvintsev opened 2 years ago
Capturing the NFS traffic between the client and server (using a tool like tcpdump) is a good bet. This might (or might not) provide some additional insight into what's going wrong.
You could also enable debug logging on the org.dcache.nfs.v4.OperationREADDIR logger in the NFS door. The admin command should be:
log set stdout org.dcache.nfs.v4.OperationREADDIR DEBUG
OK, quick summary: the client seems to be scanning through the directory response and requesting file attributes (GETATTR) for the contents of the directory. AFAIK, this is normal behaviour for an NFS client.
Each request is a compound request but that, too, is normal (AFAIK).
The responses seem OK (to my eyes), but the client keeps asking about just two directory items, cycling through them. The client asks about FileHandle-A, then FileHandle-B, then FileHandle-A, then FileHandle-B, and so on.
Ignoring the possibility of this directory having a large number of hard-links of the same two files (very unlikely), it looks like either the client is caught in a loop or dCache's directory listing went mad.
If dCache's directory listing were at fault (caught in a loop), why would that loop terminate, unless the client has some internal limit after which it just ignores further directory items?
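To check the "A, B, A, B" pattern across a whole capture rather than by eye, something like the following rough sketch might help. It assumes the file handles from the GETATTR requests have already been extracted from the capture, in order, as strings (the extraction step is not shown, and the handle values and the function name `find_repeating_cycle` are made up for illustration).

```python
from typing import Optional, Sequence


def find_repeating_cycle(handles: Sequence[str], min_repeats: int = 3) -> Optional[list]:
    """Return the shortest block of handles that repeats back-to-back at the
    end of the sequence at least `min_repeats` times, or None if there is none."""
    n = len(handles)
    # Try the shortest candidate period first, so "A, B" wins over "A, B, A, B".
    for period in range(1, n // min_repeats + 1):
        block = list(handles[n - period:])
        window = handles[n - period * min_repeats:]
        # The tail must consist of `min_repeats` consecutive copies of `block`.
        if all(window[i] == block[i % period] for i in range(len(window))):
            return block
    return None


# Hypothetical handle sequence: one normal lookup, then the client cycles.
requested = ["fh-0", "fh-A", "fh-B", "fh-A", "fh-B", "fh-A", "fh-B"]
print(find_repeating_cycle(requested))
```

A non-None result only says the requests cycle; it does not say whether the client or the directory listing is responsible, and a directory full of hard-links to the same files would look the same.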
I think we need @kofemann's expertise to go any further here.
dCache 7.2; client: Scientific Linux release 7.9 (Nitrogen)
Hangs.
ls on parent works:
Is there anything short of rebooting the client machine that would help? (Killing the client from the admin shell does not help.)
Enabling NFS debug does not give any useful info (to me):