As part of starting an MDT, Lustre enters the WAITING state and attempts to contact all MDTs. This can take a considerable amount of time when multiple MDTs are unavailable. The MDTs being waited on are listed in the recovery_status file. It would be helpful to print that list in ltop rather than the "0s remaining" that is currently output. Something like this, for example:
0000 server1 data stale
0001 server2 WAITING on MDTs 0000 0002
0002 server3 data stale
0003 server4 WAITING on MDTs 0000 0002
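
To illustrate the idea, here is a minimal Python sketch of how the waited-on MDTs could be pulled out of recovery_status and turned into a row like the ones above. It assumes the file is made of `key: value` lines with a `status:` field and a `non-ready MDTs:` field listing the indices; those field names, the sample contents, and the helper names `waiting_mdts` and `format_row` are assumptions for illustration and should be checked against real Lustre output. It is not ltop's actual code.

```python
def waiting_mdts(recovery_status_text):
    """Return the MDT indices a target is waiting on, or [] if it is not WAITING."""
    fields = {}
    for line in recovery_status_text.splitlines():
        # Split each "key: value" line at the first colon.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    if fields.get("status") != "WAITING":
        return []
    # Assumed field name; indices are whitespace-separated.
    return fields.get("non-ready MDTs", "").split()


def format_row(index, server, recovery_status_text):
    """Build a display row in the style proposed above."""
    mdts = waiting_mdts(recovery_status_text)
    if mdts:
        return f"{index} {server} WAITING on MDTs {' '.join(mdts)}"
    return f"{index} {server} WAITING"


# Hypothetical file contents, made up for this sketch.
sample = (
    "status: WAITING\n"
    "non-ready MDTs:  0000 0002\n"
    "recovery_start: 0\n"
    "time_waited: 120\n"
)
print(format_row("0001", "server2", sample))
```

Falling back to a plain WAITING row when the list is missing keeps the display usable on versions that do not report non-ready MDTs.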
Example recovery_status file for MDT0001