The built-in Slurm command sdiag gives an overview of the Slurm control daemon (slurmctld). It reports cumulative statistics on slurmctld since the counters were last reset. The counters are reset automatically every day at midnight UTC, and can be reset explicitly with sdiag --reset.
The command sdiag | head -n 18 gives a summary of jobs on the cluster and in the queue.
The block Main schedule statistics shows info on scheduling cycles.
The block Backfilling stats may be more useful when we start using backfill queues.
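To look at one of these blocks on its own, the sdiag output can be filtered by block title. The following is a minimal sketch, not part of Slurm: sdiag_block is a hypothetical helper name, and it assumes that each block starts with an unindented title line and that its detail lines are indented, which may differ between Slurm versions.

```shell
# Hypothetical helper (not a Slurm command): print one block of sdiag output.
# Assumes block titles start at column 0 and detail lines are indented.
sdiag_block() {
  awk -v title="$1" '
    # Start printing at the line that begins with the requested title.
    index($0, title) == 1 { in_block = 1; print; next }
    # Stop at the first blank line or the next unindented title.
    in_block && ($0 == "" || $0 ~ /^[^ \t]/) { exit }
    in_block { print }
  '
}

# Usage, on a node where sdiag is available:
#   sdiag | sdiag_block "Main schedule statistics"
#   sdiag | sdiag_block "Backfilling stats"
```

Matching on the start of the title rather than the whole line keeps the filter working when a title carries a suffix such as a unit.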
The next blocks cover remote procedure calls (RPCs) and may be useful for understanding how Slurm is being used in general. They may also help in investigating inefficiencies caused by researchers unfamiliar with how Slurm works and its limitations. An example is identifying researchers who make large numbers of RPCs by running, e.g., squeue in a loop.
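As a rough illustration of that last point, the per-user RPC block can be sorted by call count. This is a sketch under assumptions, not a definitive recipe: top_rpc_users is a hypothetical helper name, and it assumes the block is titled "Remote Procedure Call statistics by user" with one indented line per user containing a count:N field, which should be checked against the sdiag output of the installed Slurm version.

```shell
# Hypothetical helper (not a Slurm command): list the users with the most
# RPCs since the last reset, as "count user", highest count first.
# Assumes per-user lines are indented and contain a "count:N" field.
top_rpc_users() {
  awk '
    /^Remote Procedure Call statistics by user/ { in_block = 1; next }
    # Any later unindented line is the title of a different block.
    in_block && /^[^ \t]/ { in_block = 0 }
    in_block && match($0, /count:[0-9]+/) {
      # Skip the 6 characters of "count:" to keep only the number.
      print substr($0, RSTART + 6, RLENGTH - 6), $1
    }
  ' | sort -rn | head -n "${1:-10}"
}

# Usage, on a node where sdiag is available:
#   sdiag | top_rpc_users 5
```

A user near the top of this list with a count far above the rest is a candidate for a follow-up conversation; the counts are cumulative since the last reset, so they are best compared at a consistent time of day.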