Open arnoldas500 opened 1 year ago
Hi, I'm sorry that I don't have a good idea about getting GPU accounting information from Slurm :-( Best regards, Ole
What about the following command (at least for GPUs):
sreport -tminper cluster utilization --tres="gres/gpu" start=2023-03-01T00:00:00
Output shows something like:
--------------------------------------------------------------------------------
Cluster   TRES Name      Allocated          Down               PLND Down         Idle               Planned           Reported
--------- -------------- ------------------ ------------------ ----------------- ------------------ ----------------- -------------------
myCluster gres/gpu       14591077(57.06%)   2282656(8.93%)     0(0.00%)          8699467(34.02%)    0(0.00%)          25573200(100.00%)
Combining CPU and GPU usage in one report may be possible but I am not sure if the numbers will be 'mixed up' too much.
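Since -tminper reports GPU-minutes, here is a rough Python sketch for turning the data row above into GPU-hours. It only assumes the six value columns shown in that output; the names CELL, COLUMNS and parse_utilization_row are made up for this example and are not part of sreport or slurmacct.

```python
import re

# Matches sreport value cells such as "14591077(57.06%)".
CELL = re.compile(r"(\d+)\((\d+(?:\.\d+)?)%\)")

# Column order as shown in the sreport output quoted in this thread.
COLUMNS = ["Allocated", "Down", "PLND Down", "Idle", "Planned", "Reported"]


def parse_utilization_row(row):
    """Return {column: (gpu_hours, percent)} for one sreport data row."""
    cells = CELL.findall(row)
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} value cells, found {len(cells)}")
    # -tminper reports minutes, so divide by 60 to get hours.
    return {name: (int(minutes) / 60, float(pct))
            for name, (minutes, pct) in zip(COLUMNS, cells)}


row = ("myCluster gres/gpu 14591077(57.06%) 2282656(8.93%) 0(0.00%) "
       "8699467(34.02%) 0(0.00%) 25573200(100.00%)")
usage = parse_utilization_row(row)
for name, (hours, pct) in usage.items():
    print(f"{name:<10} {hours:>12.1f} GPU-hours {pct:7.2f}%")
```

If the column layout differs on your Slurm version, the length check should raise instead of silently mislabelling values.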
The issue with the above report is that I cannot separate it by partition or by node. I have written my own reporting tool to calculate GPU hours per node and per partition.
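For anyone who wants to try something similar, here is a minimal Python sketch of the per-partition part. It is not the tool mentioned above, just one possible approach. It assumes the input comes from something like `sacct -a -X -n -P -S <start> -E <end> -o Partition,ElapsedRaw,AllocTRES`, where ElapsedRaw is in seconds and AllocTRES looks like `cpu=8,gres/gpu=2,...`. The function names and the sample lines are hypothetical. Per-node accounting is not covered, and jobs that overlap the reporting window are counted with their full elapsed time unless sacct is asked to truncate them.

```python
from collections import defaultdict


def tres_count(alloc_tres, name):
    """Return the count for one TRES name in an AllocTRES string, or 0."""
    for item in alloc_tres.split(","):
        key, _, value = item.partition("=")
        # Exact match only, so typed entries like gres/gpu:a100 are not double counted.
        if key == name and value.isdigit():
            return int(value)
    return 0


def hours_by_partition(sacct_text):
    """Sum CPU and GPU hours per partition from 'Partition|ElapsedRaw|AllocTRES' lines."""
    totals = defaultdict(lambda: {"cpu": 0.0, "gpu": 0.0})
    for line in sacct_text.splitlines():
        parts = line.strip().split("|")
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        partition, elapsed, tres = parts
        hours = int(elapsed) / 3600
        totals[partition]["cpu"] += hours * tres_count(tres, "cpu")
        totals[partition]["gpu"] += hours * tres_count(tres, "gres/gpu")
    return dict(totals)


# Made-up sample lines in the assumed sacct output format.
sample = """\
gpu|7200|billing=8,cpu=8,gres/gpu=2,mem=64G,node=1
gpu|3600|billing=4,cpu=4,gres/gpu=1,gres/gpu:a100=1,mem=32G,node=1
cpu|1800|billing=16,cpu=16,mem=32G,node=1
"""
report = hours_by_partition(sample)
for partition, hours in sorted(report.items()):
    print(f"{partition:<8} CPU-hours={hours['cpu']:.1f} GPU-hours={hours['gpu']:.1f}")
```

Running it once per month with matching start and end dates would give monthly CPU and GPU hours per partition.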
OleHolmNielsen, you've written some great utilities and provided some excellent info to the slurm-users mailing list. Thanks! The one thing sreport does that slurmacct doesn't is allow itself to be run as a non-root user, as long as the user has the admin role in the Slurm database. Have you any suggestions for running slurmacct as a non-root user?
Hi, thanks for your nice comments! The slurmacct script actually uses the Slurm commands sreport and sacct to generate reports. How did you find that non-root users aren't allowed to use slurmacct? Please first make sure that the sreport and sacct commands are permitted for your non-root user.
Thanks for your reply! I saw that in the script, and was puzzled, because these guys could run sreport (and friends) without issues. The helpful message from the OS was "permission denied".
I never figured out why, but I got it working by throwing the users into the slurm group and granting rights to execute it in sudoers.
Got there the long way 'round, but at least I didn't (as one user suggested) resort to setuid! Thanks again for sharing your hard work and wisdom.
Hi,
I was wondering if you have a flag to get the total CPU and GPU usage with the slurmacct tool? The goal is to get the total CPU and GPU hours per month per partition.
Thank you