Closed Comeani closed 2 years ago
I'm not sure how to reproduce the situation where an output line from sinfo lists multiple nodes (e.g., gpu-n[19,20]). The sinfo output formatting is different in #14, which may prevent this from being an issue altogether. That code also appears to report more information, although I still need to verify that it's accurate.
Output from `crc-idle` at v0.1.0 of the wrapper scripts:

```
[nlc60@login0b wrappers] issue/6 : ./crc-idle.py -g
Cluster: gpu, Partition: gtx1080
================================
10 nodes w/ 4 idle GPUs
5 nodes w/ 8 idle GPUs
1 nodes w/ 12 idle GPUs
Cluster: gpu, Partition: titanx
===============================
6 nodes w/ 8 idle GPUs
1 nodes w/ 9 idle GPUs
Cluster: gpu, Partition: k40
============================
1 nodes w/ 20 idle GPUs
Cluster: gpu, Partition: v100
=============================
1 nodes w/ 22 idle GPUs
```
Output from the current `crc-idle`:

```
[nlc60@login0b wrappers] issue/6 : crc-idle.py -g
Cluster: gpu, Partition: gtx1080
================================
1 nodes w/ 4 idle GPUs
Cluster: gpu, Partition: titanx
===============================
1 nodes w/ 1 idle GPUs
Cluster: gpu, Partition: k40
============================
1 nodes w/ 2 idle GPUs
Cluster: gpu, Partition: v100
=============================
1 nodes w/ 2 idle GPUs
```
I think there may be some code in the original version of the application that handles this. The difference in the reported information may be related to an attempt to fix #6:
https://github.com/pitt-crc/wrappers/blob/5f3c8e336f8157a72af4fb64af1a5633317df5c6/crc-idle.py#L57-L73
When attempting to build a list of used GPU resources from the output of squeue, it's possible for a single job to span multiple nodes. The code that attempts to handle this case adds the incorrect value to the dictionary.
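For reference, handling this case requires expanding a compressed Slurm node expression like `gpu-n[19,20]` into individual hostnames before tallying per-node usage. A minimal sketch of such an expansion (this is an illustrative helper, not the actual code in the linked file, and the `expand_node_list` name is hypothetical):

```python
import re


def expand_node_list(node_spec):
    """Expand a compressed Slurm node expression such as 'gpu-n[19,20]'
    or 'gpu-n[1-3,7]' into a list of individual hostnames.

    Plain hostnames without brackets pass through as a one-item list.
    This is a simplified sketch; it does not cover every hostlist form
    Slurm supports (e.g., multiple bracket groups in one name).
    """
    match = re.fullmatch(r"(.*)\[(.*)\](.*)", node_spec)
    if match is None:
        return [node_spec]

    prefix, body, suffix = match.groups()
    nodes = []
    for part in body.split(","):
        if "-" in part:
            # A numeric range like '1-3'; preserve zero-padding width
            start, end = part.split("-")
            width = len(start)
            for i in range(int(start), int(end) + 1):
                nodes.append(f"{prefix}{str(i).zfill(width)}{suffix}")
        else:
            nodes.append(f"{prefix}{part}{suffix}")
    return nodes
```

With a helper like this, each expanded hostname can be credited in the per-node dictionary individually instead of attributing the job's whole allocation to the bracketed string as a single key, which is consistent with the miscounts shown above.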