WIMM-IT / slurm-analytics

Analyse Slurm sacct data with Python Pandas
GNU General Public License v3.0

Issue with node names? #1

Closed verdurin closed 5 hours ago

verdurin commented 1 week ago

I see a traceback that seems to relate to our node names, which have the form:

comp<type><number>

e.g.

compa000

Here's the traceback in dump-cluster-resources.py:

(slurm-analytics) [crm194@cluster1 slurm-analytics]$ python dump-cluster-resources.py
Traceback (most recent call last):
  File "/gpfs3/users/rescomp/crm194/src/github/slurm-analytics/dump-cluster-resources.py", line 153, in <module>
    partitions = prune_nodes(partitions, nodes)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gpfs3/users/rescomp/crm194/src/github/slurm-analytics/dump-cluster-resources.py", line 104, in prune_nodes
    node_cpus = nodes[node]['CPUs']
                ~~~~~^^^^^^
KeyError: 'compa0'

aowenson-imm commented 11 hours ago

Can you share the output of:

scontrol show partitions | grep " Nodes="

If not, can you cross-check that output against the expand_node_ranges function for incorrect assumptions?

verdurin commented 10 hours ago

Here you go:

Nodes=compa[000-044],compe[002-022,027-036,038-064,069-094],compf[000-007,009-015]
Nodes=compa[000-044],compe[002-022,027-036,038-064,069-094],compf[000-007,009-015]
Nodes=compe[023-026,037,065-068],compf008
Nodes=compe[023-026,037,065-068],compf008
Nodes=comph[003-005]
Nodes=compa[045-047],compe[095-101]
Nodes=epyc[000,002-003]
Nodes=brienne,gromit,jeeves
Nodes=humbug
Nodes=win[002-004]
Nodes=compg[009-011,013,016,028-032,035-036,039-042]
Nodes=compg[009-011,016,028-029,031-032,035-036,039-041]
Nodes=compg[026-027,038]
Nodes=compg034
Nodes=compg[027,037-038]
Nodes=compg042
Nodes=compg[021,043-046]
Nodes=compg[033-034]
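
For reference, the KeyError: 'compa0' above is consistent with zero padding being lost when a range such as compa[000-044] is expanded, although the thread does not say what the actual fix was. Below is a minimal sketch of one way to expand these node lists while keeping the padding. It is not the repository's expand_node_ranges; the names expand_hostlist and split_top_level are hypothetical, and it assumes only the simple prefix[ranges] form seen in the output above.

```python
import re


def split_top_level(expr):
    """Split a node list on commas that are not inside square brackets."""
    parts, depth, current = [], 0, []
    for ch in expr:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def expand_hostlist(expr):
    """Expand e.g. 'compa[000-002],compf008' into individual node names.

    Hypothetical helper: handles only 'name' and 'prefix[a-b,c,...]' items.
    """
    names = []
    for part in split_top_level(expr):
        match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", part)
        if match is None:
            names.append(part)  # plain name such as 'humbug' or 'compf008'
            continue
        prefix, ranges = match.groups()
        for item in ranges.split(","):
            low, _, high = item.partition("-")
            high = high or low  # a single value like '037' is a range of one
            width = len(low)  # keep the zero padding of the lower bound
            for n in range(int(low), int(high) + 1):
                names.append(f"{prefix}{n:0{width}d}")
    return names


if __name__ == "__main__":
    for name in expand_hostlist("compe[023-026,037,065-068],compf008"):
        print(name)
```

Where Slurm is available, running scontrol show hostnames on a node list performs the same expansion and may be a more reliable option than a hand-written parser.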

aowenson-imm commented 6 hours ago

Should be fixed now.

verdurin commented 5 hours ago

Yes, fixed - thanks.