ubccr / coldfront

HPC Resource Allocation System
https://coldfront.readthedocs.io
GNU General Public License v3.0

slurm_check letter case issue in slurm plugin #511

Open rsandatpsu opened 1 year ago

rsandatpsu commented 1 year ago

I found that if the slurm_account_name in ColdFront contains uppercase letters, the slurm_check program incorrectly reports every such account as not being in the Slurm database. I used the slurm_dump program to load the accounts into Slurm and associate each account with its user IDs. I didn't check whether slurm_dump itself converted the uppercase letters to lowercase, but the accounts ended up in the Slurm DB with lowercase names.
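To gauge how many accounts could be affected, one option is to list the account names that change when lowercased. This is only a minimal sketch: the `coldfront_names` list is hypothetical sample data, and pulling the real values out of ColdFront is left out.

```python
def names_changed_by_lowercasing(names):
    """Return the names that differ from their own lowercase form."""
    return [name for name in names if name != name.lower()]


# Hypothetical slurm_account_name values, for illustration only.
coldfront_names = ["TEST", "physics", "Chem_Lab"]

print(names_changed_by_lowercasing(coldfront_names))  # ['TEST', 'Chem_Lab']
```

Any name in that output would be stored differently in Slurm than in ColdFront if the load step lowercases it.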

dsajdak commented 1 year ago

I can confirm this. If I set the slurm_account attribute to all caps (TEST in this example), the coldfront slurm_dump command keeps it in all caps, but the sacctmgr load command converts it to all lowercase. When we use coldfront slurm_check to look for expired allocations for the TEST account, nothing is found, because there is an active allocation with the slurm_account=TEST attribute set. When we look for the test account, it wants to remove the account and its associations, because there is no active allocation with the slurm_account=test attribute set.

(venv) [hpcadmin@coldfront www]$ coldfront slurm_dump -c hpc -o ~/slurm_dump/
Writing output to directory: /home/hpcadmin/slurm_dump/
(venv) [hpcadmin@coldfront www]$ cat ~/slurm_dump/hpc.cfg
# ColdFront Allocation Slurm associations dump 2023-03-29
Cluster - 'hpc':
Parent - 'root'
User - 'root':DefaultAccount='root':AdminLevel='Administrator':Fairshare=1
Account - 'TEST':Fairshare=100
Parent - 'TEST'
User - 'cgray':
User - 'csimmons':
(venv) [hpcadmin@coldfront www]$ sacctmgr load file=~/slurm_dump/hpc.cfg
For cluster hpc
Accounts
      Name                Descr                  Org                  QOS
---------- -------------------- -------------------- --------------------
      test                 test                 test
---------------------------------------------------

Account Associations
   Account ParentName     Share   GrpTRESMins GrpTRESRunMin       GrpTRES GrpJobs GrpJobsAccrue  GrpMem GrpNodes GrpSubmit     GrpWall   MaxTRESMins       MaxTRES MaxTRESPerNode MaxJobs MaxSubmit MaxNodes     MaxWall                  QOS   Def QOS
---------- ---------- --------- ------------- ------------- ------------- ------- ------------- ------- -------- --------- ----------- ------------- ------------- -------------- ------- --------- -------- ----------- -------------------- ---------
      test       root       100                                                                                                                                                                                                               UNKN-429+
--------------------------------------------------------------

Users
      Name   Def Acct  Def WCKey                  QOS     Admin       Coord Accounts
---------- ---------- ---------- -------------------- --------- --------------------
     cgray       test                                   Not Set
  csimmons       test                                   Not Set
---------------------------------------------------

User Associations
      User    Account     Share   GrpTRESMins GrpTRESRunMin       GrpTRES GrpJobs GrpJobsAccrue  GrpMem GrpNodes GrpSubmit     GrpWall   MaxTRESMins       MaxTRES MaxTRESPerNode MaxJobs MaxSubmit MaxNodes     MaxWall                  QOS   Def QOS
---------- ---------- --------- ------------- ------------- ------------- ------- ------------- ------- -------- --------- ----------- ------------- ------------- -------------- ------- --------- -------- ----------- -------------------- ---------
     cgray       test
                                                                                                                  UNKN-429+
  csimmons       test
                                                                                                                  UNKN-429+
--------------------------------------------------------------

sacctmgr: Done adding cluster in usec=650196
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
(venv) [hpcadmin@coldfront www]$ coldfront slurm_check -c hpc -n -a test
NOOP enabled
cgray   test    hpc     Remove
csimmons        test    hpc     Remove
        test    hpc     Remove
(venv) [hpcadmin@coldfront www]$ coldfront slurm_check -c hpc -n -a TEST
NOOP enabled
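
The mismatch in the transcript above can be modeled with a few lines of Python. This is a simplified sketch and not ColdFront's actual slurm_check code: `find_removals` and its `case_insensitive` parameter are hypothetical names, and the two lists stand in for what Slurm and ColdFront each report.

```python
def find_removals(slurm_accounts, coldfront_accounts, case_insensitive=False):
    """Return Slurm account names that have no matching active ColdFront allocation."""
    # Hypothetical normalization step: lowercase both sides before comparing.
    norm = str.lower if case_insensitive else (lambda s: s)
    active = {norm(name) for name in coldfront_accounts}
    return sorted(a for a in slurm_accounts if norm(a) not in active)


slurm_accounts = ["test"]      # account name as stored after sacctmgr load
coldfront_accounts = ["TEST"]  # slurm_account attribute as set in ColdFront

# Exact comparison: 'test' has no match, so it is flagged for removal.
print(find_removals(slurm_accounts, coldfront_accounts))  # ['test']

# Case-insensitive comparison: 'test' matches 'TEST', nothing is flagged.
print(find_removals(slurm_accounts, coldfront_accounts, case_insensitive=True))  # []
```

Whether the right fix is to compare case-insensitively in slurm_check or to lowercase names in slurm_dump is a design choice this sketch does not settle.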