szcf-weiya / techNotes

My notes about technology.
https://tech.hohoweiya.xyz/

job priority calculation in the cluster #6

Closed szcf-weiya closed 3 years ago

szcf-weiya commented 3 years ago

The formula for job priority under Slurm's multifactor priority plugin is given by

Job_priority =
    site_factor +
    (PriorityWeightAge) * (age_factor) +
    (PriorityWeightAssoc) * (assoc_factor) +
    (PriorityWeightFairshare) * (fair-share_factor) +
    (PriorityWeightJobSize) * (job_size_factor) +
    (PriorityWeightPartition) * (partition_factor) +
    (PriorityWeightQOS) * (QOS_factor) +
    SUM(TRES_weight_cpu * TRES_factor_cpu,
        TRES_weight_<type> * TRES_factor_<type>,
        ...)
    - nice_factor
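To make the formula concrete, here is a minimal Python sketch that evaluates it term by term. The function name `job_priority`, the dictionary keys, and the factor values are all hypothetical and chosen only for illustration. The sketch also ignores any rounding or clamping that Slurm may apply internally.

```python
def job_priority(weights, factors, site_factor=0.0, tres=None, nice_factor=0):
    """Evaluate the multifactor formula as written above.

    weights, factors: dicts keyed by component name (hypothetical keys).
    tres: dict mapping a TRES type to a (weight, factor) pair.
    """
    # Weighted sum over age, assoc, fairshare, job size, partition, QOS.
    weighted = sum(w * factors.get(name, 0.0) for name, w in weights.items())
    # SUM(TRES_weight_<type> * TRES_factor_<type>, ...)
    tres_sum = sum(w * f for w, f in (tres or {}).values())
    return site_factor + weighted + tres_sum - nice_factor


# Illustrative inputs only: the factor values below are made up.
weights = {"age": 0, "assoc": 0, "fairshare": 100000,
           "job_size": 0, "partition": 0, "qos": 0}
factors = {"age": 0.3, "fairshare": 0.25, "job_size": 0.1}

priority = job_priority(weights, factors)
print(priority)  # 25000.0
```

With every weight except the fair-share one set to zero, the other factors drop out regardless of their values.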

We can find those weights in the cluster's Slurm configuration:

$ scontrol show config | grep ^Priority
PriorityParameters      = (null)
PrioritySiteFactorParameters = (null)
PrioritySiteFactorPlugin = (null)
PriorityDecayHalfLife   = 7-00:00:00
PriorityCalcPeriod      = 00:05:00
PriorityFavorSmall      = No
PriorityFlags           = CALCULATE_RUNNING
PriorityMaxAge          = 7-00:00:00
PriorityUsageResetPeriod = NONE
PriorityType            = priority/multifactor
PriorityWeightAge       = 0
PriorityWeightAssoc     = 0
PriorityWeightFairShare = 100000
PriorityWeightJobSize   = 0
PriorityWeightPartition = 0
PriorityWeightQOS       = 0
PriorityWeightTRES      = (null)
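As a rough sketch, the nonzero weights can also be picked out programmatically. The snippet below parses a pasted copy of the relevant lines. The helper name `parse_priority_weights` is hypothetical, and on a real cluster the text would presumably come from running `scontrol show config` rather than from a hard-coded string.

```python
# Copy of the PriorityWeight* lines from the output above.
CONFIG_TEXT = """\
PriorityWeightAge       = 0
PriorityWeightAssoc     = 0
PriorityWeightFairShare = 100000
PriorityWeightJobSize   = 0
PriorityWeightPartition = 0
PriorityWeightQOS       = 0
PriorityWeightTRES      = (null)
"""


def parse_priority_weights(text):
    """Return {name: int} for PriorityWeight* lines with a numeric value."""
    weights = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Non-numeric values such as "(null)" are skipped.
        if sep and key.startswith("PriorityWeight") and value.isdigit():
            weights[key] = int(value)
    return weights


weights = parse_priority_weights(CONFIG_TEXT)
nonzero = {k: v for k, v in weights.items() if v != 0}
print(nonzero)  # {'PriorityWeightFairShare': 100000}
```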

Only PriorityWeightFairShare is nonzero, which agrees with the weights reported by sprio:

$ sprio -w
          JOBID PARTITION   PRIORITY       SITE  FAIRSHARE
        Weights                               1     100000
$ sprio -w -p stat
          JOBID PARTITION   PRIORITY       SITE  FAIRSHARE
        Weights                               1     100000
$ sprio -w -p chpc
          JOBID PARTITION   PRIORITY       SITE  FAIRSHARE
        Weights                               1     100000

The formula then simplifies to

Job_priority =
    site_factor +
    (PriorityWeightFairshare) * (fair-share_factor) +
    SUM(TRES_weight_cpu * TRES_factor_cpu,
        TRES_weight_<type> * TRES_factor_<type>,
        ...)
    - nice_factor

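where the <type> in TRES_weight_<type> might be GPU (see the usage weight in the table at https://www.cuhk.edu.hk/itsc/hpc/slurm.html). Note, however, that PriorityWeightTRES is (null) in the configuration above, so the TRES sum may contribute nothing on this cluster. A negative nice_factor can only be set by privileged users, as the Slurm documentation states: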

Nice Factor

Users can adjust the priority of their own jobs by setting the nice value on their jobs. Like the system nice, positive values negatively impact a job's priority and negative values increase a job's priority. Only privileged users can specify a negative value. The adjustment range is +/-2147483645.
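A small sketch of how a positive nice value would shift the simplified priority on this cluster. It assumes the TRES terms are zero (PriorityWeightTRES is (null) above) and uses a made-up fair-share factor of 0.5. The function name `simplified_priority` is hypothetical.

```python
FAIRSHARE_WEIGHT = 100000  # PriorityWeightFairShare from the config above


def simplified_priority(fairshare_factor, nice=0, site_factor=0):
    """Fair-share-only priority; TRES terms omitted (assumed zero here)."""
    return site_factor + FAIRSHARE_WEIGHT * fairshare_factor - nice


# Same hypothetical fair-share factor, with and without a nice value.
base = simplified_priority(0.5)
lowered = simplified_priority(0.5, nice=1000)
print(base, lowered)  # 50000.0 49000.0
```

Under these assumptions, a nice value of 1000 lowers the priority by 1000, the same effect as a drop of 0.01 in the fair-share factor.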

references