Azure / cyclecloud-slurm

Azure CycleCloud project to enable users to create, configure, and use Slurm HPC clusters.
MIT License

slurm.dampen_memory was ignored #219

Closed ryanhamel closed 4 months ago

ryanhamel commented 4 months ago

Fixes: #30

Related new documentation:

When memory is reported as a resource, Slurm requires that you define the amount of memory that remains free after the OS and applications are accounted for. If the memory a node actually reports is lower than this defined amount, Slurm will reject the node. To avoid this, by default we dampen the memory by 5% or 1g, whichever is larger.
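
As a rough illustration of the default rule above, here is a minimal sketch of the arithmetic. The helper name `dampened_memory_bytes` is hypothetical and not part of cyclecloud-slurm. The sketch assumes that "1g" means 1 GiB and that the 1g floor still applies when a custom percentage is used; the actual implementation may differ.

```python
GIB = 1024 ** 3  # assumption: "1g" is interpreted as 1 GiB


def dampened_memory_bytes(total_bytes: int, dampen_percent: int = 5) -> int:
    """Hypothetical helper: subtract the larger of dampen_percent% or 1 GiB."""
    percent_reduction = total_bytes * dampen_percent // 100
    reduction = max(percent_reduction, GIB)
    # Never report a negative amount of memory.
    return max(total_bytes - reduction, 0)


# On a small node the 1 GiB floor dominates; on a large node the percentage does.
for total_gib in (8, 64):
    usable = dampened_memory_bytes(total_gib * GIB)
    print(f"{total_gib} GiB node -> {usable / GIB:.2f} GiB defined for Slurm")
```

With these assumptions, an 8 GiB node loses 1 GiB (since 5% would only be 0.4 GiB), while a 64 GiB node loses about 3.2 GiB.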

To change this dampening, there are two options: 1) Define slurm.dampen_memory=X, where X is an integer percentage (5 == 5%). 2) Create a default_resource definition in the /opt/azurehpc/slurm/autoscale.json file.

    "default_resources": [
      {
        "select": {},
        "name": "slurm_memory",
        "value": "node.memory"
      }
    ],
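
If you would rather script the second option than edit the file by hand, something along these lines might work. This is only a sketch: `set_default_resource` is a hypothetical helper, not part of cyclecloud-slurm or ScaleLib, and it assumes autoscale.json is plain JSON with `default_resources` as a top-level list. The demo writes to a temporary file rather than the real /opt/azurehpc/slurm/autoscale.json.

```python
import json
import tempfile
from pathlib import Path


def set_default_resource(config_path, name, value, select=None):
    """Hypothetical helper: add or replace one entry in "default_resources"."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    resources = config.setdefault("default_resources", [])
    entry = {"select": select or {}, "name": name, "value": value}

    # Replace an existing entry with the same name and selector, else append.
    for index, existing in enumerate(resources):
        if existing.get("name") == name and existing.get("select", {}) == entry["select"]:
            resources[index] = entry
            break
    else:
        resources.append(entry)

    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file, not the real autoscale.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "autoscale.json"
    demo_path.write_text("{}")
    updated = set_default_resource(demo_path, "slurm_memory", "node.memory")
    print(json.dumps(updated["default_resources"], indent=2))
```

Back up the real file before pointing a script like this at it, since other keys in autoscale.json are rewritten with new formatting (their values are preserved).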

Default resources are a powerful tool provided by ScaleLib, the underlying library. See the ScaleLib documentation.