stackhpc / ansible-slurm-appliance

A Slurm-based HPC workload management environment, driven by Ansible.

Restrict node resources #388

Closed sjpb closed 1 month ago

sjpb commented 1 month ago

Add stacked cgroup and affinity TaskPlugins for Slurm, as recommended, to restrict jobs to the node resources they requested.

Note this requires cgroup.conf to be defined, which it is by the currently-used openhpc role, see here.
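For reference, the combined configuration this change enables looks roughly like the following (a sketch only; the exact rendering comes from the openhpc role):

```
# slurm.conf: stack both plugins - task/cgroup enforces the resource
# limits, task/affinity binds tasks to the allocated cores
TaskPlugin=task/cgroup,task/affinity

# cgroup.conf: constrain jobs to the cores they were allocated
ConstrainCores=yes
```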

Given the non-obvious nature of this change, manual testing was carried out as follows.

Compute nodes were changed to Leafcloud en1.medium to get 2x CPUs.

Note sbatch can't be used directly to test that the restriction works, as Slurm rejects the oversubscribed step outright:

$ sbatch --ntasks=1 --wrap "srun --ntasks=2 hostname"
srun: error: Unable to create step for job 2: More processors requested than permitted

So a python multiprocessing program mp.py was created (see below) and run:

$ sbatch -n 1 mp.py
[rocky@debug-login-0 tests]$ cat slurm-4.out slurm-5.out 
# without TaskPlugin
n procs: 2
Hello from Process 1! - Running on CPU core(s): {0, 1}
Hello from Process 2! - Running on CPU core(s): {0, 1}
Both processes have finished.

# with TaskPlugin: task/cgroup,task/affinity
n procs: 2
Hello from Process 2! - Running on CPU core(s): {0}
Hello from Process 1! - Running on CPU core(s): {0}
Both processes have finished.
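As an additional cross-check (not part of the original test), the kernel's own view of the affinity mask can be read from /proc; run under srun inside a restricted allocation, this should show a single core:

```shell
# Print the CPUs this process is allowed to run on, as seen by the kernel.
# Inside a job step (e.g. srun -n1 ...) this reflects the TaskPlugin binding.
grep Cpus_allowed_list /proc/self/status
```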

All the above was carried out using the (default) RL9 image. Additional testing was also carried out using RL8.

mp.py:


#!/usr/bin/env python3
import multiprocessing, os

print('n procs:', multiprocessing.cpu_count())

def print_message(message):
    # os.sched_getaffinity(0) returns the set of cores this process may run on
    allowed_cores = os.sched_getaffinity(0)
    print(f"{message} - Running on CPU core(s): {allowed_cores}")

if __name__ == "__main__":
    # Define messages for each process
    messages = ["Hello from Process 1!", "Hello from Process 2!"]

    # Create two processes
    processes = []
    for msg in messages:
        process = multiprocessing.Process(target=print_message, args=(msg,))
        processes.append(process)

    # Start each process
    for process in processes:
        process.start()

    # Wait for both processes to finish
    for process in processes:
        process.join()

    print("Both processes have finished.")
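One subtlety visible in the output above: `n procs: 2` is printed even when the TaskPlugin restriction is active. That is because `multiprocessing.cpu_count()` reports the CPUs present on the node, not the set the process is allowed to use; `os.sched_getaffinity(0)` gives the restricted view. A minimal illustration:

```python
import multiprocessing, os

# cpu_count() reports the CPUs present on the node, regardless of any
# affinity or cgroup restriction applied to this process.
print("node CPUs:   ", multiprocessing.cpu_count())

# sched_getaffinity(0) returns only the cores this process may run on,
# so it reflects what task/affinity actually granted.
print("allowed CPUs:", sorted(os.sched_getaffinity(0)))
```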