ROCm / MITuna

MIT License

Centralized user management for SLURM cluster #590

Closed JehandadKhan closed 2 years ago

JehandadKhan commented 2 years ago

As a proof of concept:

@jbakhrai will create a local SLURM cluster (on a dev box) and then integrate it with the AMD Active Directory (via LDAP or another suitable technology). We would then test whether we can use it to control the quota assigned to each user, etc. This might need SLURM support, either through slurmctl or through the user management backend already included with SLURM.

Depending on the results of the above, we can decide how to move forward with our main cluster.

Install/Setup a centralized LDAP server for user management of the machines in the slurm cluster.

cc @okakarpa @aserio

aserio commented 2 years ago

@jbakhrai will present what he has learned on Monday.

JehandadKhan commented 2 years ago

@okakarpa et al. had a meeting with DevOps to acquire Ansible playbooks, and @jbakhrai is working through issues running them on QTS.

aserio commented 2 years ago

Was able to install Samba and Kerberos. Seeing a firewall issue when trying to contact the LDAP server. Will focus on bringing up a test machine first. Suggestion to check out the TACC User Guide.

aserio commented 2 years ago

Installing LDAP on the test machine.

aserio commented 2 years ago

Per Omkar: LDAP and the SLURM controller are set up.

aserio commented 2 years ago

@jbakhrai is out of office (OOO).