SLURM support is important for larger model runs on clusters. Currently, however, it is only a bash script that each user has to adapt by hand, which limits its usefulness.
Goal:
Be able to specify SLURM options in the config file and then run the usual `python train.py configfile.yaml`, which automatically starts and collects the SLURM jobs.
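A minimal sketch of how this could look, not a proposed final design. It assumes the config file gets a `slurm` section that is loaded into a plain dict; the helper names `build_sbatch_script` and `submit`, the option keys, and the example values are all hypothetical. Only `sbatch`, `#SBATCH` directives, and the `--parsable` flag are standard SLURM.

```python
import shlex
import subprocess


def build_sbatch_script(slurm_opts, command):
    """Render an sbatch script from a dict of options and a command list.

    slurm_opts is assumed to come from a (hypothetical) `slurm:` section of
    the config file, e.g. {"job_name": "train", "time": "02:00:00"}.
    """
    lines = ["#!/bin/bash"]
    for key, value in slurm_opts.items():
        flag = key.replace("_", "-")  # job_name -> --job-name
        if value is True:
            lines.append(f"#SBATCH --{flag}")  # boolean switch, no value
        else:
            lines.append(f"#SBATCH --{flag}={value}")
    lines.append(shlex.join(command))  # quote arguments safely
    return "\n".join(lines) + "\n"


def submit(script):
    """Pipe the script to sbatch on stdin and return the job id as a string."""
    result = subprocess.run(
        ["sbatch", "--parsable"],
        input=script,
        text=True,
        capture_output=True,
        check=True,
    )
    # With --parsable, sbatch prints "jobid" or "jobid;cluster".
    return result.stdout.strip().split(";")[0]


# Example values are placeholders, not taken from any real config.
example_opts = {"job_name": "train", "time": "02:00:00", "gres": "gpu:1"}
example_script = build_sbatch_script(
    example_opts, ["python", "train.py", "configfile.yaml"]
)
```

Calling `submit(example_script)` on a cluster login node would queue the job. One thing the sketch leaves open is how the submitted `train.py` knows it is already running inside a job and should not submit again; a flag or an environment check would be needed for that.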
Requirements:
Idea:
Culprits: