AI4OPT / OPFGenerator

Instance generator for OPF problems
MIT License

module load (lmod) statements in slurm template files #40

Closed mtanneau closed 4 months ago

mtanneau commented 4 months ago

Users may run into issues when running Slurm jobs through the automated workflow because required modules are missing or not properly loaded.

For instance, none of the template files include module load julia (nor its dependencies). The sampler job template contains module load parallel, and the sysimage job template contains module load gcc.

It's worth noting that not all platforms / HPC clusters use lmod to manage modules, and the available modules may not be consistent across clusters.

The current workflow works well for someone who has module load julia in their .bashrc (or any other statement that makes julia available in the shell). It will throw an error if this is not the case, though.

Suggested fix:

klamike commented 4 months ago

We can have an environment file that users can modify based on their setup. It would be sourced at the beginning of each slurm job. Default something like:

module load gcc
module load parallel
module load julia
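
As a rough sketch of how a job template might source such a file, assuming a hypothetical helper named load_job_env, a hypothetical OPFGEN_ENV_FILE variable, and a default file name slurm_env.sh (none of these exist in the repository):

```shell
# Hypothetical helper: source a user-editable environment file at the start of a job.
# The names load_job_env, OPFGEN_ENV_FILE and slurm_env.sh are illustrative assumptions.
load_job_env() {
    # Use the first argument if given, else $OPFGEN_ENV_FILE, else ./slurm_env.sh
    local env_file="${1:-${OPFGEN_ENV_FILE:-slurm_env.sh}}"
    if [ ! -f "$env_file" ]; then
        echo "environment file not found: $env_file" >&2
        return 1
    fi
    # Run the file in the current shell so its module load statements
    # (or PATH exports on clusters without lmod) affect the rest of the job.
    . "$env_file"
}
```

A template could then call load_job_env path/to/env.sh || exit 1 right after its #SBATCH lines, so a missing file stops the job early instead of failing later.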

I agree the dependencies should be in the docs.

I like the idea of a test/diagnostic job, but this may be overkill. I believe the error messages when the dependencies are not loaded will be pretty informative (PackageCompiler will complain about gcc missing, and bash will complain about parallel missing).
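
For reference, a lighter alternative to a full diagnostic job could be a few lines at the top of each job script. This is only a sketch: the function name check_deps is hypothetical, and the list of commands to check is an assumption based on the dependencies mentioned in this thread.

```shell
# Hypothetical helper: report every required command that is not on PATH.
# Returns non-zero if at least one command is missing.
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing dependency: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_deps julia gcc parallel || exit 1 after the module load statements would list all missing commands at once, rather than failing on the first one encountered.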