Open · n01r opened this issue 2 months ago
I ran the first script from this optimas example on the Perlmutter head node, since there is only one active worker thread. However, libEnsemble successfully detects that I am running on Perlmutter, but it then cannot detect a SLURM job partition because I was not running inside a job.

This could either be fixed, or a warning could indicate that users should run inside a compute job instead. When I did, everything worked.

This logic certainly needs to be more robust. If there is no SLURM_JOB_PARTITION, we could default to the "perlmutter_c" settings, or just "perlmutter", and give a warning.
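As a rough illustration, the fallback could look something like the following minimal sketch (the helper name is made up for the example; this is not libEnsemble's actual detection code):

```python
import os
import warnings


def detect_perlmutter_partition(default="perlmutter_c"):
    """Return the SLURM partition, falling back to a default off-job.

    Hypothetical sketch: if SLURM_JOB_PARTITION is unset (e.g., when
    running on a login/head node rather than inside a SLURM job), warn
    and fall back to a default partition instead of failing.
    """
    partition = os.environ.get("SLURM_JOB_PARTITION")
    if partition is None:
        warnings.warn(
            "SLURM_JOB_PARTITION is not set (probably running on a login "
            f"node). Falling back to the '{default}' settings; for real "
            "runs, launch from inside a SLURM job on a compute node."
        )
        return default
    return partition
```

Calling `detect_perlmutter_partition()` on a login node would then emit a warning and return `"perlmutter_c"` instead of raising an error.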
Right, of course: it is good practice to do any computation on a compute node. Still, one would often try to run a small test on the head node when not much computation is involved (especially since the head node has an internet connection, so one can install missing packages, etc.).

So a fallback option plus a warning would make sure that users are not confused. :)
I've updated the logic in #1391. I will test it once Perlmutter is back from maintenance.
Thanks, Stephen!
Was this addressed by the recent release?