facebookresearch / dora

Dora is an experiment management framework. It expresses grid searches as pure python files as part of your repo. It identifies experiments with a unique hash signature. Scale up to hundreds of experiments without losing your sanity.
MIT License

How to add the --export=ALL option to srun? #58

Closed Mattias421 closed 10 months ago

Mattias421 commented 10 months ago

❓ Questions

Hi all,

When running grid searches, I run into this error: "RuntimeError: Could not figure out which environment the job is runnning in. Known environments: slurm, local, debug." I have managed to fix this manually by adding the --export=ALL option to the srun command in the generated job scripts, and I have seen that this can be done automatically with submitit (slurm_srun_args=["--export=ALL"]). I cannot find a way to do this with dora. Are there any tricks for enabling --export=ALL with dora?

Best, Mattias

adefossez commented 10 months ago

First, add your new option slurm_srun_args there, typed tp.Optional[tp.List[str]], with default value None: https://github.com/facebookresearch/dora/blob/main/dora/conf.py#L97. Then set it in the main config.yaml (if using a Hydra-based project):

slurm:
  slurm_srun_args: ["--export=ALL"]

Then I think it should just work out of the box! Let me know if not.

Mattias421 commented 10 months ago

Thanks Alexandre!

It made the submission scripts as expected.

For the versions I'm using, I had to name the option srun_args instead of slurm_srun_args, but apart from that it worked out of the box, as you said it would.
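
For anyone landing here later, the change discussed above can be sketched roughly as follows. This is an illustration only, not dora's actual code: SlurmConfigSketch, its gpus and time fields, and the apply_overrides helper are hypothetical stand-ins, and the sketch assumes the config loader rejects keys that are not declared on the dataclass, which would explain why the option has to be added to conf.py before it can be set in config.yaml. The field is named srun_args here because that is the name that worked for the reporter; the name first suggested was slurm_srun_args.

```python
import dataclasses
import typing as tp


@dataclasses.dataclass
class SlurmConfigSketch:
    """Hypothetical stand-in for the Slurm options dataclass in dora/conf.py."""
    gpus: int = 1      # illustrative existing option, not taken from dora
    time: int = 1200   # illustrative existing option, not taken from dora
    # New option from this thread: extra arguments forwarded to srun.
    srun_args: tp.Optional[tp.List[str]] = None


def apply_overrides(cfg, overrides):
    """Return a copy of cfg with overrides applied, rejecting undeclared keys.

    Assumption: the real loader behaves similarly and refuses unknown options.
    """
    known = {f.name for f in dataclasses.fields(cfg)}
    unknown = sorted(set(overrides) - known)
    if unknown:
        raise AttributeError(f"Unknown Slurm option(s): {unknown}")
    return dataclasses.replace(cfg, **overrides)


# Equivalent of the `slurm:` section of config.yaml, written as a dict.
yaml_slurm_section = {"srun_args": ["--export=ALL"]}
cfg = apply_overrides(SlurmConfigSketch(), yaml_slurm_section)
print(cfg.srun_args)
```

If the key in config.yaml does not match a declared field name, a loader of this kind fails early instead of silently ignoring the option, so the name of the field and the name used in the YAML need to agree.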