martius-lab / cluster_utils

https://martius-lab.github.io/cluster_utils/

Copy PWD to cache if run_in_working_dir=true? #53

Open luator opened 10 months ago

luator commented 10 months ago

Normally, cluster_utils clones the code from a git repository to a cache directory and executes the jobs from there. When run_in_working_dir=true is set in the configuration, this does not happen; instead, jobs are executed in the current working directory. This risks inconsistent results if the code in the directory is modified while cluster_utils is still running (i.e. different jobs may use different versions of the code).

To avoid that, we could copy the current working directory to the cache and run from there.
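A minimal sketch of what such a copy step might look like, using only the Python standard library. This is not cluster_utils code: the function name `snapshot_working_dir` and the `cache_root` parameter are hypothetical and chosen for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_working_dir(src: Path, cache_root: Path) -> Path:
    """Copy `src` into a fresh directory below `cache_root` and return the copy.

    Hypothetical helper, not part of cluster_utils. Assumes `cache_root`
    is not located inside `src`.
    """
    src = src.resolve()  # so that Path(".") gets a proper directory name
    cache_root.mkdir(parents=True, exist_ok=True)
    # mkdtemp creates a uniquely named directory, so parallel runs do not collide.
    unique_dir = Path(tempfile.mkdtemp(prefix="snapshot_", dir=cache_root))
    dest = unique_dir / src.name
    # symlinks=True copies symlinks as links instead of following them.
    shutil.copytree(src, dest, symlinks=True)
    return dest
```

Jobs started from the returned path would all see the same frozen state of the code, regardless of later edits in the original directory.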

pro:

con:

luator commented 10 months ago

I implemented the run_in_working_dir option. It is meant for advanced users who know what they are doing (there is even a big warning printed out), and the intended way to use cluster_utils for production is with a git repository. As such, I believe the option should do exactly what it says, i.e. run in the working directory without copying anything to the cache. Making the option "safer" would normalize using it, which would discourage people from using git commits.

Copying the working directory could also have unintended consequences if users store large amounts of data in it, e.g. a virtual environment, data, or output checkpoints. I think the project directory is currently not even removed by cluster_utils, see #11.
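To illustrate the concern, here is a rough sketch of how a copy could skip bulky entries. It is an assumption-laden example, not existing cluster_utils behaviour: the helper name `copy_without_bulky_files` and the `EXCLUDE` patterns are hypothetical, and any real exclusion list would have to be configured per project.

```python
import shutil
from pathlib import Path

# Hypothetical exclusion patterns; matched against file and directory
# names at every level of the tree (fnmatch-style globs).
EXCLUDE = (".git", ".venv", "venv", "__pycache__", "*.ckpt", "outputs")


def copy_without_bulky_files(src: Path, dest: Path, exclude=EXCLUDE) -> Path:
    """Copy `src` to `dest`, skipping entries whose names match `exclude`.

    `dest` must not exist yet (the default behaviour of shutil.copytree).
    """
    shutil.copytree(src, dest, ignore=shutil.ignore_patterns(*exclude))
    return dest
```

Even with such a filter, anything not covered by the patterns would still be copied, so the risk of unexpectedly large copies is reduced but not removed.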

By Maximilian Seitzer on 2023-11-28T16:32:00 (imported from GitLab)

luator commented 10 months ago

The feature request comes from a discussion with @jfrey (pinging you, in case you want to defend it :) ). I agree with Max, though, that for proper experiments one should use git. So I also would rather keep the behaviour as is.

By Felix Widmaier on 2023-11-28T16:32:00 (imported from GitLab)

luator commented 10 months ago

I have nothing against adding a similar option that copies the directory (though again, only for advanced users who know what they are doing).

By Maximilian Seitzer on 2023-11-28T16:37:56 (imported from GitLab)