Open andybroth opened 3 months ago
As a workaround for now, update the default slurm directory in launch_parallel_jobs_function.py to match whatever you set in launch_parallel_jobs.py.
shutil.copyfile(baseconfig_filename, os.path.join(slurm_dir,f'.yaml'))
Line 26 in launch_parallel_jobs_function.py
I think this line is the issue and can maybe just be commented out @HiroFarre?
Because MDS crashes, there was a fix to check for an MDS crash and go on to the next shot. When this happens, it relaunches but uses the default slurm directory option in submit_single_run() in launch_parallel_jobs_function.py. This default is currently not updated to match the slurm directory used when you run the launch_parallel_jobs.py script. Because of this bug, the relaunch crashes when MDS fails, since the default slurm directory doesn't exist.
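One possible longer-term fix is to pass the slurm directory through explicitly on relaunch instead of falling back to a default. The sketch below is only an illustration: the real signature of submit_single_run() is not shown in this issue, so the parameters here (shot, baseconfig_filename, slurm_dir), the per-shot config name, and the helper relaunch_after_mds_crash() are all hypothetical stand-ins.

```python
import os
import shutil


def submit_single_run(shot, baseconfig_filename, slurm_dir):
    """Hypothetical stand-in for the real submit_single_run().

    slurm_dir is a required argument with no default, so a relaunch
    cannot silently fall back to a directory that does not exist.
    """
    # Create the directory if it is missing instead of crashing on the copy.
    os.makedirs(slurm_dir, exist_ok=True)
    # Assumed naming scheme for the per-shot config; the real one may differ.
    dest = os.path.join(slurm_dir, f"{shot}.yaml")
    shutil.copyfile(baseconfig_filename, dest)
    return dest


def relaunch_after_mds_crash(shots, failed_index, baseconfig_filename, slurm_dir):
    """Hypothetical helper: skip the shot that hit an MDS crash and submit
    the next one, reusing the caller's slurm_dir."""
    next_index = failed_index + 1
    if next_index >= len(shots):
        return None  # no shots left to run
    return submit_single_run(shots[next_index], baseconfig_filename, slurm_dir)
```

With this shape, launch_parallel_jobs.py would hand its own slurm directory to every call, including the relaunch after an MDS crash, so the two files could no longer drift out of sync.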