tapios / risk-networks

Code for risk networks: a blend of compartmental models, graphs, data assimilation and semi-supervised learning

Fix ray memory & temp_dir issues #188

Closed dburov190 closed 3 years ago

dburov190 commented 3 years ago

This PR fixes two small issues that arise on the cluster when running in parallel mode.

There are two new arguments: --parallel-memory and --parallel-temp-dir.

I lost track of who uses which Slurm scripts, so I haven't updated any, but nudge me if you want something specific changed. A typical usage would be:

memory=$(( ${SLURM_MEM_PER_NODE} / 4 )) # or simply memory=4000000000
temp_dir="/central/home/your_name/temp_dir"
mkdir -p "${temp_dir}"

# add your other arguments alongside these two
srun python3 backward_forward_assimilation.py \
  --parallel-memory="${memory}" \
  --parallel-temp-dir="${temp_dir}"

That said, this should at least solve the memory issue out of the box; if the temp_dir issue persists, set the directory manually with --parallel-temp-dir.
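
One caveat on the memory value in the usage above: Slurm reports SLURM_MEM_PER_NODE in megabytes, while the literal 4000000000 suggests the flag takes bytes. If that reading is right (it is an assumption here, not something this PR states), the quarter share needs a unit conversion. A minimal sketch, where quarter_share_bytes and the 16000 MB fallback are hypothetical and not part of this PR:

```shell
#!/usr/bin/env bash
# Assumption: SLURM_MEM_PER_NODE is in megabytes and --parallel-memory expects bytes.

# Hypothetical helper (not part of this PR): a quarter of a megabyte count, in bytes.
quarter_share_bytes() {
  local mem_mb="$1"
  echo $(( mem_mb * 1024 * 1024 / 4 ))
}

# Illustrative fallback of 16000 MB when the variable is unset (e.g. outside Slurm).
memory="$(quarter_share_bytes "${SLURM_MEM_PER_NODE:-16000}")"
echo "memory=${memory}"
```

The resulting ${memory} can then be passed to --parallel-memory as in the usage above. If the flag turns out to take megabytes instead, drop the multiplication.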