Closed mtanneau closed 4 months ago
My bad, this mostly got fixed in #37 (at least the easy part)
That being said, the extract job requests a total memory, not a per-cpu memory.
The inconsistency is annoying but IMO it makes sense to think about total memory for extract and per-cpu memory for sampler.
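For reference, the two SLURM directives in question (example values only; shown side by side for contrast, a real script would use one or the other):

```shell
#SBATCH --mem=64G           # extract: total memory for the whole job
#SBATCH --mem-per-cpu=8G    # sampler: memory per allocated CPU
```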
It would be nice to use the streaming thing you were talking about. Not sure how to do this in HDF5.jl.
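Something along these lines might work: a sketch (not OPFGenerator code) that pre-allocates a chunked dataset with HDF5.jl and writes one sample at a time, instead of materializing everything in memory first. The `"pg"` dataset name and the `(nbus, nsamples)` layout are illustrative assumptions.

```julia
using HDF5

# Hypothetical streaming writer: `samples` is any iterable of length-`nbus`
# Float64 vectors, e.g. read lazily from per-sample result files.
function stream_to_h5(samples, out_path; nbus::Int, nsamples::Int)
    h5open(out_path, "w") do f
        # Pre-allocate a chunked dataset so each column can be written independently.
        dset = create_dataset(f, "pg", Float64, (nbus, nsamples); chunk=(nbus, 1))
        for (j, pg) in enumerate(samples)
            dset[:, j] = pg   # only one sample resides in memory at a time
        end
    end
end
```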
Agreed, it makes more sense to have a total memory requirement for extract and per-cpu for sampler.
What threw me off is that the sbatch file extract.sbatch had a "memory per processor" comment. That came from the template here:
I blindly copy-pasted the default config and ran into memory issues when I hit 300 buses. I think we can increase the default value to 64GB or 128GB here: https://github.com/AI4OPT/OPFGenerator/blob/e3efb8a10a9493daa446c3ddf4f532171b05b67c/slurm/submit_jobs.jl#L19
... and have a user-friendly check when we run submit_jobs.jl. For instance, using some basic estimate like {number of buses} * {number of samples} * {number of OPF formulations}, we could throw a warning if we think the memory requirement is too low.
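Such a check could look roughly like this (a sketch only: the bytes-per-value constant is a guessed fudge factor, and none of these names exist in submit_jobs.jl):

```julia
# Hypothetical pre-submission check: estimate the memory footprint of the
# extract job and warn if the requested allocation looks too small.
function check_extract_memory(nbus, nsamples, nformulations, requested_gb)
    bytes_per_value = 16   # assumed: Float64 values plus overhead/slack
    est_gb = nbus * nsamples * nformulations * bytes_per_value / 1024^3
    if est_gb > requested_gb
        @warn "Extract job memory may be too low" estimated_GB = round(est_gb, digits = 1) requested_GB = requested_gb
    end
    return est_gb
end
```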
We currently set a default memory of 16GB for the extract job. That's not enough for large datasets, because the current post-processing code loads everything into memory, then saves a single HDF5 file.

Good solution: make that code more memory efficient 😎
Easy solution: increase the memory requirement to ~168GB