NVIDIA / TorchFort

An Online Deep Learning Interface for HPC programs on NVIDIA GPUs
https://nvidia.github.io/TorchFort/

docker build uses large amount of memory when running with more than 4 cores #9

Closed. TomMelt closed this issue 11 months ago

TomMelt commented 1 year ago

The following line in docker/Dockerfile allows make to build with all available cores.

https://github.com/NVIDIA/TorchFort/blob/e06613d6feccc3d11c166f146abce7abdd85f1b3/docker/Dockerfile#L54

My laptop has 12 cores. I ran the Docker build inside a virtual machine with 8 cores and 8 GB RAM, and during the make stage the build quickly consumed all available memory and crashed.

I would suggest either setting a limit in the Dockerfile or advising users to build serially (make install) or to override the default with something sensible like make -j 4, although what counts as sensible depends on the available RAM.
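
As a rough sketch of what "sensible" could mean, the job count can be capped by both the core count and the available memory. The helper name pick_jobs and the budget of 2 GB per compile job are assumptions made for illustration, not values taken from TorchFort or measured on its build. With that budget, the 8-core, 8 GB virtual machine described above would end up at 4 jobs.

```shell
#!/bin/sh
# Hypothetical helper: pick a make job count limited by cores AND memory.
# The default of 2 GB per job is an assumed rule of thumb; tune it per project.
pick_jobs() {
    cores=$1                # number of CPU cores available
    mem_gb=$2               # memory available, in whole GB
    gb_per_job=${3:-2}      # assumed memory budget per compile job

    by_mem=$((mem_gb / gb_per_job))   # how many jobs fit in memory
    jobs=$cores
    if [ "$by_mem" -lt "$jobs" ]; then
        jobs=$by_mem                  # memory is the tighter limit
    fi
    if [ "$jobs" -lt 1 ]; then
        jobs=1                        # always run at least one job
    fi
    echo "$jobs"
}

# Possible use on a Linux host (left commented out so sourcing has no side effects):
# cores=$(nproc)
# mem_gb=$(awk '/MemTotal/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
# make -j "$(pick_jobs "$cores" "$mem_gb")" install
```

Taking the minimum of the two limits keeps the build parallel on machines with plenty of RAM while falling back towards a serial build when memory is tight.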