Excessive memory usage during compilation with pip

lab-cosmo / sphericart

Multi-language library for the calculation of spherical harmonics in Cartesian coordinates

https://sphericart.readthedocs.io/en/latest/

MIT License

Excessive memory usage during compilation with pip #115

Closed sirmarcel closed 4 months ago

sirmarcel commented 7 months ago

Currently, attempting to build the sphericart-torch wheel with pip requires a large amount of RAM if many CPU cores are present. I think this is due to this line, which invokes cmake without specifying the number of jobs, so it presumably defaults to the total number of cores. On an HPC system that can be 40 or 80, so the compilation tends to get killed by the host OS.

While this is not catastrophic, it is inconvenient, and in many cases a waste of resources (the compilation is not much faster in parallel mode). I would suggest picking a reasonable default number of jobs instead, or disabling parallel builds entirely. Alternatively, the installation docs should at least mention this fact (see #116).
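
To make the suggestion concrete, here is a minimal sketch of what a capped default could look like. It is not the actual sphericart setup.py: the helper name `cmake_build_command` and the cap of 4 jobs are hypothetical and chosen only for illustration. The only thing it relies on is that `cmake --build <dir> --parallel <jobs>` accepts an explicit job count (CMake 3.12 or newer).

```python
import os


def cmake_build_command(build_dir, jobs=None, max_default_jobs=4):
    """Return an argv list for `cmake --build` with an explicit job count.

    Hypothetical helper: `max_default_jobs=4` is an arbitrary cap used for
    illustration, not a value taken from sphericart.
    """
    if jobs is None:
        # Cap the default instead of letting the build tool use every core.
        jobs = min(os.cpu_count() or 1, max_default_jobs)
    jobs = max(1, int(jobs))  # never pass 0 or a negative job count
    return ["cmake", "--build", build_dir, "--parallel", str(jobs)]


if __name__ == "__main__":
    print(" ".join(cmake_build_command("build")))
```

The returned list can be handed to `subprocess.run(..., check=True)`. Passing `jobs=1` effectively disables parallel compilation.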

nickjbrowning commented 7 months ago

Thanks for the find, this is a very good point. I'll address this in a PR tomorrow.

sirmarcel commented 7 months ago

Thanks @nickjbrowning !

Luthaf commented 7 months ago

One thing I don't understand here is that we don't have that many files to compile, so make -j and make -j8 should have the same behavior (launch ~8 compilation jobs).

sirmarcel commented 7 months ago

It's a bit suspicious. My observation is: (a) compilation dies with kill on the default allocation on izar (4GB I believe), (b) if you remove --parallel from the setup.py file of sphericart-torch, it works without problem, (c) requesting a node with 32GB also works, without modification.

Luthaf commented 7 months ago

Oh, right. I can see the compiler requiring a couple of GiB per file (there are a lot of torch headers to parse and templates to instantiate), so parallel compilation would fail with only 4GiB of available RAM. But then the change by @nickjbrowning would not fix it here, since the compilation would also fail with only 8 jobs.
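
The arithmetic behind this can be sketched as follows. The figure of 2 GiB per compilation job is an assumption standing in for "a couple of GiB"; it was not measured, and both function names are hypothetical.

```python
def peak_memory_gib(jobs, gib_per_job=2.0):
    """Rough peak RAM if `jobs` compiler processes run at the same time."""
    return jobs * gib_per_job


def max_jobs_for_memory(available_gib, gib_per_job=2.0):
    """Largest job count whose estimated peak fits in `available_gib`.

    Always returns at least 1, since a serial build is the minimum.
    """
    if gib_per_job <= 0:
        raise ValueError("gib_per_job must be positive")
    return max(1, int(available_gib // gib_per_job))


if __name__ == "__main__":
    # Under the assumed 2 GiB per job, 8 jobs need far more than 4 GiB.
    print(peak_memory_gib(8), max_jobs_for_memory(4), max_jobs_for_memory(32))
```

Under that assumption, 8 parallel jobs would need roughly 16 GiB, which is consistent with the build being killed on a 4GB allocation and succeeding on a 32GB node.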

nickjbrowning commented 4 months ago

I've added these two environment variables to the build process:

SPHERICART_PARALLEL_BUILD=ON
SPHERICART_JOBS=NJOBS

So you can now control the number of build jobs via:

SPHERICART_PARALLEL_BUILD=OFF pip install .[torch]  # disables parallel builds
SPHERICART_JOBS=4 pip install .[torch]  # uses 4 jobs for compilation
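
For readers wondering how a setup.py might turn these two variables into cmake arguments, here is a rough sketch. Only the variable names `SPHERICART_PARALLEL_BUILD` and `SPHERICART_JOBS` come from this thread. The function name, the default of `ON`, the accepted spellings of "off", and the precedence between the two variables are all assumptions, and the real implementation may differ.

```python
import os


def cmake_parallel_args(env=None):
    """Translate the build environment variables into `cmake --build` flags.

    Assumptions (not confirmed by the sphericart sources): parallel builds
    default to ON, and SPHERICART_PARALLEL_BUILD=OFF wins over SPHERICART_JOBS.
    """
    env = os.environ if env is None else env

    parallel = env.get("SPHERICART_PARALLEL_BUILD", "ON").strip().upper()
    if parallel in ("OFF", "0", "FALSE", "NO"):
        return []  # no --parallel flag: let the build tool use its serial default

    jobs = env.get("SPHERICART_JOBS", "").strip()
    if jobs:
        n = int(jobs)  # raises ValueError for non-numeric input
        if n < 1:
            raise ValueError("SPHERICART_JOBS must be a positive integer")
        return ["--parallel", str(n)]

    # Parallel build requested without an explicit job count.
    return ["--parallel"]


if __name__ == "__main__":
    print(["cmake", "--build", "build"] + cmake_parallel_args())
```

Taking the environment as a parameter keeps the function easy to test without touching `os.environ`.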