easybuilders / easybuild-easyblocks

Collection of easyblocks that implement support for building and installing software with EasyBuild.
https://easybuild.io
GNU General Public License v2.0

LAPACK tests for OpenBLAS not controlled by maxparallel #3441

Open dithwick opened 2 months ago

dithwick commented 2 months ago

Hi,

While building OpenBLAS-0.3.24-GCC-13.2.0.eb on an AMD 7742 (hyperthreading enabled, so 256 logical cores), I noticed that the LAPACK tests were taking a very long time to run (more than a day) because each test started one thread per logical core. For example, for the xeigtsts test I observed:

$ ps -ef | grep 145558
username  145558  145557 99 16:31 pts/3    00:14:37 /dev/shm/username/build/OpenBLAS/0.3.24/GCC-13.2.0/OpenBLAS-0.3.24/lapack-netlib/TESTING/EIG/xeigtsts

$ ps -o nlwp 145558
NLWP
 256
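
For anyone who wants to check this from a script rather than with ps, here is a minimal sketch that reads the same thread count from the `Threads:` field of `/proc/<pid>/status` on Linux. The helper names `parse_thread_count` and `thread_count` are made up for illustration and are not part of EasyBuild.

```python
def parse_thread_count(status_text):
    """Extract the thread count from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        # The kernel reports the number of threads as "Threads:\t<n>".
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no 'Threads:' field found")


def thread_count(pid):
    """Return the number of threads of a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_thread_count(handle.read())
```

Calling `thread_count(145558)` while the test is running should report the same number as the NLWP column above.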

I'm aware of the framework change that sets a default for maxparallel (https://github.com/easybuilders/easybuild-framework/pull/4606), but setting maxparallel manually in the easyconfig file did not control the number of threads used by the tests. In the end, adding the following to the easyconfig worked:

pretestopts = 'export OMP_NUM_THREADS=16 &&'

With this setting, rerunning the build completed the tests in about 5 minutes instead of many hours.

Rather than adding this to all of the easyconfig files, we were wondering whether this could be handled in the easyblock instead, so that the thread count gets a default and/or is controlled by maxparallel, for example by editing the line https://github.com/easybuilders/easybuild-easyblocks/blob/485a195d324e51a213762ab3fd0bc492ad70cfcf/easybuild/easyblocks/o/openblas.py#L141 somehow (see the conversation on Slack)?
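
To make the idea concrete, here is a rough, standalone sketch of the logic I have in mind. It is not the actual easyblock code: `thread_cap` and `build_test_cmd` are hypothetical helpers, and I am assuming the easyblock composes the test command from pretestopts, a make target and testopts. The default export is placed before pretestopts so that an explicit OMP_NUM_THREADS set in an easyconfig still takes precedence.

```python
import os


def thread_cap(parallel, available=None):
    """Pick a thread count for the tests: the parallelism level, capped at the core count."""
    if available is None:
        available = os.cpu_count() or 1
    if parallel is None or parallel < 1:
        # No usable parallelism level: fall back to all available cores.
        return available
    return min(parallel, available)


def build_test_cmd(pretestopts, runtest, testopts, parallel, available=None):
    """Compose the test command with a default OMP_NUM_THREADS (hypothetical helper)."""
    default_export = "export OMP_NUM_THREADS=%d &&" % thread_cap(parallel, available)
    # A later export in pretestopts overrides the default set here.
    parts = [default_export, pretestopts.strip(), "make", runtest, testopts.strip()]
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    print(build_test_cmd("", "test", "", parallel=16, available=256))
```

With a parallelism level of 16 on the 256-core node this would produce `export OMP_NUM_THREADS=16 && make test`, which matches what the manual pretestopts workaround does.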

Thanks