easybuilders / easybuild

EasyBuild - building software with ease
http://easybuild.io
GNU General Public License v2.0

OpenBLAS build fails on Sapphire Rapids due to "Too many LAPACK tests failed" #880

Closed by robert-mijakovic 11 months ago

robert-mijakovic commented 11 months ago

Hi guys,

while building OpenBLAS on a Sapphire Rapids machine running Ubuntu 22.04 LTS, the build failed with "Too many LAPACK tests failed due to non-numerical errors: 55 (> 0)".

                        -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error         other error
================        ===========     =================       ================
REAL                    1328283         0       (0.000%)        0       (0.000%)
DOUBLE PRECISION        1325997         11      (0.001%)        0       (0.000%)
COMPLEX                 760371          160     (0.021%)        55      (0.007%)
COMPLEX16               771518          48      (0.006%)        0       (0.000%)

--> ALL PRECISIONS      4186169         219     (0.005%)        55      (0.001%)

Are we too restrictive on tolerance? I will look further to see whether recent OpenBLAS patches address this; so far, I haven't found anything. For the moment, I will stick to --ignore-test-failure when installing OpenBLAS 0.3.24 on Sapphire Rapids.
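
To make the tolerance question concrete, here is a minimal Python sketch that parses the per-precision rows of the summary above and compares the totals against configurable limits. It is not EasyBuild's actual check: the function names `parse_summary` and `check_totals` and the parameters `max_other` and `max_numerical` are hypothetical, and the only limit taken from the error message is the non-numerical threshold of 0 implied by "(> 0)". No limit on numerical errors is assumed.

```python
import re

# Per-precision rows copied from the LAPACK testing summary in this report.
SUMMARY = """\
REAL                    1328283         0       (0.000%)        0       (0.000%)
DOUBLE PRECISION        1325997         11      (0.001%)        0       (0.000%)
COMPLEX                 760371          160     (0.021%)        55      (0.007%)
COMPLEX16               771518          48      (0.006%)        0       (0.000%)
"""

# Columns: precision name, tests run, numerical errors (pct), other errors (pct).
ROW = re.compile(
    r"^(\S.*?)\s+(\d+)\s+(\d+)\s+\([\d.]+%\)\s+(\d+)\s+\([\d.]+%\)\s*$"
)


def parse_summary(text):
    """Return {precision: (tests_run, numerical_errors, other_errors)}."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, run, num_err, other_err = match.groups()
            rows[name] = (int(run), int(num_err), int(other_err))
    return rows


def check_totals(rows, max_other=0, max_numerical=None):
    """Sum the error columns and report which (hypothetical) limits are exceeded.

    max_other=0 mirrors the "(> 0)" in the error message; max_numerical=None
    means no limit on numerical errors, which is an assumption of this sketch.
    """
    total_num = sum(r[1] for r in rows.values())
    total_other = sum(r[2] for r in rows.values())
    problems = []
    if max_numerical is not None and total_num > max_numerical:
        problems.append(
            "Too many LAPACK tests failed due to numerical errors: "
            f"{total_num} (> {max_numerical})"
        )
    if total_other > max_other:
        problems.append(
            "Too many LAPACK tests failed due to non-numerical errors: "
            f"{total_other} (> {max_other})"
        )
    return total_num, total_other, problems


rows = parse_summary(SUMMARY)
total_num, total_other, problems = check_totals(rows)
for problem in problems:
    print(problem)
```

With these inputs, only the non-numerical limit is exceeded, and all 55 of those failures come from the COMPLEX row. This suggests the failure points to a specific defect rather than to a tolerance that is too tight across the board.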

Best regards, Robert

boegel commented 11 months ago

@robert-mijakovic Can you check whether the changes in https://github.com/easybuilders/easybuild-easyconfigs/pull/19159 are sufficient to fix this problem?

cc @Flamefire

Flamefire commented 11 months ago

I assume so, because this issue is an exact duplicate of https://github.com/easybuilders/easybuild-easyconfigs/pull/19021, which is fixed by https://github.com/easybuilders/easybuild-easyconfigs/pull/19159. I verified that fix on our new Sapphire Rapids machine.

robert-mijakovic commented 11 months ago

I managed to get access to the machine again and I'm rebuilding the toolchain. I will let you know the outcome as soon as the test finishes. However, as @Flamefire pointed out, I don't expect any issues in the testing phase, since this issue duplicates https://github.com/easybuilders/easybuild-easyconfigs/issues/19021, which is fixed by his https://github.com/easybuilders/easybuild-easyconfigs/pull/19159.

robert-mijakovic commented 11 months ago

I can confirm that https://github.com/easybuilders/easybuild-easyconfigs/pull/19159 fixes the issue.