eth-cscs / DLA-Future

DLA-Future
https://eth-cscs.github.io/DLA-Future/master/
BSD 3-Clause "New" or "Revised" License

Quick return with identity Householder transformation if there are no off-band elements to annihilate #980

Closed RMeli closed 1 year ago

RMeli commented 1 year ago

If the input matrix is banded with a band size smaller than the target band size of the reduction, NaNs are produced by a division by zero when computing tau: https://github.com/eth-cscs/DLA-Future/blob/1fd315b966e97326da34ce0f3f1af43fd8f8a996/include/dlaf/eigensolver/reduction_to_band/impl.h#L102-L105

This PR introduces an early return when x0_and_squares[1] == 0: tau = 0 is returned (an identity Householder transformation), which avoids the division by zero. Fixes #974. Thanks @albestro for the help in identifying the issue.
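
For readers unfamiliar with the failure mode, here is a minimal sketch of a Householder reflector computation with the quick return. It is not the DLA-Future implementation: it assumes real double-precision data stored in a plain `std::vector`, and `Reflector` and `compute_reflector` are hypothetical names used only for illustration. The sum of squares plays the role of `x0_and_squares[1]`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not a DLA-Future type).
struct Reflector {
  std::vector<double> v;  // Householder vector, normalized so that v[0] == 1
  double tau;             // scaling factor in H = I - tau * v * v^T
  double beta;            // value left in the head position: H * x = (beta, 0, ..., 0)
};

// Computes a reflector that annihilates every element of x except the first.
Reflector compute_reflector(const std::vector<double>& x) {
  assert(!x.empty());
  const double x0 = x[0];

  // Squared 2-norm of the whole column, head element included.
  double squares = 0.0;
  for (double xi : x)
    squares += xi * xi;

  // Start from the identity transformation: tau = 0, v = e_1, head unchanged.
  Reflector r{std::vector<double>(x.size(), 0.0), 0.0, x0};
  r.v[0] = 1.0;

  // Quick return: the column is entirely zero, so there is nothing to
  // annihilate. Without this guard, y below would be 0 and tau = 0 / 0 = NaN.
  if (squares == 0.0)
    return r;

  const double norm = std::sqrt(squares);
  // Give y the opposite sign of x0 so that (x0 - y) cannot cancel to zero.
  const double y = std::signbit(x0) ? norm : -norm;

  r.tau = (y - x0) / y;
  r.beta = y;
  for (std::size_t i = 1; i < x.size(); ++i)
    r.v[i] = x[i] / (x0 - y);
  return r;
}
```

Under these assumptions, an all-zero column yields `tau == 0` instead of NaN, while a non-zero column such as `{3.0, 4.0}` goes through the usual formula.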


With this PR, the situation in CP2K is the following (using the DLA-Future eigensolver for every matrix of size 2x2 or larger):

------------------------------- Summary --------------------------------
Number of FAILED  tests 2
Number of WRONG   tests 0
Number of CORRECT tests 2935
Total number of   tests 2937

Summary: correct: 2935 / 2937; failed: 2; 28min
Status: FAILED

*************************** Testing ended ******************************

All regression tests that were still returning NaNs (see #974) now pass. Of the two remaining failures, one has been fixed in CP2K (missing DLAF/pika initialization when using CP2K as a library), while the other also fails with ScaLAPACK.

RMeli commented 1 year ago

cscs-ci run

RMeli commented 1 year ago

Current CI failures should be fixed by #983.

RMeli commented 1 year ago

Issue reproduced in CI.
