ralna / spral

Sparse Parallel Robust Algorithms Library
https://ralna.github.io/spral/
Other
106 stars 26 forks source link

Fix num_factor and num_flops info after factorization #109

Closed mjacobse closed 1 year ago

mjacobse commented 1 year ago

According to the documentation, num_factor and num_flops should hold the values "without pivoting after analyse phase, with pivoting after factorize phase". The latter does not seem to work correctly:

Counting for the CPU code is taken directly from how it is done in the GPU code: https://github.com/ralna/spral/blob/353f104a892552450c882530c5ec424914488fb3/src/ssids/gpu/factor.f90#L1441-L1444 I thought about simplifying the loop a bit (e.g. I am not quite sure why the iteration is done backwards) or even implementing closed form solutions for these sums. I decided to mirror the exact implementation from the GPU code (which is also the same as in HSL_MA97) though, as no change seemed relevant enough to me (clang even likes to emit closed form solutions on its own).

I tested a few matrices to make sure that the values after analysis and factorization are the same in the posdef case. For the indef case I set nemin=8 and pivot_method=3 and compared to HSL_MA97 with default options, both using Metis 4 ordering. On some matrices I see identical values that way. On others the results are pretty close though not identical, but that is probably an expected differece in pivoting between the two solvers.

All these tests are on CPU only. That also means that I could not actually verify whether the issue of almost double the values being reported happens as described on GPU, without these changes. It is my best understanding of the code though.