ralna / spral

Sparse Parallel Robust Algorithms Library
https://ralna.github.io/spral/
Other
102 stars 27 forks source link

Fix memory fault during scaling of singular matrix #205

Open chrhansk opened 1 month ago

chrhansk commented 1 month ago

Should fix #200

jfowkes commented 1 month ago

@chrhansk unfortunately this change seems to break the main SSIDS test on all platforms.

mjacobse commented 1 month ago

Function hungarian_match is not only called from the scaling API, but also for the matching-based METIS ordering of SSIDS. Function mo_match in match_order.f90 expects unmatched entries to be signaled by negative values: https://github.com/ralna/spral/blob/077bd63ad308cca5be8a5e1f937457ebd6097000/src/match_order.f90#L571

So I would suggest not to change the behaviour of hungarian_match, but instead fix how the unsymmetric scaling code deals with what hungarian_match returns.

Alternatively one could tackle https://github.com/ralna/spral/blob/077bd63ad308cca5be8a5e1f937457ebd6097000/src/match_order.f90#L10-L13 but that might require quite a bit of refactoring?

chrhansk commented 1 month ago

I was not aware of the problem. I tries to zero out the problematic entries manually in the postprocessing function. There still seems to be a problem in ssmfe_ciface_test though. Problem is that I cannot reproduce the error on my system (tests pass without any memory issues). Do you have any ideas?

jfowkes commented 1 month ago

Many thanks @chrhansk, the intermittent SSMFE C test failure is #204 (nothing to do with your changes) which annoyingly I also cannot reproduce on my system making it very difficult to debug and fix. @mjacobse could you review?

chrhansk commented 1 month ago

Looking back at this: In particular in regards to #205: What is the proposed solution that you are converging on? As I understand https://github.com/ralna/spral/issues/200#issuecomment-2178124090 The lines setting the unmatched rows to negative values should be commented out, which to me implies that the values of unmatched rows should be set to zero instead. Am I correct in this regard?

Such a change would necessitate corresponding changes in the implementation of SSIDS, as mentioned here: https://github.com/ralna/spral/pull/205#issuecomment-2169174578

Should I make those changes to the scaling and within SSIDS?

jfowkes commented 1 month ago

Yes we have come to the conclusion that whilst negative row indices (to signal which of the rows are unmatched) make sense for square matrices, this does not make sense for general rectangular matrices. As far as we can tell SSIDS does not make use of the negative row indices themselves, but merely checks if a row is unmatched, so in theory changing the values of unmatched rows to zero should work fine provided we update SSIDS to check for zero rather than negative values.

chrhansk commented 1 month ago

I adjusted the scaling accordingly and added a test for the symmetric singular case. There is however still the problematic part here mentioned in #200. In this case it causes a segfault (if the match array has a size of m < n), so this needs to be addressed.

jfowkes commented 1 month ago

Many thanks @chrhansk, the suggestion has been to comment this problematic section out as it should no longer be required if we now return zero for unmatched entries. I guess it's a case of trying that and seeing if anything breaks?

mjacobse commented 1 month ago

Basically the suggestion was this https://github.com/mjacobse/spral/commit/b815fac89a4ca6bcf5ac8a5801cb4efb1a087fad and indeed, it seems to be all that's needed to fix all issues at once. Personally I would like to see randomly generated singular tests to confirm, since all the random tests right now are nonsingular.

jfowkes commented 4 weeks ago

Indeed many thanks! @chrhansk I suggest we apply https://github.com/mjacobse/spral/commit/b815fac89a4ca6bcf5ac8a5801cb4efb1a087fad to this PR and add a randomly generated singular matrix test to verify that things don't break. We think this should be all that is required now.

chrhansk commented 4 weeks ago

Great, I appreciate your effort.