ORNL / ReSolve

Library of GPU-resident linear solvers
Other
51 stars 2 forks source link

`Sparse::updateData` reads nnz from `nnz_expanded_` if the `is_expanded_` flag is set #176

Open superwhiskers opened 2 months ago

superwhiskers commented 2 months ago

when determining size of the buffers to copy data to, Sparse::updateData reads the number of nonzero values within the matrix from the nnz_expanded_ field if the matrix is marked as expanded, when it should be using nnz_ in all cases, for if the matrix is expanded, nnz_ should be set to nnz_expanded_ anyway. this is confusing behavior, and that branch should be removed in all copies of the Sparse::updateData source to remediate it, which should not fail on correctly behaving code. this branch is present in all copies of this method

superwhiskers commented 2 months ago

this issue is also related to the klu_klu_test failure still present in #175 and #174, as the failing assertion results from an assumption that nnz_ contains the used portion of the allocated space regardless of the value of the is_expanded_. the assumption of the old coo2csr and of the Csr::updateFromCoo method is that nnz_ always contains the number of indices needed by the unexpanded matrix (or the matrix as-is if not symmetric) and that nnz_expanded_ is used if the matrix is symmetric. the old coo2csr will update these fields with the used value counts after deduplication of the main diagonal, which does introduce a potential test failure with unmodified code under the edge case in which the second input matrix used to make the csr matrix has duplicate values on the main diagonal

the newer code in #175 and #174 instead treats nnz_ as always representing the number of indices needed by the matrix in its current state, and ignores nnz_expanded_. this is the cause of the test failure. to fix this issue we'd need to decide upon a convention for these fields that all code must be updated to uphold

my proposal for this is the following:

the rationale is the following: it's more efficient to have a guaranteed single source of truth for the current number of used indices than it is to have two separate locations that you need to branch on to get the number of used indices. additionally, this is the usual case anyway, it doesn't seem to me that most code will care about the number of indices used in the smallest possible representation (the effective behavior of the current code with respect to nnz_) storing the unexpanded size (if there is one) makes this efficient to access should code care about this, but i doubt it's even necessary to do this

one variant of this proposal is to instead merge the functionality of is_expanded_ and nnz_unexpanded_ into a single field which is some sentinel value, such as 0 or -1 if the matrix is not expanded and the number of indices used if unexpanded otherwise. i'm not too sure how i feel about this option, but it's an alternative that should be considered at the least

pelesh commented 11 hours ago

CC @kswirydo