ginkgo-project / ginkgo

Numerical linear algebra software package
https://ginkgo-project.github.io/
BSD 3-Clause "New" or "Revised" License
398 stars 86 forks source link

Data validation and invariants #746

Open upsj opened 3 years ago

upsj commented 3 years ago

We had some discussions about specifying and validating data structure invariants, so I'll list them here:

Matrix formats

Solvers

Factorizations

Preconditioners

greole commented 3 years ago

After discussing this with @upsj, here is how I would like to approach this issue:

  1. A new set of files validation_helpers.{cpp,hpp} inside core/components will be created.
  2. These will contain several validation functions e.g. is_symmetric, is_row_ordered etc which take matrix data and/or indices via array pointers.
  3. Matrices get a new member function validate_data() which makes use of the relevant validation helpers asserts to ensure correctness and throws an error otherwise.
  4. The latter can be switched on/off for debug builds.
upsj commented 3 years ago

I believe we didn't talk about the actual error reporting so far, I could imagine multiple approaches:

  1. validate_data() doesn't return anything and throws an exception with information on which part of the data is invalid
  2. validate_data() doesn't return anything and causes an assertion to fail, which is akin to calling std::abort()
  3. validate_data() returns something like a tagged union of either a success tag or a failure tag with additional information.

I would favor 1 or 3, since 2 is very limited in what kind of information can be provided. 1. is probably easiest to implement, since it relies on the possibility to nest exceptions for nesting validation failures later on. Kind of like the distinction between exception-type (C++) and optional-type (Rust) error handling.

I think we don't necessarily need 4., since this would otherwise require users to rebuild all of Ginkgo in Debug just to be able to figure out which part of their code is failing. Unless of course we call validate_data() in all apply implementations, then I would agree to disable this in Debug, but still allow users to call it manually in Release.

lahwaacz commented 2 years ago

Do you require all column indexes within each matrix row to be sorted/ascending or can they be in arbitrary order? I don't see this in the list above. IIRC, Hypre's AMG behaves differently for each ordering (and they require the diagonal entry to be stored first in the row).

MarcelKoch commented 2 years ago

For our algorithms that require the column indices to be sorted, there is usually a factory parameter called skip_sorting that can be used to signal that the input matrix is already sorted. Otherwise, we sort the matrix correctly. Of course, simple things like SpMV still work with unsorted indices, but also our AMGx requires sorting. Also, when you fill a matrix from matrix_data with the read function, you have to make sure that the input data is also sorted correctly.