LLNL / libROM

Model reduction library with an emphasis on large scale parallelism and linear subspace methods
https://www.librom.net

minor fix and a unit test for NNLSSolver #271

Closed dreamer2368 closed 7 months ago

dreamer2368 commented 7 months ago

minor fix/update for NNLSSolver

n_dist_loc_max fix

In NNLSSolver::solve_parallel_with_scalapack, n_dist_loc_max is now determined from n_tot rather than m, as originally intended. n_dist_loc_max only determines the memory size of the working arrays and does not affect the NNLS solution. Before this fix, the following working arrays were allocated with unnecessarily large sizes:

// The following matrices are stored in column-major format as Vectors
Vector mat_0_data(m * n_dist_loc_max, false);
Vector mat_qr_data(m * n_dist_loc_max, false);
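
To illustrate why such a bound changes only the allocation size, here is a minimal sketch that assumes the columns are dealt out round-robin in fixed-size blocks across ranks. The helper names, the block size, and the formula are hypothetical and are not taken from libROM; the actual expression for n_dist_loc_max is not shown in this description.

```cpp
#include <cstddef>

// Hypothetical helper (not libROM API): upper bound on how many of `n_items`
// columns a single rank can own when the columns are dealt out round-robin
// in blocks of `block_size` across `n_procs` ranks.
std::size_t max_local_count(std::size_t n_items,
                            std::size_t block_size,
                            std::size_t n_procs)
{
    // Total number of blocks, counting a trailing partial block as one.
    const std::size_t n_blocks = (n_items + block_size - 1) / block_size;
    // No rank receives more than ceil(n_blocks / n_procs) blocks.
    const std::size_t blocks_per_rank = (n_blocks + n_procs - 1) / n_procs;
    return blocks_per_rank * block_size;
}

// Number of entries in a column-major working array with `n_rows` rows and
// at most `local_bound` local columns.
std::size_t working_array_size(std::size_t n_rows, std::size_t local_bound)
{
    return n_rows * local_bound;
}
```

Under these assumptions, feeding a larger dimension than the one actually being distributed into `max_local_count` yields a larger bound, so the arrays are over-allocated, but the entries that the solver reads and writes stay the same.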

hard-coded limit for number of processors

Per issue #269, the number of processors that could be used for NNLSSolver was limited to 15. This limit is removed, and the change is exercised by the unit test below.

termination criterion for NNLSSolver

Previously, the NNLS iteration terminated when the $L_{\infty}$-norm (the maximum absolute value) of the residual vector fell below the threshold. NNLSSolver now also has an option to terminate when the $L_2$-norm of the residual vector falls below a corresponding threshold. This is set at initialization with the input argument NNLS_termination criterion.
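
The difference between the two criteria can be sketched as follows. This is an illustration only: the enum `TerminationSketch`, its enumerators, and the function names are hypothetical stand-ins, not the library's actual declarations.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the termination option described above.
enum class TerminationSketch { LINF, L2 };

// Norm of the residual vector under the chosen criterion.
double residual_norm(const std::vector<double>& residual,
                     TerminationSketch criterion)
{
    double acc = 0.0;
    for (double v : residual) {
        if (criterion == TerminationSketch::LINF) {
            acc = std::max(acc, std::abs(v));   // largest absolute entry
        } else {
            acc += v * v;                       // sum of squares
        }
    }
    return criterion == TerminationSketch::LINF ? acc : std::sqrt(acc);
}

// True when the residual is small enough to stop iterating.
bool has_converged(const std::vector<double>& residual,
                   double tol,
                   TerminationSketch criterion)
{
    return residual_norm(residual, criterion) < tol;
}
```

Because the $L_2$-norm is never smaller than the $L_{\infty}$-norm of the same vector, the same numerical threshold is stricter under the $L_2$ criterion, which is presumably why a separate, corresponding threshold is used.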

unit test for NNLSSolver

A unit test for NNLSSolver::solve_parallel_with_scalapack is added. It checks that the solution approximates the system to within the desired tolerance, both in serial and in parallel.
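
The kind of check such a test performs might look like the sketch below. It assumes an entrywise absolute tolerance and plain `std::vector` storage; the function name and the exact acceptance rule are hypothetical and may differ from the test added in this PR.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical check (not the PR's test code): returns true when x is
// entrywise non-negative and every entry of A*x lies within `tol` of b.
// A is stored as a vector of rows.
bool solution_is_acceptable(const std::vector<std::vector<double>>& A,
                            const std::vector<double>& x,
                            const std::vector<double>& b,
                            double tol)
{
    // NNLS solutions must satisfy the non-negativity constraint.
    for (double xi : x) {
        if (xi < 0.0) return false;
    }
    // Compare each row of A*x against the corresponding entry of b.
    for (std::size_t i = 0; i < A.size(); ++i) {
        double ax = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            ax += A[i][j] * x[j];
        }
        if (std::abs(ax - b[i]) > tol) return false;
    }
    return true;
}
```

Running the same check on every rank after a parallel solve is one way to confirm that the serial and parallel paths agree to within the tolerance.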

On LC quartz, the scaling test results for a 300x1000 system are:

| Number of processors | Time (ms) |
| --- | --- |
| 1 | 908 |
| 2 | 769 |
| 4 | 563 |
| 8 | 469 |
| 12 | 463 |
| 15 | 456 |

Another test with a larger system (500x15000):

| Number of processors | Time (ms) |
| --- | --- |
| 1 | 24216 |
| 2 | 13287 |
| 4 | 7843 |
| 8 | 5091 |
| 12 | 4162 |
| 15 | 3666 |

These timings include the entire setup of the matrix-vector system in addition to the call to solve_parallel_with_scalapack.