Closed mgates3 closed 1 year ago
One bug that was discovered and fixed is that gbtrf and hetrf were calling internal::getrf_panel with pivot_threshold = max_panel_threads, max_panel_threads = priority_1, and priority = 0 (default value), due to pivot_threshold being added as an argument. The default values were removed to avoid this. Code change in gbtrf:
internal::getrf_panel<Target::HostTask>(
- A.sub(k, i_end-1, k, k), diag_len, ib, pivots.at(k),
- max_panel_threads, priority_1 );
+ A.sub(k, i_end-1, k, k), diag_len, ib, pivots.at(k),
+ pivot_threshold, max_panel_threads, priority_1, tag_0, &info );
(I moved pivots.at(k) up a line for clarity here.)
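The argument shift described above can be reproduced in isolation. The following is a minimal sketch, not SLATE's code: panel_old, panel_with_defaults, panel_strict, stale_call, and PanelArgs are hypothetical stand-ins, and the real internal::getrf_panel takes more arguments. It only illustrates how a call written for the old signature keeps compiling after a parameter is inserted ahead of a defaulted one.

```cpp
#include <cassert>

// Hypothetical record of what the panel routine ends up receiving.
struct PanelArgs {
    double pivot_threshold;
    int    max_panel_threads;
    int    priority;
};

// Old signature (assumed shape): priority has a default value.
PanelArgs panel_old( int max_panel_threads, int priority = 0 )
{
    return { 1.0, max_panel_threads, priority };
}

// New signature with pivot_threshold inserted first and the default kept.
PanelArgs panel_with_defaults(
    double pivot_threshold, int max_panel_threads, int priority = 0 )
{
    return { pivot_threshold, max_panel_threads, priority };
}

// New signature without defaults: a two-argument call no longer compiles.
PanelArgs panel_strict(
    double pivot_threshold, int max_panel_threads, int priority )
{
    return { pivot_threshold, max_panel_threads, priority };
}

// A call site that was written for panel_old and never updated.
PanelArgs stale_call( int max_panel_threads, int priority_1 )
{
    // Compiles silently: int converts to double, so every argument
    // shifts one slot and priority falls back to its default of 0.
    return panel_with_defaults( max_panel_threads, priority_1 );

    // With the defaults removed, the same call is a compile error:
    // return panel_strict( max_panel_threads, priority_1 );
}
```

With max_panel_threads = 4 and priority_1 = 1, stale_call yields pivot_threshold = 4, max_panel_threads = 1, and priority = 0, which is the pattern reported for gbtrf and hetrf. Removing the defaults turns the stale call into a compile-time error instead.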
I was able to un-stick the deadlock by calling internal::getrf_panel in hetrf with max_panel_threads=1. It looks like the variable shift issue with the pivot threshold made it so that the master branch is setting max_panel_threads=priority_1=1. (See this minor change which results in a successful CI run.)
But I'm not sure why using multiple threads in hetrf's panel causes a deadlock. Presumably, it didn't before threshold pivoting was added and messed up what was being passed in for max_panel_threads.
The gpu_nvidia error was out of memory, in function stream_create, most likely an error on the CI machine due to a user allocating the whole GPU. Needs rerun.
...
Passing now.
[Depends on #115]
Adds info error handling to LU and Aasen symmetric indefinite factorization and solves. Abbreviated output [outdated]:
Currently, one inconsistency is that zerocolN takes a 0-based index N in [ 0, n-1 ], while the returned info is a 1-based index in [ 1, n ]. info = 0 is generally considered to mean "no error". For instance, above, rand_zerocol13 has info = 14. I guess if info != 0, then it should be marked as "pass", since the routine is correctly catching the singularity. Also, the tester should inspect U and verify that U( i, i ) == 0. [Updated]
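The check suggested here could look roughly like the sketch below. This is not the tester's actual code: expected_info and singular_case_passes are hypothetical names, U is assumed to be a dense n-by-n column-major array, and the zeroed column is assumed to produce the first zero pivot, as in the rand_zerocol13 example where info = 14.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Map the 0-based column index N of zerocolN to the 1-based info
// value expected from the factorization (hypothetical helper).
int64_t expected_info( int64_t zero_col )
{
    return zero_col + 1;
}

// Return true if a deliberately singular test case should be marked "pass":
// info must name the zeroed column, and the matching diagonal entry
// of U must be exactly zero. U is n-by-n, column-major (an assumption).
bool singular_case_passes(
    int64_t zero_col, int64_t info,
    std::vector<double> const& U, int64_t n )
{
    // info = 0 means "no error", so the singularity was missed.
    if (info < 1 || info > n)
        return false;
    if (info != expected_info( zero_col ))
        return false;

    int64_t i = info - 1;           // back to a 0-based index
    return U[ i + i*n ] == 0.0;     // U( i, i ) == 0
}
```

For rand_zerocol13, expected_info( 13 ) is 14, so a returned info = 14 together with a zero at U( 13, 13 ) in 0-based indexing would count as a pass under these assumptions.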