Closed by ndellingwood 5 months ago
"Blocked" seems to be a theme in the failing unit tests, but I'm not sure these are the small blocks of a BsrMatrix.
PRs corresponding to the commit list:
@lucbv since the failures are block-related, I'll start triage with a revert of #2008 to see the impact on the tests. MueLu builds take a while with CUDA, so it will be some time before I have the breaking change pinpointed.
@lucbv revert of #2008 did not resolve the MueLu failures. Rebuilding with a revert of #2011 to retest
Okay, I'm glad #2008 did not cause the issues, but unfortunately that means it will take you a little longer to get to the bottom of it. Except for #1895, the other 3 PRs are fairly light in terms of changes, so if one of them triggers the problem it should still be easy to fix.
Reverting #2011 and #2012 did not help with the MueLu tests; they still failed. Rebuilding with a revert of #1895.
Revert of #1895 returned MueLu tests to passing
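For readers following the triage, the revert-and-retest loop described in the comments above can be sketched roughly as below. This is only an illustration, not the procedure actually scripted on weaver: `find_breaking_pr` and `simulated_result` are hypothetical names, and the real check (revert a PR, rebuild Trilinos with CUDA, rerun the MueLu tests) is replaced by a stand-in that simply mirrors the outcomes reported in this thread.

```python
from typing import Callable, Iterable, Optional


def find_breaking_pr(
    candidates: Iterable[int],
    passes_with_revert: Callable[[int], bool],
) -> Optional[int]:
    """Return the first PR whose revert makes the tests pass, or None."""
    for pr in candidates:
        # In practice this step is the expensive one: revert, rebuild, retest.
        if passes_with_revert(pr):
            return pr
    return None


def simulated_result(pr: int) -> bool:
    # Hypothetical stand-in for "revert the PR, rebuild, run the MueLu tests".
    # It only encodes what the comments above report: reverting #1895 helped.
    return pr == 1895


culprit = find_breaking_pr([2008, 2011, 2012, 1895], simulated_result)
print(culprit)  # prints 1895
```

Because each check costs a full CUDA rebuild, ordering the candidates by how likely they are to be involved (block-related changes first, as was done here) can help, though in this case the culprit turned out to be the last candidate tried.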
Addressed by #2039, thanks @eeprude !
Nightly cuda/11.2.2 builds (no UVM) are failing in the following unit tests with kokkos-kernels@develop:
https://jenkins-son.sandia.gov/job/KokkosEco_Trilinos_Weaver_CUDA112_opt-no-uvm/257
The PanzerMiniEM_MiniEM-BlockPrec_MueLu_highOrder_0_MPI_4 test was previously reported in #2010 and is failing with release-candidate-4.2.00 as well. The other tests began failing after the merge of the following commits:

- Sparse: fix cusparse spgemm hang properly
- Sparse: fix logic for bad cursparse spgemm version.
- Improvements on the unification attempt logic for axpby(), including new tests
- Addressing feedbacks from Luc, plus some small changes here and there:
- Formatting
- Using 'ifdef HAVE_KOKKOSKERNELS_DEBUG', per Luc's suggestion
- Addressing feedbacks from Luc
- Correcting compilation errors in my Mac
- Backup
- CUDA 11.0.1 / cuSPARSE 11.0.0 changed SpMM enums
- CUDA 11.2.1 / cuSPARSE 11.4.0 changed SpMV
Reproducer (weaver rhel8):