Closed sebastiangrimberg closed 4 months ago
The actual issues seen so far:
- Running the example tests on a g5.48xlarge, the spheres case fails with NaNs for the indicator norm. This is fixed in a8d0cc9.
- Running the examples with nt > 1, SuperLU seems to cause

      Residual norms for GMRES solve
      ** On entry to DGEMM parameter number 8 had an illegal value
      --------------------------------------------------------------------------
      prterun noticed that process rank 1 with PID 10743 on node c889f3baddb0 exited on signal 11 (Segmentation fault: 11).
      --------------------------------------------------------------------------

  on the `cpw_lumped_uniform` case, whereas using STRUMPACK is fine.
As far as I can tell, this only happens on M1 macOS with SuperLU_DIST. I'm strongly inclined to say it is a bug there, but at least for now it is not related to this PR. One thing worth exploring for OpenMP builds is to build SuperLU_DIST and STRUMPACK without OpenMP and rely solely on a threaded BLAS/LAPACK.
EDIT: SuperLU_DIST issue: https://github.com/xiaoyeli/superlu_dist/issues/159
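For anyone reading the log above: in the reference BLAS interface, `DGEMM(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC)`, parameter number 8 is `LDA`, the leading dimension of `A`. The sketch below mirrors the argument checks that reference BLAS performs before it reports an illegal value. The function name `dgemm_arg_check` is hypothetical, and this only illustrates what the message means; it says nothing about why SuperLU_DIST ends up passing such a value.

```python
def dgemm_arg_check(transa, transb, m, n, k, lda, ldb, ldc):
    """Return the 1-based position of the first illegal DGEMM argument, or 0.

    Hypothetical helper that mirrors the input validation of reference BLAS
    DGEMM. Positions follow the Fortran argument list:
    TRANSA=1, TRANSB=2, M=3, N=4, K=5, LDA=8, LDB=10, LDC=13.
    """
    ta, tb = transa.upper(), transb.upper()
    # Rows of A and B as stored, which depends on whether each is transposed.
    nrowa = m if ta == "N" else k
    nrowb = k if tb == "N" else n

    if ta not in ("N", "T", "C"):
        return 1
    if tb not in ("N", "T", "C"):
        return 2
    if m < 0:
        return 3
    if n < 0:
        return 4
    if k < 0:
        return 5
    if lda < max(1, nrowa):
        return 8   # the case reported in the log above
    if ldb < max(1, nrowb):
        return 10
    if ldc < max(1, m):
        return 13
    return 0


# A 4x2 times 2x3 product with a leading dimension of 0 for A is rejected
# at argument 8; with lda=4 the same call passes all checks.
bad = dgemm_arg_check("N", "N", 4, 3, 2, lda=0, ldb=2, ldc=4)
ok = dgemm_arg_check("N", "N", 4, 3, 2, lda=4, ldb=2, ldc=4)
```

So the message means `LDA` was smaller than the stored row count of `A` (or smaller than 1) at the point of the call.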
Once this PR, https://github.com/awslabs/palace/pull/193, and https://github.com/awslabs/palace/pull/194 are approved, I will rebase on `main` and merge all into `main` together.
Adds support for CUDA and HIP in the build system and makes corresponding changes to the code to run on GPUs.
Accompanying documentation update in #185
Resolves https://github.com/awslabs/palace/issues/3
TODO (for Spack support):
libxsmm@=main
dependency)