LLNL / serac

Serac is a high order nonlinear thermomechanical simulation code
BSD 3-Clause "New" or "Revised" License

Utilize MFEM's PETSc/SLEPc wrappers #1138

Closed chapman39 closed 2 weeks ago

chapman39 commented 3 weeks ago

At the moment, BlueOS builds get this warning many times during the serac build:

clang-10: warning: -Wl,-rpath=/usr/WS2/smithdev/libs/serac/blueos_3_ppc64le_ib_p9/2024_06_24_13_38_57/clang-10.0.1/arpack-ng-3.9.0-jf6qgpron7zyy466nlme6v3fdr6vhp7u/lib64: 'linker' input unused [-Wunused-command-line-argument]

This is due to the fact that I had to manually add the arpack lib directories in the MFEM module. I could check whether the SYS_TYPE is BlueOS and, in that case, not add the -Wl,-rpath?
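The SYS_TYPE check suggested above could be sketched like this (hypothetical helper, not serac's actual build logic; the function name and flag layout are illustrative):

```python
# Sketch: only emit -Wl,-rpath for arpack when NOT on BlueOS, where
# clang warns that the 'linker' input is unused during compile steps.
import os

def arpack_link_flags(arpack_libdir):
    """Return linker flags for arpack, skipping the rpath flag on BlueOS."""
    flags = ["-L" + arpack_libdir]
    sys_type = os.environ.get("SYS_TYPE", "")
    if not sys_type.startswith("blueos"):
        flags.append("-Wl,-rpath," + arpack_libdir)
    return flags
```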

The MFEM team is aware of this and is considering adding support for finding arpack within MFEM itself. I have opened an issue on this here: https://github.com/mfem/mfem/issues/4364

chapman39 commented 3 weeks ago

I'm having some issues getting the mfem smoke tests to work with CUDA, which is why I've commented them out. I get the following error:

lalloc 1 -W30 ctest -R mfem_petsc_smoketest --output-on-failure

[3]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI.
[3]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To not see the message again, add the option to your .petscrc, OR add it to the env var PETSC_OPTIONS.
[3]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu.
[3]PETSC ERROR: For Open MPI, you need to configure it --with-cuda (https://www.open-mpi.org/faq/?category=buildcuda)
[3]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (http://mvapich.cse.ohio-state.edu/userguide/gdr/)
[3]PETSC ERROR: For Cray-MPICH, you need to set MPICH_GPU_SUPPORT_ENABLED=1 (man mpi to see manual of cray-mpich)

your MPI is not GPU-aware

Is this true? I don't really understand. @white238
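If the warning turns out to be benign, PETSc's own message suggests silencing it via the option or the environment variable, e.g. (a sketch; the exact launcher invocation may differ):

```shell
# Silence the GPU-aware MPI check via PETSc's documented env var,
# then rerun the failing smoke test.
export PETSC_OPTIONS="-use_gpu_aware_mpi 0"
lalloc 1 -W30 ctest -R mfem_petsc_smoketest --output-on-failure
```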

EDIT: This is probably because I haven't configured petsc with cuda... I just found out petsc is a CudaPackage, which I'm guessing means it has a cuda variant? https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/petsc/package.py#L10
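One way to confirm which variants petsc was actually built with (assuming Spack is on the PATH):

```shell
# Show installed petsc specs with their variants; the output will
# contain +cuda or ~cuda depending on how it was built.
spack find -v petsc
# Or filter for the CUDA variant directly:
spack find -v petsc | grep -- "+cuda"
```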

DOUBLE EDIT: Going back to this, it turns out petsc was built with +cuda, so I'm back to not knowing what's going on.
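Since petsc itself has +cuda, the remaining question is whether the MPI is CUDA-aware. For Open MPI (and Spectrum MPI, which is Open MPI based) there is a documented check; a sketch:

```shell
# Query Open MPI's build parameters for CUDA support; the value line
# reads true/false depending on whether MPI was built --with-cuda.
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
```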

chapman39 commented 2 weeks ago

I am off today, but I wanted to make sure CI passed. It looks like only Codevelop is failing, which I'll be able to fix without much issue on Monday.