LLNL / serac

Serac is a high order nonlinear thermomechanical simulation code
BSD 3-Clause "New" or "Revised" License

Fix MFEM Petsc Examples with Cuda "your MPI is not GPU-aware" #1158

Open chapman39 opened 2 weeks ago

chapman39 commented 2 weeks ago

MFEM has a series of PETSc examples, which I've been referencing to create our mfem_petsc/ slepc_smoketests. Here are the recommended arguments for running PETSc ex1 with the device set to cuda: https://github.com/mfem/mfem/blob/master/examples/petsc/CMakeLists.txt#L77

-m ../../data/star.mesh --usepetsc --partial-assembly --device cuda --petscopts rc_ex1p_device

However, this doesn't work for me. I get the following error:

$ lalloc 1 -W30 ctest -R mfem_petsc_smoketest --output-on-failure

[3]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI.
[3]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To not see the message again, add the option to your .petscrc, OR add it to the env var PETSC_OPTIONS.
[3]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu.
[3]PETSC ERROR: For Open MPI, you need to configure it --with-cuda (https://www.open-mpi.org/faq/?category=buildcuda)
[3]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (http://mvapich.cse.ohio-state.edu/userguide/gdr/)
[3]PETSC ERROR: For Cray-MPICH, you need to set MPICH_GPU_SUPPORT_ENABLED=1 (man mpi to see manual of cray-mpich)
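
The message itself names a stopgap: pass `-use_gpu_aware_mpi 0` through the `PETSC_OPTIONS` environment variable. Below is a minimal sketch of that, untested on lassen. It assumes the variable set in the launching shell actually reaches the MPI ranks started by ctest, which depends on the launcher. Per the message, this does not make MPI GPU-aware; it only tells PETSc not to expect it.

```shell
# Append the flag to any PETSc options already present in the environment,
# rather than overwriting them.
export PETSC_OPTIONS="${PETSC_OPTIONS:+$PETSC_OPTIONS }-use_gpu_aware_mpi 0"

# Show what PETSc will pick up from the environment.
echo "PETSC_OPTIONS=$PETSC_OPTIONS"

# Then rerun the failing test from the same shell, e.g.:
#   lalloc 1 -W30 ctest -R mfem_petsc_smoketest --output-on-failure
```

If the test then passes, that would point at the MPI build rather than the PETSc configuration as the place to look for a proper fix.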

I probably configured PETSc incorrectly on lassen, or the petsc spack package may need to be modified. In any case, here is the current PETSc spec from the latest lassen TPL build for reference:

- 42i6t5r ^petsc@3.21.0%clang@10.0.1 ldlibs=-lgfortran ~X~batch~cgns~complex+cuda~debug+double~exodusii~fftw+fortran~giflib~hdf5~hpddm~hwloc+hypre~int64~jpeg~knl~kokkos~libpng~libyaml~memkind+metis~mkl-pardiso~mmg~moab~mpfr+mpi~mumps~openmp~p4est~parmmg~ptscotch~random123~rocm~saws~scalapack~shared~strumpack~suite-sparse+superlu-dist~sycl~tetgen~trilinos~valgrind~zoltan build_system=generic clanguage=C cuda_arch=70 memalign=none arch=linux-rhel7-power9le