Using the updated STRUMPACK interface in PETSc-3.20.0 (with the GPU solve capabilities) results in a crash with the following trace; a minimal sketch of the solver setup that reaches this code path is given after the trace:
Propagator: starting turn 1, final turn 1
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: No support for this operation on this system
[0]PETSC ERROR: SLATE requires MPI_THREAD_MULTIPLE
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.20.0, Sep 28, 2023
[0]PETSC ERROR: ./booster_fd on a named wcgpu03.fnal.gov by sasyed Wed Oct 11 20:05:59 2023
[0]PETSC ERROR: Configure options --prefix=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0 --with-make-exec=make CFLAGS=-O3 COPTFLAGS= FFLAGS=-O3 FOPTFLAGS= CXXFLAGS=-O3 CXXOPTFLAGS= --with-cc=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openmpi-4.1.5-dtavl3prebht2mfzsx7kipn7aasgnaql/bin/mpicc --with-cxx=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openmpi-4.1.5-dtavl3prebht2mfzsx7kipn7aasgnaql/bin/mpic++ --with-fc=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openmpi-4.1.5-dtavl3prebht2mfzsx7kipn7aasgnaql/bin/mpif90 --with-precision=double --with-scalar-type=real --with-shared-libraries=1 --with-debugging=0 --with-openmp=0 --with-64-bit-indices=0 --with-blaslapack-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openblas-0.3.24-jye46jc22kvh6libzwermjws7mm5l6vt/lib/libopenblas.so --with-batch=1 --with-x=0 --with-clanguage=C --with-cuda=1 --with-cuda-dir=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/cuda-12.2.1-asmcxn2vp6whqwygr4jpsh4evklv3zve --with-hip=0 --with-metis=1 --with-metis-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/metis-5.1.0-jpueqfdfrqt6abqjvpityabqjbssdczi/include --with-metis-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/metis-5.1.0-jpueqfdfrqt6abqjvpityabqjbssdczi/lib/libmetis.so --with-hypre=1 --with-hypre-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hypre-2.29.0-yp5i4caa3gk25cazv4itvi2m63k32rv5/include --with-hypre-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hypre-2.29.0-yp5i4caa3gk25cazv4itvi2m63k32rv5/lib/libHYPRE.so --with-parmetis=1 --with-parmetis-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/parmetis-4.0.3-3pxngte3pxm2kvh4eeucnjj3bxosfbwp/include --with-parmetis-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/parmetis-4.0.3-3pxngte3pxm2kvh4eeucnjj3bxosfbwp/lib/libparmetis.so --with-kokkos=0 --with-kokkos-kernels=0 --with-superlu_dist=1 --with-superlu_dist-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/superlu-dist-8.1.2-usj7s2kc6d4kmigu3bljye5gjud6vxt7/include --with-superlu_dist-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/superlu-dist-8.1.2-usj7s2kc6d4kmigu3bljye5gjud6vxt7/lib/libsuperlu_dist.so --with-ptscotch=0 --with-suitesparse=0 --with-hdf5=1 --with-hdf5-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hdf5-1.14.2-afz27n65es4synnsemdew6hjxkhbh2d7/include --with-hdf5-lib="/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hdf5-1.14.2-afz27n65es4synnsemdew6hjxkhbh2d7/lib/libhdf5_hl.so /wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hdf5-1.14.2-afz27n65es4synnsemdew6hjxkhbh2d7/lib/libhdf5.so" --with-zlib=0 --with-mumps=0 --with-trilinos=0 --with-fftw=0 --with-valgrind=0 --with-gmp=0 --with-libpng=0 --with-giflib=0 --with-mpfr=0 
--with-netcdf=0 --with-pnetcdf=0 --with-moab=0 --with-random123=0 --with-exodusii=0 --with-cgns=0 --with-memkind=0 --with-p4est=0 --with-saws=0 --with-yaml=0 --with-hwloc=0 --with-libjpeg=0 --with-scalapack=1 --with-scalapack-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/netlib-scalapack-2.2.0-xot4ujctnuvzgyb5rdjua67ab63agqyl/lib/libscalapack.so --with-strumpack=1 --with-strumpack-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/strumpack-7.1.3-i7dsreqo3p7jstuxowejz3jizjnnz4wu/include --with-strumpack-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/strumpack-7.1.3-i7dsreqo3p7jstuxowejz3jizjnnz4wu/lib64/libstrumpack.so --with-mmg=0 --with-parmmg=0 --with-tetgen=0 --with-cuda-arch=70
[0]PETSC ERROR: #1 MatGetFactor_aij_strumpack() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/mat/impls/aij/mpi/strumpack/strumpack.c:1153
[0]PETSC ERROR: #2 MatGetFactor() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/mat/interface/matrix.c:4783
[0]PETSC ERROR: #3 PCFactorSetUpMatSolverType_Factor() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/pc/impls/factor/factimpl.c:9
[0]PETSC ERROR: #4 PCFactorSetUpMatSolverType() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/pc/impls/factor/factor.c:105
[0]PETSC ERROR: #5 init_solver() at /wclustre/accelsim/sajid/packages/synergia2/src/synergia/collective/space_charge_3d_fd_utils.cc:313
[0]PETSC ERROR: #6 init_solver_sc3d_fd() at /wclustre/accelsim/sajid/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc:468
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: It appears a new error in the code was triggered after a previous error, possibly because:
[0]PETSC ERROR: - The first error was not properly handled via (for example) the use of
[0]PETSC ERROR: PetscCall(TheFunctionThatErrors()); or
[0]PETSC ERROR: - The second error was triggered while handling the first error.
[0]PETSC ERROR: Above is the traceback for the previous unhandled error, below the traceback for the next error
[0]PETSC ERROR: ALL ERRORS in the PETSc libraries are fatal, you should add the appropriate error checking to the code
[0]PETSC ERROR: No support for this operation on this system
[0]PETSC ERROR: SLATE requires MPI_THREAD_MULTIPLE
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.20.0, Sep 28, 2023
[0]PETSC ERROR: ./booster_fd on a named wcgpu03.fnal.gov by sasyed Wed Oct 11 20:05:59 2023
[0]PETSC ERROR: Configure options --prefix=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0 --with-make-exec=make CFLAGS=-O3 COPTFLAGS= FFLAGS=-O3 FOPTFLAGS= CXXFLAGS=-O3 CXXOPTFLAGS= --with-cc=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openmpi-4.1.5-dtavl3prebht2mfzsx7kipn7aasgnaql/bin/mpicc --with-cxx=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openmpi-4.1.5-dtavl3prebht2mfzsx7kipn7aasgnaql/bin/mpic++ --with-fc=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openmpi-4.1.5-dtavl3prebht2mfzsx7kipn7aasgnaql/bin/mpif90 --with-precision=double --with-scalar-type=real --with-shared-libraries=1 --with-debugging=0 --with-openmp=0 --with-64-bit-indices=0 --with-blaslapack-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/openblas-0.3.24-jye46jc22kvh6libzwermjws7mm5l6vt/lib/libopenblas.so --with-batch=1 --with-x=0 --with-clanguage=C --with-cuda=1 --with-cuda-dir=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/cuda-12.2.1-asmcxn2vp6whqwygr4jpsh4evklv3zve --with-hip=0 --with-metis=1 --with-metis-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/metis-5.1.0-jpueqfdfrqt6abqjvpityabqjbssdczi/include --with-metis-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/metis-5.1.0-jpueqfdfrqt6abqjvpityabqjbssdczi/lib/libmetis.so --with-hypre=1 --with-hypre-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hypre-2.29.0-yp5i4caa3gk25cazv4itvi2m63k32rv5/include --with-hypre-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hypre-2.29.0-yp5i4caa3gk25cazv4itvi2m63k32rv5/lib/libHYPRE.so --with-parmetis=1 --with-parmetis-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/parmetis-4.0.3-3pxngte3pxm2kvh4eeucnjj3bxosfbwp/include --with-parmetis-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/parmetis-4.0.3-3pxngte3pxm2kvh4eeucnjj3bxosfbwp/lib/libparmetis.so --with-kokkos=0 --with-kokkos-kernels=0 --with-superlu_dist=1 --with-superlu_dist-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/superlu-dist-8.1.2-usj7s2kc6d4kmigu3bljye5gjud6vxt7/include --with-superlu_dist-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/superlu-dist-8.1.2-usj7s2kc6d4kmigu3bljye5gjud6vxt7/lib/libsuperlu_dist.so --with-ptscotch=0 --with-suitesparse=0 --with-hdf5=1 --with-hdf5-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hdf5-1.14.2-afz27n65es4synnsemdew6hjxkhbh2d7/include --with-hdf5-lib="/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hdf5-1.14.2-afz27n65es4synnsemdew6hjxkhbh2d7/lib/libhdf5_hl.so /wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/hdf5-1.14.2-afz27n65es4synnsemdew6hjxkhbh2d7/lib/libhdf5.so" --with-zlib=0 --with-mumps=0 --with-trilinos=0 --with-fftw=0 --with-valgrind=0 --with-gmp=0 --with-libpng=0 --with-giflib=0 --with-mpfr=0 
--with-netcdf=0 --with-pnetcdf=0 --with-moab=0 --with-random123=0 --with-exodusii=0 --with-cgns=0 --with-memkind=0 --with-p4est=0 --with-saws=0 --with-yaml=0 --with-hwloc=0 --with-libjpeg=0 --with-scalapack=1 --with-scalapack-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/netlib-scalapack-2.2.0-xot4ujctnuvzgyb5rdjua67ab63agqyl/lib/libscalapack.so --with-strumpack=1 --with-strumpack-include=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/strumpack-7.1.3-i7dsreqo3p7jstuxowejz3jizjnnz4wu/include --with-strumpack-lib=/wclustre/accelsim/spack-shared-v5/spack/opt/spack/linux-scientific7-cascadelake/gcc-12.2.0/strumpack-7.1.3-i7dsreqo3p7jstuxowejz3jizjnnz4wu/lib64/libstrumpack.so --with-mmg=0 --with-parmmg=0 --with-tetgen=0 --with-cuda-arch=70
[0]PETSC ERROR: #1 MatGetFactor_aij_strumpack() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/mat/impls/aij/mpi/strumpack/strumpack.c:1153
[0]PETSC ERROR: #2 MatGetFactor() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/mat/interface/matrix.c:4783
[0]PETSC ERROR: #3 PCFactorSetUpMatSolverType_Factor() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/pc/impls/factor/factimpl.c:9
[0]PETSC ERROR: #4 PCFactorSetUpMatSolverType() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/pc/impls/factor/factor.c:105
[0]PETSC ERROR: #5 PCSetUp_LU() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/pc/impls/factor/lu/lu.c:80
[0]PETSC ERROR: #6 PCSetUp() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/pc/interface/precon.c:1068
[0]PETSC ERROR: #7 KSPSetUp() at /tmp/sasyed/spack-stage/spack-stage-petsc-3.20.0-autl7wxw7ic2zkl7jq4s2mi225qtnu4h/spack-src/src/ksp/ksp/interface/itfunc.c:415
[0]PETSC ERROR: #8 compute_mat() at /wclustre/accelsim/sajid/packages/synergia2/src/synergia/collective/space_charge_3d_fd_utils.cc:420
[0]PETSC ERROR: #9 apply_impl() at /wclustre/accelsim/sajid/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc:140
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 11 DUP FROM 4
with errorcode 57.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
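For context, the failure corresponds to a solver setup along the following lines. This is a minimal sketch, not the actual Synergia code; the helper name setup_strumpack_solver is hypothetical, but the PETSc calls match the frames in the trace:

#include <petscksp.h>

/* Minimal sketch (hypothetical helper, not the actual Synergia code) of a
   setup that reaches MatGetFactor_aij_strumpack() as in the trace above. */
PetscErrorCode setup_strumpack_solver(Mat A, KSP *ksp)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, ksp));
  PetscCall(KSPSetOperators(*ksp, A, A));
  PetscCall(KSPSetType(*ksp, KSPPREONLY));      /* direct solve only */
  PetscCall(KSPGetPC(*ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));               /* frame #5, PCSetUp_LU */
  PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERSTRUMPACK));
  /* PCFactorSetUpMatSolverType() -> MatGetFactor() -> MatGetFactor_aij_strumpack(),
     where the "SLATE requires MPI_THREAD_MULTIPLE" check fires (frames #1-#4). */
  PetscCall(PCFactorSetUpMatSolverType(pc));
  PetscCall(KSPSetUp(*ksp));                    /* frame #7 of the second trace */
  PetscFunctionReturn(PETSC_SUCCESS);
}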
The MPI library used is OpenMPI@4.1.5, which has MPI_THREAD_MULTIPLE support; the output of ompi_info is attached as ompi_info.txt. Let me know if I can provide further context to help debug the issue.
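One thing worth noting: the failing check presumably tests the thread level MPI was actually initialized with (e.g. via MPI_Query_thread), not just whether the library supports MPI_THREAD_MULTIPLE, so the error can fire with a thread-capable OpenMPI if initialization requested a lower level. A minimal sketch, assuming that is the cause, of requesting MPI_THREAD_MULTIPLE explicitly before PetscInitialize():

#include <stdio.h>
#include <mpi.h>
#include <petscsys.h>

int main(int argc, char **argv)
{
  int provided;

  /* Initialize MPI at MPI_THREAD_MULTIPLE ourselves; PetscInitialize()
     then attaches to the already-initialized MPI instead of calling
     MPI_Init() with a (possibly lower) default thread level. */
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  if (provided < MPI_THREAD_MULTIPLE) {
    fprintf(stderr, "MPI provided thread level %d < MPI_THREAD_MULTIPLE\n", provided);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* ... matrix assembly and the solver setup sketched above ... */
  PetscCall(PetscFinalize());
  MPI_Finalize(); /* we initialized MPI, so we finalize it */
  return 0;
}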
Here is the full spec for the PETSc+STRUMPACK build: