ndellingwood closed this issue 2 months ago
Hi @cgcgcg and @ndellingwood,
@romintomasetti and I are also seeing this test fail in our HIP build.
It seems one origin of the issue might be this fence in the Cuda case:
It seems such a fence is not present in the corresponding (I think) HIP case:
@maartenarnst Please feel free to open a PR with the adjusted count for HIP. I didn't have time yet to track down where the extra fence was, so thank you very much for doing that!
Hi @cgcgcg. OK, done :) It is PR
The test returned to a passing state after @maartenarnst's fix #13331, thank you!
@ndellingwood Thanks for reporting back! Can we close the issue then?
Bug Report
@trilinos/tpetra
Description
The TpetraCore_CrsMatrix_MatvecFence_MPI_4 test (introduced in PR #13165) is failing with the HIP backend and rocm/5.6.1.
Steps to Reproduce
export TRILINOS_DIR=
module load python rocm/5.6.1 cmake openmpi/4.1.5 openblas/0.3.23 ninja/1.11.1
module list
export OMPI_CXX=$ROCM_PATH/bin/hipcc
export TPETRA_ASSUME_GPU_AWARE_MPI=1
CMake configuration
cmake \
  -G"Ninja" \
  -DCMAKE_INSTALL_PREFIX=$PWD/install \
  -DCMAKE_CXX_STANDARD="17" \
  -DCMAKE_CXX_COMPILER="`which mpicxx`" \
  -DCMAKE_C_COMPILER="`which mpicc`" \
  -DCMAKE_FORTRAN_COMPILER="`which mpifort`" \
  -DCMAKE_BUILD_TYPE="RELEASE" \
  -DBUILD_SHARED_LIBS=OFF \
  \
  -DTrilinos_ENABLE_ALL_PACKAGES=OFF \
  -DTrilinos_ENABLE_ALL_OPTIONAL_PACKAGES=OFF \
  -DTrilinos_ENABLE_EXPLICIT_INSTANTIATION=ON \
  -DTrilinos_ASSERT_MISSING_PACKAGES=OFF \
  -DTrilinos_ALLOW_NO_PACKAGES=OFF \
  -DTrilinos_ENABLE_OpenMP=OFF \
  -DTrilinos_ENABLE_TESTS=ON \
  \
  -DTrilinos_ENABLE_Amesos2=ON \
  -DAmesos2_ENABLE_SuperLU=OFF \
  -DAmesos2_ENABLE_KLU2=ON \
  -DTrilinos_ENABLE_Belos=ON \
  -DTrilinos_ENABLE_Ifpack2=ON \
  -DTrilinos_ENABLE_Kokkos=ON \
  -DKokkos_ARCH_VEGA90A=ON \
  -DKokkos_ENABLE_CUDA=OFF \
  -DKokkos_ENABLE_HIP=ON \
  -DKokkos_ENABLE_OPENMP=OFF \
  -DTrilinos_ENABLE_KokkosKernels=ON \
  -DTrilinos_ENABLE_MueLu=ON \
  -DTrilinos_ENABLE_Tpetra=ON \
  -DTpetra_ENABLE_CUDA=OFF \
  -DTpetra_INST_HIP=ON \
  -DTpetra_INST_SERIAL=OFF \
  -DTpetra_INST_OPENMP=OFF \
  -DTpetra_INST_DOUBLE=ON \
  -DTrilinos_ENABLE_Gtest=ON \
  -DTrilinos_ENABLE_Teuchos=ON \
  -DTrilinos_ENABLE_Xpetra=ON \
  -DTrilinos_ENABLE_Zoltan2=ON \
  -DTrilinos_ENABLE_Panzer=ON \
  -DTPL_ENABLE_BLAS=ON \
  -D BLAS_LIBRARY_DIRS:FILEPATH="${OPENBLAS_ROOT}/lib" \
  -D BLAS_LIBRARY_NAMES:STRING="openblas" \
  -DTPL_ENABLE_LAPACK=ON \
  -D LAPACK_INCLUDE_DIRS:FILEPATH="${OPENBLAS_ROOT}/include" \
  -D LAPACK_LIBRARY_DIRS:FILEPATH="${OPENBLAS_ROOT}/lib" \
  -D LAPACK_LIBRARY_NAMES:STRING="openblas" \
  -DTPL_ENABLE_Netcdf=OFF \
  -DTPL_ENABLE_MPI=ON \
  -DMPI_USE_COMPILER_WRAPPERS=ON \
  -DMPI_EXEC="mpirun" \
  -DMPI_EXEC_NUMPROCS_FLAG="-np" \
  -DMPI_EXEC_POST_NUMPROCS_FLAGS:STRING="-bind-to;none" \
  \
  $TRILINOS_DIR
make -j16
ctest -R TpetraCore_CrsMatrix_MatvecFence_MPI_4 -V