Closed: victorapm closed this 3 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 55.74%. Comparing base (c74702a) to head (ac3b832). Report is 89 commits behind head on develop.
@sframba Do you know if CUDA 12 is supported on Pangea3? @matteofrigo5 Same question for Sherlock.
Not sure, we'll have to ask IBM support. We upgraded to CUDA 11.5.0 at the end of 2023. Is CUDA 12 needed to solve the externalSolvers unit test issue? Or to access some new hypre features on GPU?
I don't see any constraints regarding Sherlock. We have four versions of CUDA 12 installed: 12.0.0, 12.1.1, 12.2.0, and 12.4.0. @victorapm, do you suggest any one in particular?
> Not sure, we'll have to ask the IBM support. We have upgraded to cuda 11.5.0 at the end of 2023. Is cuda 12 needed to solve the externalSolvers unit test issue? Or to access some new hypre features on GPU?
It's mainly for this.
@rrsettgast @wrtobin @castelletto1 I have installed the TPLs for the ruby builds and for the lassen cuda-12 one.
On ruby:
make) and both the unit tests (apart from the doxygen one) and the integratedTests pass.
/usr/tce/packages/gcc/gcc-12.1.1/bin/ld: /usr/gapps/GEOSX/thirdPartyLibs/2024-06-19/install-ruby-gcc-12-release/trilinos/lib/libtpetra.so.13.4.1: undefined reference to `KokkosSparse::Impl::SPMV<double const, int const, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u>, int const, double const*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<3u>, double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u>, true, true>::spmv(KokkosKernels::Experimental::Controls const&, char const*, double const&, KokkosSparse::CrsMatrix<double const, int const, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u>, int const> const&, Kokkos::View<double const*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<3u> > const&, double const&, Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> > const&)'
We are on an oldish version of Trilinos (13.4.1), which is from 2022. We could try to move to the latest release (https://github.com/trilinos/Trilinos/releases), which is from a couple of months ago, and see if that fixes the issue.
I have not tested the new lassen build with cuda-12 yet, but I think @victorapm did and it should be working. I don't have a bank on dane yet so I can't do much there, but I am pretty sure that we can get everything to work on that system too.
Anyway, since we have one fully working build, if you are okay with it, I think we can merge this and abandon quartz in favor of ruby.
> I have not tested the new lassen build with cuda-12 yet but I think @victorapm did and should be working.
After some trial and error with different compilers, it works!
PS: the build on Dane also works fine (tested it with a couple of simulations)
@CusiniM aren't dane and ruby the same stack? Can we merge the files?
Yeah, that's why I created that llnl-cpu-base.cmake. I think it's convenient to keep the files separate in case we ever want or need to customize something, but yes, they should be identical AFAIK. I would also expect binaries to work on both systems. The only difference would be the one ATS parameter. I don't have a bank on dane yet, so I have not been able to do much testing there.
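For readers unfamiliar with the layout being discussed, here is a minimal sketch of what a thin per-machine host config on top of llnl-cpu-base.cmake might look like. Only the base file name comes from this thread. The per-machine file name, the `ATS_ARGUMENTS` variable, and its value are assumptions for illustration, and the real files may differ.

```cmake
# Hypothetical per-machine host config (e.g. a ruby or dane variant).
# Everything shared by the two machines is assumed to live in the base file
# mentioned in this thread, located next to this file.
include(${CMAKE_CURRENT_LIST_DIR}/llnl-cpu-base.cmake)

# The single machine-specific setting: the parameter handed to ATS.
# The variable name and value are placeholders, not taken from the PR.
# Note: a plain CACHE set does not override an existing cache entry,
# so this assumes the base file does not already define ATS_ARGUMENTS.
set(ATS_ARGUMENTS "--machine slurm56" CACHE STRING "")
```

With this shape, merging the dane and ruby files would amount to moving the one remaining `set` into the base file, while keeping them separate leaves room for future per-machine overrides.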
Add new host configs:
See https://github.com/GEOS-DEV/LvArray/pull/319 and https://github.com/GEOS-DEV/thirdPartyLibs/pull/270