Closed BenWibking closed 1 year ago
Hi @BenWibking , I will try to fix the installation for 5.4.3 when setonix comes back from maintenance.
This is still not working for us.
Hi @BenWibking , I am still trying to work out why it didn't produce hiprand. And I likely cannot update the module to produce a new rocm build as it might disrupt others. However, I am looking for a non-disruptive way of updating the software stack.
Hi @BenWibking , just to clarify: The installation process for rocm is pretty terrible and constantly changing so it is hard to keep track of how to change installation scripts. However, the current issue is that rocrand (curand equivalent) is there but for reasons that are unclear to me, the hiprand includes, which define the hip API, were not produced. I am very confused as to why it wasn't.
@dipietrantonio How did you fix this problem when I reported this a few months ago?
Hi @BenWibking , I've updated the installation scripts and now it should be there. Can you try it?
Hi @BenWibking @pelahi , as pointed out in the first comment, this was simply an oversight on my side (so many projects!). I just need to add the cake_build line specified in the mentioned issue.
It still does not work:
bwibking@setonix-02:/software/projects/pawsey0807/bwibking/quokka/build> cmake .. -C ../cmake/setonix.cmake
loading initial cache file ../cmake/setonix.cmake
-- Cray Programming Environment 2.7.20 CXX
-- Cray Programming Environment 2.7.20 C
-- CMake version: 3.24.3
-- AMReX installation directory: /usr/local
-- Build type set by user to 'Release'.
-- Building AMReX with AMReX_SPACEDIM = 3
-- Configuring AMReX with the following options enabled:
-- AMReX_PRECISION = DOUBLE
-- AMReX_GPU_BACKEND = HIP
-- AMReX_GPU_MAX_THREADS = 256
-- AMReX_GPU_RDC
-- AMReX_MPI
-- AMReX_LINEAR_SOLVERS
-- AMReX_PARTICLES
-- AMReX_PARTICLES_PRECISION = DOUBLE
-- AMReX_TINY_PROFILE
-- hip::amdhip64 is SHARED_LIBRARY
-- Found HIP:
-- HIP: Runtime=rocclr Compiler=clang Path=/software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1/rocm-5.4.3rev1
-- hip::amdhip64 is SHARED_LIBRARY
-- hip::amdhip64 is SHARED_LIBRARY
CMake Error at extern/amrex/Tools/CMake/AMReXParallelBackends.cmake:288 (find_package):
Could not find a package configuration file provided by "hiprand" with any
of the following names:
hiprandConfig.cmake
hiprand-config.cmake
Add the installation prefix of "hiprand" to CMAKE_PREFIX_PATH or set
"hiprand_DIR" to a directory containing one of the above files. If
"hiprand" provides a separate development package or SDK, be sure it has
been installed.
Call Stack (most recent call first):
extern/amrex/Src/CMakeLists.txt:40 (include)
These are the modules I have loaded:
bwibking@setonix-02:/software/projects/pawsey0807/bwibking/quokka/build> module list
Currently Loaded Modules:
1) craype-x86-milan 5) xpmem/2.5.2-2.4_3.47__gd0f7936.shasta 9) slurm/22.05.2 13) PrgEnv-cray/8.3.3 17) cray-mpich/8.1.25
2) libfabric/1.15.2.0 6) pawseyenv/2023.08 10) craype/2.7.20 14) craype-accel-amd-gfx90a 18) cray-hdf5/1.12.2.3
3) craype-network-ofi 7) pawsey 11) cray-dsmml/0.2.2 15) cce/15.0.1 19) cray-python/3.9.13.1
4) perftools-base/23.03.0 8) pawseytools 12) cray-libsci/23.02.1.1 16) cmake/3.24.3 20) rocm/5.4.3
A couple points that may be helpful (or not?) here
Looks like rocrand is available in the 5.4.3 install, but hiprand is not
jschoonover@setonix-02:~> ls -l /software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1/rocm-5.4.3rev1
total 164
lrwxrwxrwx 1 spack spack 85 Sep 15 13:47 amdgcn -> /software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1//rocm-5.4.3rev1/llvm/amdgcn
drwxr-sr-x 5 spack spack 4096 Sep 15 14:36 atmi
drwxr-xr-x 2 spack spack 4096 Sep 20 13:40 bin
drwxr-sr-x 6 spack spack 4096 Sep 15 13:51 hip
drwxr-xr-x 4 spack spack 4096 Sep 15 14:18 hipblas
drwxr-xr-x 3 spack spack 4096 Sep 15 14:19 hipcub
drwxr-xr-x 4 spack spack 4096 Sep 15 14:27 hipfft
drwxr-sr-x 3 spack spack 4096 Sep 15 14:28 hipfort
drwxr-xr-x 7 spack spack 4096 Sep 15 13:52 hipify
drwxr-xr-x 4 spack spack 4096 Sep 15 14:18 hipsolver
drwxr-xr-x 4 spack spack 4096 Sep 15 14:19 hipsparse
drwxr-sr-x 4 spack spack 4096 Sep 15 13:47 hsa
drwxr-xr-x 30 spack spack 12288 Sep 20 13:40 include
lrwxrwxrwx 1 spack spack 79 Sep 15 13:17 lib -> /software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1//rocm-5.4.3rev1/lib64
drwxr-sr-x 6 spack spack 12288 Sep 20 13:40 lib64
drwxr-sr-x 6 spack spack 4096 Sep 15 14:36 libexec
drwxr-sr-x 9 spack spack 4096 Sep 15 13:47 llvm
drwxr-sr-x 4 spack spack 4096 Sep 15 13:49 oam
drwxr-sr-x 7 spack spack 4096 Sep 15 13:49 opencl
drwxr-sr-x 4 spack spack 4096 Sep 15 14:36 rccl
drwxr-xr-x 4 spack spack 4096 Sep 15 14:18 rocalution
drwxr-xr-x 4 spack spack 4096 Sep 15 14:09 rocblas
drwxr-xr-x 4 spack spack 4096 Sep 15 14:27 rocfft
-rw-r--r-- 1 spack spack 4403 Sep 15 23:02 rocm_setup.sh
drwxr-sr-x 4 spack spack 4096 Sep 15 13:49 rocm_smi
drwxr-xr-x 4 spack spack 4096 Sep 15 14:16 rocprim
drwxr-sr-x 6 spack spack 4096 Sep 15 13:51 rocprofiler
drwxr-xr-x 4 spack spack 4096 Sep 15 14:09 rocrand
drwxr-xr-x 4 spack spack 4096 Sep 15 14:16 rocsolver
drwxr-xr-x 4 spack spack 4096 Sep 15 14:17 rocsparse
drwxr-xr-x 3 spack spack 4096 Sep 15 14:28 rocthrust
drwxr-sr-x 4 spack spack 4096 Sep 15 13:51 roctracer
drwxr-xr-x 11 spack spack 4096 Sep 15 14:28 share
drwxr-sr-x 3 spack spack 4096 Sep 15 13:52 src
drwxr-sr-x 3 spack spack 4096 Sep 15 13:51 test
drwxr-sr-x 3 spack spack 4096 Sep 15 13:53 x86_64-pc-linux-gnu
@dipietrantonio - is this the line you're talking about here ?
I also noticed that the CMAKE_PREFIX_PATH environment variable is not set when the rocm module is loaded. Perhaps it would also be helpful to define or append a few paths to this environment variable to help out CMake. For ROCm 5.4.3, I believe that the rocrand and hiprand configs are at
${ROCM_INSTALL_DIR}/rocrand/lib/cmake/
${ROCM_INSTALL_DIR}/hiprand/lib/cmake/
In the modulefiles, you could add
export CMAKE_PREFIX_PATH=$CMAKE_PREFIX_PATH:${ROCM_INSTALL_DIR}/rocrand/lib/cmake/:${ROCM_INSTALL_DIR}/hiprand/lib/cmake/
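Until the modulefile changes, the same idea can be tried from a user's own shell. This is only a minimal sketch: the helper name `append_cmake_prefix` is hypothetical, and `ROCM_INSTALL_DIR` is the placeholder used above, not a variable the rocm module is known to set. The helper avoids a leading `:` when CMAKE_PREFIX_PATH starts out unset.

```shell
# Hypothetical helper: append one directory to CMAKE_PREFIX_PATH,
# without producing a leading ':' when the variable is unset or empty.
append_cmake_prefix() {
    if [ -n "${CMAKE_PREFIX_PATH:-}" ]; then
        CMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH}:$1"
    else
        CMAKE_PREFIX_PATH="$1"
    fi
    export CMAKE_PREFIX_PATH
}

# Example calls (left commented out; ROCM_INSTALL_DIR is a placeholder
# and must be replaced with the real ROCm prefix):
# append_cmake_prefix "${ROCM_INSTALL_DIR}/rocrand/lib/cmake"
# append_cmake_prefix "${ROCM_INSTALL_DIR}/hiprand/lib/cmake"
```

This only helps if the hiprand config files actually exist under the prefix, which is the open question in this issue.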
Hi @BenWibking , I misread your comment yesterday. What I did to fix this a few months ago was to run the following line interactively
cmake -DCMAKE_INSTALL_PREFIX=$ROCM_PATH -DHIP_ROOT=$ROCM_PATH -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_C_COMPILER=hipcc -DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx90a ..
within the build directory created by the scripts. hipRAND was cloned but not installed. @pelahi you did the installation for the PrgEnv-cray, so you know where this build directory is. You should simply move into the hipRAND project folder and run the above line (exporting the required variables). I suspect Pascal added the above line into the scripts but did not actually execute this.
@fluidnumerics-joe I refer to the line pointed out in #16 . Also, you make a valid point about setting the CMAKE_PREFIX_PATH variable, and in fact it has already been pointed out in #14 . Thanks!
@dipietrantonio - late to the game on my part ; thanks for filling me in here.
I am still getting the same error:
CMake Error at extern/amrex/Tools/CMake/AMReXParallelBackends.cmake:288 (find_package):
Could not find a package configuration file provided by "hiprand" with any
of the following names:
hiprandConfig.cmake
hiprand-config.cmake
Add the installation prefix of "hiprand" to CMAKE_PREFIX_PATH or set
"hiprand_DIR" to a directory containing one of the above files. If
"hiprand" provides a separate development package or SDK, be sure it has
been installed.
Call Stack (most recent call first):
extern/amrex/Src/CMakeLists.txt:40 (include)
Hi @BenWibking , on Setonix, if you use the PrgEnv-gnu module list to load rocm/5.4.3, you'll see that the hiprand files are there:
$ module load rocm/5.4.3
$ lfs find ${HIP_PATH} -name "*hiprand*"
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/libhiprand.so.1.1
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-targets.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-config-version.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-config.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-fortran-config.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-targets-release.cmake
Maybe have a look as to why the cmake macro is not finding it.
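One way to check this, as a sketch only: look for the package config file under the ROCm prefix and, if it is there, hand its directory to CMake through `hiprand_DIR`, the variable named in the error message. The helper `find_cmake_config_dir` is hypothetical, and the example assumes `HIP_PATH` points at the ROCm prefix as in the `lfs find` command above.

```shell
# Hypothetical helper: print the directory containing <pkg>-config.cmake
# or <pkg>Config.cmake under a prefix; return non-zero if none is found.
find_cmake_config_dir() {
    # -L follows symlinks such as lib -> lib64; keep only the first hit.
    cfg=$(find -L "$1" \( -name "$2-config.cmake" -o -name "$2Config.cmake" \) 2>/dev/null | head -n 1)
    [ -n "$cfg" ] && dirname "$cfg"
}

# Example use (commented out; run from the project's build directory):
# if dir=$(find_cmake_config_dir "${HIP_PATH}" hiprand); then
#     cmake .. -C ../cmake/setonix.cmake -Dhiprand_DIR="$dir"
# else
#     echo "no hiprand config under ${HIP_PATH}" >&2
# fi
```

If the helper prints nothing for the PrgEnv-cray prefix, the config files were never installed there, and no CMake setting on the user side will fix it.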
@pelahi For various reasons, we have to use the PrgEnv-cray environment.
@BenWibking , the PrgEnv-cray has been updated. Can you try it now?
It works now. Thanks.
We can't compile our code with rocm/5.4.3 because hipRAND is missing:
Presumably, this is caused by: https://github.com/PawseySC/rocm-from-source/issues/16
I opened a help ticket about this before, but it took several days to get routed to you all. Can you fix this on Setonix?