PawseySC / rocm-from-source

Scripts to build AMD ROCm from source.
MIT License

hipRAND is missing from rocm/5.4.3 in PrgEnv-cray on Setonix #19

Closed BenWibking closed 1 year ago

BenWibking commented 1 year ago

We can't compile our code with rocm/5.4.3 because hipRAND is missing:

CMake Error at extern/amrex/Tools/CMake/AMReXParallelBackends.cmake:288 (find_package):
  Could not find a package configuration file provided by "hiprand" with any
  of the following names:

    hiprandConfig.cmake
    hiprand-config.cmake

  Add the installation prefix of "hiprand" to CMAKE_PREFIX_PATH or set
  "hiprand_DIR" to a directory containing one of the above files.  If
  "hiprand" provides a separate development package or SDK, be sure it has
  been installed.
Call Stack (most recent call first):
  extern/amrex/Src/CMakeLists.txt:40 (include)

Presumably, this is caused by: https://github.com/PawseySC/rocm-from-source/issues/16

I opened a help ticket about this before, but it took several days to get routed to you all. Can you fix this on Setonix?

pelahi commented 1 year ago

Hi @BenWibking , I will try to fix the installation for 5.4.3 when setonix comes back from maintenance.

BenWibking commented 1 year ago

This is still not working for us.

pelahi commented 1 year ago

Hi @BenWibking , I am still trying to work out why it didn't produce hiprand. And I likely cannot update the module to produce a new rocm build as it might disrupt others. However, I am looking for a non-disruptive way of updating the software stack.

pelahi commented 1 year ago

Hi @BenWibking , just to clarify: the installation process for rocm is pretty terrible and constantly changing, so it is hard to keep track of how to change the installation scripts. However, the current issue is that rocrand (the curand equivalent) is there but, for reasons that are unclear to me, the hiprand includes, which define the hip API, were not produced. I am very confused as to why they weren't.

BenWibking commented 1 year ago

@dipietrantonio How did you fix this problem when I reported this a few months ago?

pelahi commented 1 year ago

Hi @BenWibking , I've updated the installation scripts and now it should be there. Can you try it?

dipietrantonio commented 1 year ago

Hi @BenWibking @pelahi , as pointed out in the first comment, this was simply an oversight on my side; so many projects! I just need to add the cmake_build line specified in the mentioned issue.

BenWibking commented 1 year ago

It still does not work:

bwibking@setonix-02:/software/projects/pawsey0807/bwibking/quokka/build> cmake .. -C ../cmake/setonix.cmake
loading initial cache file ../cmake/setonix.cmake
-- Cray Programming Environment 2.7.20 CXX
-- Cray Programming Environment 2.7.20 C
-- CMake version: 3.24.3
-- AMReX installation directory: /usr/local
-- Build type set by user to 'Release'.
-- Building AMReX with AMReX_SPACEDIM = 3
-- Configuring AMReX with the following options enabled:
--    AMReX_PRECISION = DOUBLE
--    AMReX_GPU_BACKEND = HIP
--    AMReX_GPU_MAX_THREADS = 256
--    AMReX_GPU_RDC
--    AMReX_MPI
--    AMReX_LINEAR_SOLVERS
--    AMReX_PARTICLES
--    AMReX_PARTICLES_PRECISION = DOUBLE
--    AMReX_TINY_PROFILE
-- hip::amdhip64 is SHARED_LIBRARY
-- Found HIP:
-- HIP: Runtime=rocclr Compiler=clang Path=/software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1/rocm-5.4.3rev1
-- hip::amdhip64 is SHARED_LIBRARY
-- hip::amdhip64 is SHARED_LIBRARY
CMake Error at extern/amrex/Tools/CMake/AMReXParallelBackends.cmake:288 (find_package):
  Could not find a package configuration file provided by "hiprand" with any
  of the following names:

    hiprandConfig.cmake
    hiprand-config.cmake

  Add the installation prefix of "hiprand" to CMAKE_PREFIX_PATH or set
  "hiprand_DIR" to a directory containing one of the above files.  If
  "hiprand" provides a separate development package or SDK, be sure it has
  been installed.
Call Stack (most recent call first):
  extern/amrex/Src/CMakeLists.txt:40 (include)

These are the modules I have loaded:

bwibking@setonix-02:/software/projects/pawsey0807/bwibking/quokka/build> module list

Currently Loaded Modules:
  1) craype-x86-milan         5) xpmem/2.5.2-2.4_3.47__gd0f7936.shasta   9) slurm/22.05.2          13) PrgEnv-cray/8.3.3        17) cray-mpich/8.1.25
  2) libfabric/1.15.2.0       6) pawseyenv/2023.08                      10) craype/2.7.20          14) craype-accel-amd-gfx90a  18) cray-hdf5/1.12.2.3
  3) craype-network-ofi       7) pawsey                                 11) cray-dsmml/0.2.2       15) cce/15.0.1               19) cray-python/3.9.13.1
  4) perftools-base/23.03.0   8) pawseytools                            12) cray-libsci/23.02.1.1  16) cmake/3.24.3             20) rocm/5.4.3

fluidnumerics-joe commented 1 year ago

A couple points that may be helpful (or not?) here

Looks like rocrand is available in the 5.4.3 install, but hiprand is not

jschoonover@setonix-02:~> ls -l /software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1/rocm-5.4.3rev1
total 164
lrwxrwxrwx  1 spack spack    85 Sep 15 13:47 amdgcn -> /software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1//rocm-5.4.3rev1/llvm/amdgcn
drwxr-sr-x  5 spack spack  4096 Sep 15 14:36 atmi
drwxr-xr-x  2 spack spack  4096 Sep 20 13:40 bin
drwxr-sr-x  6 spack spack  4096 Sep 15 13:51 hip
drwxr-xr-x  4 spack spack  4096 Sep 15 14:18 hipblas
drwxr-xr-x  3 spack spack  4096 Sep 15 14:19 hipcub
drwxr-xr-x  4 spack spack  4096 Sep 15 14:27 hipfft
drwxr-sr-x  3 spack spack  4096 Sep 15 14:28 hipfort
drwxr-xr-x  7 spack spack  4096 Sep 15 13:52 hipify
drwxr-xr-x  4 spack spack  4096 Sep 15 14:18 hipsolver
drwxr-xr-x  4 spack spack  4096 Sep 15 14:19 hipsparse
drwxr-sr-x  4 spack spack  4096 Sep 15 13:47 hsa
drwxr-xr-x 30 spack spack 12288 Sep 20 13:40 include
lrwxrwxrwx  1 spack spack    79 Sep 15 13:17 lib -> /software/setonix/2023.08/pawsey/software/rocm/cce/15.0.1//rocm-5.4.3rev1/lib64
drwxr-sr-x  6 spack spack 12288 Sep 20 13:40 lib64
drwxr-sr-x  6 spack spack  4096 Sep 15 14:36 libexec
drwxr-sr-x  9 spack spack  4096 Sep 15 13:47 llvm
drwxr-sr-x  4 spack spack  4096 Sep 15 13:49 oam
drwxr-sr-x  7 spack spack  4096 Sep 15 13:49 opencl
drwxr-sr-x  4 spack spack  4096 Sep 15 14:36 rccl
drwxr-xr-x  4 spack spack  4096 Sep 15 14:18 rocalution
drwxr-xr-x  4 spack spack  4096 Sep 15 14:09 rocblas
drwxr-xr-x  4 spack spack  4096 Sep 15 14:27 rocfft
-rw-r--r--  1 spack spack  4403 Sep 15 23:02 rocm_setup.sh
drwxr-sr-x  4 spack spack  4096 Sep 15 13:49 rocm_smi
drwxr-xr-x  4 spack spack  4096 Sep 15 14:16 rocprim
drwxr-sr-x  6 spack spack  4096 Sep 15 13:51 rocprofiler
drwxr-xr-x  4 spack spack  4096 Sep 15 14:09 rocrand
drwxr-xr-x  4 spack spack  4096 Sep 15 14:16 rocsolver
drwxr-xr-x  4 spack spack  4096 Sep 15 14:17 rocsparse
drwxr-xr-x  3 spack spack  4096 Sep 15 14:28 rocthrust
drwxr-sr-x  4 spack spack  4096 Sep 15 13:51 roctracer
drwxr-xr-x 11 spack spack  4096 Sep 15 14:28 share
drwxr-sr-x  3 spack spack  4096 Sep 15 13:52 src
drwxr-sr-x  3 spack spack  4096 Sep 15 13:51 test
drwxr-sr-x  3 spack spack  4096 Sep 15 13:53 x86_64-pc-linux-gnu

@dipietrantonio - is this the line you're talking about here?

I also noticed that the CMAKE_PREFIX_PATH environment variable is not set when the rocm module is loaded. Perhaps it would also be helpful to append a few paths to this environment variable to help out CMake. For ROCm 5.4.3, I believe that rocrand and hiprand configs are at

In the modulefiles, you could add:

export CMAKE_PREFIX_PATH=$CMAKE_PREFIX_PATH:${ROCM_INSTALL_DIR}/rocrand/lib/cmake/:${ROCM_INSTALL_DIR}/hiprand/lib/cmake/
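
A slightly more defensive variant of the export above, as a minimal sketch: the helper name `append_cmake_prefix_path` is hypothetical, and the `/opt/rocm` default for `ROCM_INSTALL_DIR` is only a placeholder for illustration. It avoids a leading empty entry when `CMAKE_PREFIX_PATH` starts out unset and skips directories that are already listed.

```shell
# Hypothetical helper: append a directory to CMAKE_PREFIX_PATH without
# leaving an empty leading entry when the variable starts out unset or empty.
append_cmake_prefix_path() {
    acpp_dir="$1"
    case ":${CMAKE_PREFIX_PATH:-}:" in
        *":${acpp_dir}:"*)
            ;;  # already present, nothing to do
        *)
            # Only add the ':' separator if the variable is non-empty
            CMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH:+${CMAKE_PREFIX_PATH}:}${acpp_dir}"
            ;;
    esac
    export CMAKE_PREFIX_PATH
}

# ROCM_INSTALL_DIR is the variable named in the comment above;
# the /opt/rocm fallback is an assumption for illustration only.
ROCM_INSTALL_DIR="${ROCM_INSTALL_DIR:-/opt/rocm}"
append_cmake_prefix_path "${ROCM_INSTALL_DIR}/rocrand/lib/cmake"
append_cmake_prefix_path "${ROCM_INSTALL_DIR}/hiprand/lib/cmake"
```

Whether these two subdirectories are the right ones depends on how the ROCm tree was installed, so they are worth checking against the actual layout before going into a modulefile.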

dipietrantonio commented 1 year ago

Hi @BenWibking , I misread your comment yesterday. What I did to fix this a few months ago was to simply run the following line interactively

cmake -DCMAKE_INSTALL_PREFIX=$ROCM_PATH -DHIP_ROOT=$ROCM_PATH -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_C_COMPILER=hipcc -DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx90a ..

within the build directory created by the scripts. hipRAND was cloned but not installed. @pelahi you did the installation for the PrgEnv-cray, so you know where this build directory is. You should simply move into the hipRAND project folder and run the above line (exporting the required variables). I suspect Pascal added the above line to the scripts but did not actually execute it.
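
For reference, the manual step described above could be wrapped roughly as follows. This is only a sketch under assumptions: the function name `build_hiprand`, the `DRY_RUN` switch, and the location of the hipRAND checkout are all hypothetical, and the final `cmake --build . --target install` step is my reading of "cloned but not installed" rather than something taken from the scripts.

```shell
# Sketch: configure, build, and install hipRAND from an existing checkout.
# $1 is the (hypothetical) path to the cloned hipRAND source tree.
# ROCM_PATH is the install prefix used in the cmake line above.
# Set DRY_RUN to any non-empty value to print the commands instead of running them.
build_hiprand() {
    bh_src="$1"
    bh_run=${DRY_RUN:+echo}
    (
        # Subshell so the caller's working directory is left untouched
        $bh_run mkdir -p "${bh_src}/build" &&
        $bh_run cd "${bh_src}/build" &&
        $bh_run cmake -DCMAKE_INSTALL_PREFIX="${ROCM_PATH}" -DHIP_ROOT="${ROCM_PATH}" \
            -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_C_COMPILER=hipcc \
            -DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx90a .. &&
        $bh_run cmake --build . --target install
    )
}
```

Running `DRY_RUN=1 build_hiprand /path/to/hipRAND` first shows what would be executed without touching the installation.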

dipietrantonio commented 1 year ago

@fluidnumerics-joe I refer to the line pointed out in #16 . Also, you make a valid point about setting the CMAKE_PREFIX_PATH variable, and in fact it has already been pointed out in #14 . Thanks!

fluidnumerics-joe commented 1 year ago

@dipietrantonio - late to the game on my part ; thanks for filling me in here.

BenWibking commented 1 year ago

I am still getting the same error:

CMake Error at extern/amrex/Tools/CMake/AMReXParallelBackends.cmake:288 (find_package):
  Could not find a package configuration file provided by "hiprand" with any
  of the following names:

    hiprandConfig.cmake
    hiprand-config.cmake

  Add the installation prefix of "hiprand" to CMAKE_PREFIX_PATH or set
  "hiprand_DIR" to a directory containing one of the above files.  If
  "hiprand" provides a separate development package or SDK, be sure it has
  been installed.
Call Stack (most recent call first):
  extern/amrex/Src/CMakeLists.txt:40 (include)

pelahi commented 1 year ago

Hi @BenWibking , so on setonix, if you use the PrgEnv-gnu module list to load rocm/5.4.3, you'll see that the hiprand stuff is there:

$ module load rocm/5.4.3
$ lfs find ${HIP_PATH} -name "*hiprand*"
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/libhiprand.so.1.1
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-targets.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-config-version.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-config.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-fortran-config.cmake
/software/setonix/2023.08/pawsey/software/rocm/gcc/12.2.0//rocm-5.4.3rev1/lib64/cmake/hiprand/hiprand-targets-release.cmake

Maybe have a look as to why the cmake macro is not finding it.
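
One way to narrow this down, as a rough sketch: search the ROCm prefix for either of the two file names CMake reports looking for, and print the containing directory so it can be passed as `hiprand_DIR`. The function name `find_hiprand_dir` is hypothetical.

```shell
# Print the directory holding a hiprand CMake package config under a prefix.
# Returns non-zero (and prints nothing) if no config file is found.
find_hiprand_dir() {
    fhd_cfg=$(find -H "$1" \( -name 'hiprand-config.cmake' -o -name 'hiprandConfig.cmake' \) \
        -print 2>/dev/null | head -n 1)
    [ -n "$fhd_cfg" ] && dirname "$fhd_cfg"
}
```

If it prints a directory, something like `cmake .. -C ../cmake/setonix.cmake -Dhiprand_DIR="$(find_hiprand_dir "$ROCM_PATH")"` should point `find_package` at it directly. If it prints nothing, the config files are genuinely missing from that install.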

BenWibking commented 1 year ago

@pelahi For various reasons, we have to use the PrgEnv-cray environment.

pelahi commented 1 year ago

@BenWibking , the PrgEnv-cray has been updated. Can you try it now?

BenWibking commented 1 year ago

> @BenWibking , the PrgEnv-cray has been updated. Can you try it now?

It works now. Thanks.