xsdk-project / xsdk-issues

A repository under which GitHub issues not related to a specific xSDK repo can be filed.

Hypre cannot find cublas/cusparse/curand on Perlmutter #207

Closed by pghysels 1 year ago

pghysels commented 1 year ago

This is on Perlmutter, with GCC and the system's CUDA:

$ spack install hypre+cuda cuda_arch=80
==> cray-mpich@8.1.17 : has external module in ['cray-mpich/8.1.17', 'libfabric']
[+] /opt/cray/pe/mpich/8.1.17/ofi/gnu/9.1 (external cray-mpich-8.1.17-wo7t6ge5s4tsni4lbnazm2obvj2mv57t)
==> cuda@11.7 : has external module in ['cudatoolkit/11.7']
[+] /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/cuda/11.7 (external cuda-11.7-qv33igm66yr7q42hcsrlcgmim7e25kuf)
[+] /usr (external perl-5.26.1-3te4migtm33gp7xhuvnlgevwrqkzuflh)
[+] /pscratch/sd/p/pghysels/spack-xsdk/opt/spack/linux-sles15-zen3/gcc-11.2.0/openblas-0.3.21-hioz4ypfraws55rzphvhnzt5kahideeo
==> Installing hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow
==> No binary for hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow found: installing from source
==> Using cached archive: /pscratch/sd/p/pghysels/spack-xsdk/var/spack/cache/_source-cache/archive/c2/c214084bddc61a06f3758d82947f7f831e76d7e3edeac2c78bb82d597686e05d.tar.gz
==> No patches needed for hypre
==> hypre: Executing phase: 'autoreconf'
==> hypre: Executing phase: 'configure'
==> hypre: Executing phase: 'build'
==> Error: ProcessError: Command exited with status 2:
    'make' '-j16'

6 errors found in build log:
     1380    make[1]: Leaving directory '/tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/distri
             buted_ls'
     1381    
     1382    Making lib ...
     1383    make[1]: Entering directory '/tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/lib'
     1384    Building libHYPRE.so ...
     1385    /opt/cray/pe/mpich/8.1.17/ofi/gnu/9.1/bin/mpicxx -shared  -o libHYPRE-2.26.0.so   /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0
             -mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/IJ_mv/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw
             5xvlt2zow/spack-src/src/distributed_ls/Euclid/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zo
             w/spack-src/src/distributed_ls/ParaSails/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spa
             ck-src/src/distributed_ls/pilut/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/sr
             c/distributed_matrix/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/krylov/*.
             o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/matrix_matrix/*.o /tmp/pghysels/
             spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/multivector/*.o /tmp/pghysels/spack-stage/spack-
             stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/parcsr_ls/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-m
             vnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/parcsr_mv/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6
             bw5xvlt2zow/spack-src/src/parcsr_block_mv/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/sp
             ack-src/src/seq_mv/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/sstruct_ls/
             *.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/sstruct_mv/*.o /tmp/pghysels/s
             pack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/struct_ls/*.o /tmp/pghysels/spack-stage/spack-sta
             ge-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/struct_mv/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnm
             pfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/utilities/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5
             xvlt2zow/spack-src/src/blas/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/la
             pack/*.o /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/IJ_mv/*.obj /tmp/pghysels
             /spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/parcsr_ls/*.obj /tmp/pghysels/spack-stage/spack
             -stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/parcsr_mv/*.obj /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.
             0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/seq_mv/*.obj /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzek
             c6bw5xvlt2zow/spack-src/src/sstruct_ls/*.obj /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spa
             ck-src/src/sstruct_mv/*.obj /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/struct
             _ls/*.obj /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/struct_mv/*.obj /tmp/pgh
             ysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/utilities/*.obj  -L/opt/cray/pe/mpich/8.1.
             17/ofi/gnu/9.1/lib   -L/pscratch/sd/p/pghysels/spack-xsdk/opt/spack/linux-sles15-zen3/gcc-11.2.0/openblas-0.3.21-hioz4ypfraws55rzphv
             hnzt5kahideeo/lib  -lopenblas -L/pscratch/sd/p/pghysels/spack-xsdk/opt/spack/linux-sles15-zen3/gcc-11.2.0/openblas-0.3.21-hioz4ypfra
             ws55rzphvhnzt5kahideeo/lib  -lopenblas -lm  -L/opt/nvidia/hpc_sdk/Linux_x86_64/22.5/cuda/11.7/lib64 -lcudart -lcusparse -lcublas -lc
             urand          -Wl,-soname,libHYPRE-2.26.0.so -Wl,-z,defs
  >> 1386    /usr/bin/ld: cannot find -lcusparse
  >> 1387    /usr/bin/ld: cannot find -lcublas
  >> 1388    /usr/bin/ld: cannot find -lcurand
  >> 1389    collect2: error: ld returned 1 exit status
  >> 1390    make[1]: *** [Makefile:129: libHYPRE.so] Error 1
     1391    make[1]: Leaving directory '/tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-src/src/lib'
  >> 1392    make: *** [Makefile:86: all] Error 1

See build log for details:
  /tmp/pghysels/spack-stage/spack-stage-hypre-2.26.0-mvnmpfan7a5xebthzekc6bw5xvlt2zow/spack-build-out.txt

To use the system's CUDA libraries, I have this in ~/.spack/packages.yaml:

  cuda:
    buildable: false
    externals:
    - spec: cuda@11.7
      modules:
      - cudatoolkit/11.7

On Perlmutter, the CUDA math libs are in a separate folder from the main CUDA libraries:

  /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/cuda/11.7/lib64
  /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/math_libs/11.7/lib64

spack-build-out.txt

osborn9 commented 1 year ago

> On Perlmutter, the CUDA math libs are in a separate folder from the main CUDA libraries:
> /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/cuda/11.7/lib64
> /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/math_libs/11.7/lib64

This will cause issues, since we make assumptions about the location of the CUDA libs (e.g., that they are all located in the same place under ${CUDA_HOME}). We have a workaround for this when building hypre manually, but no fix in Spack.
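
To confirm the split layout on a given system, one can check which directory actually holds each library named on the failing link line. The following is a minimal shell sketch, not part of hypre or Spack. The helper name `find_libs` is hypothetical, and the two directories are simply the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from hypre or Spack): for each library name,
# print the first directory in a colon-separated list that contains it.
# Usage: find_libs "dir1:dir2" name1 name2 ...
find_libs() {
    dirs=$1
    shift
    for name in "$@"; do
        found=""
        old_ifs=$IFS
        IFS=:                       # split the directory list on ':'
        for dir in $dirs; do
            # Match libNAME.so and versioned names such as libNAME.so.11
            for f in "$dir"/lib"$name".so*; do
                if [ -e "$f" ]; then
                    found=$dir
                    break
                fi
            done
            if [ -n "$found" ]; then
                break
            fi
        done
        IFS=$old_ifs
        echo "$name: ${found:-not found}"
    done
}

# Directories as reported in this thread for Perlmutter (assumed layout).
HPC_SDK=/opt/nvidia/hpc_sdk/Linux_x86_64/22.5
find_libs "$HPC_SDK/cuda/11.7/lib64:$HPC_SDK/math_libs/11.7/lib64" \
    cudart cusparse cublas curand
```

If the layout is as described above, cudart should be reported under cuda/11.7/lib64 and the other three under math_libs/11.7/lib64. That would explain why a single -L flag pointing at the CUDA lib64 directory is not enough for the link step.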

pghysels commented 1 year ago

Maybe we just need to:

$ export LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/22.5/math_libs/11.7/lib64:$LIBRARY_PATH

I'll try that soon. The Perlmutter scratch filesystem seems to be down at the moment.

pghysels commented 1 year ago

That didn't work.

pghysels commented 1 year ago

I verified that this is fixed by this PR: https://github.com/hypre-space/hypre/pull/788