Unable to build a C++ Hadrons application for GPU

edbennett commented 2 years ago

I'm trying to use Hadrons on Tursa, but am not able to compile an app.

What I have tried:

Configure Grid using:

../configure \
    --prefix ${HOME}/prefix2 \
    --enable-comms=mpi \
    --enable-simd=GPU \
    --enable-shm=nvlink \
    --enable-gen-simd-width=64 \
    --enable-accelerator=cuda \
    --enable-Nc=2 \
    --enable-accelerator-cshift \
    --disable-unified \
    --disable-gparity \
    --with-lime=${HOME}/prefix \
    CXX=nvcc \
    LDFLAGS="-cudart shared" \
    CXXFLAGS="-ccbin mpicxx -gencode arch=compute_80,code=sm_80 -std=c++14 -cudart shared"

Then make and install.

Configure Hadrons using:

../configure --with-grid=${HOME}/prefix2 --prefix=${HOME}/prefix2

Then make and install.

Follow the instructions in the app tutorial to build a test app. Configure using:
```
../configure --with-grid=${HOME}/prefix2 --with-hadrons=${HOME}/prefix2
```

The configure step fails, as it sets CXX=g++ rather than the nvcc that was used for compiling Grid and Hadrons. This conflicts with the CXXFLAGS, which are written for nvcc and are not recognised by g++.

configure:4458: g++ -o conftest  -g -O2 -I/home/dp208/dp208/dc-benn2/prefix2/include -I/home/dp208/dp208/dc-benn2/prefix2/include -I/home/dp208/dp208/dc-benn2/prefix2/include -I/home/dp208/dp208/dc-benn2/prefix/include -O3 -ccbin mpicxx -gencode arch=compute_80,code=sm_80 -std=c++14 -cudart shared -Xcompiler -fno-strict-aliasing --expt-extended-lambda --expt-relaxed-constexpr -Xcompiler -fopenmp    -L/home/dp208/dp208/dc-benn2/prefix2/lib -L/home/dp208/dp208/dc-benn2/prefix2/lib -L/home/dp208/dp208/dc-benn2/prefix2/lib -L/home/dp208/dp208/dc-benn2/prefix/lib -cudart shared -Xcompiler -fopenmp conftest.cpp  -lHadrons  -ldl -lGrid -lz -lcrypto -llime -lmpfr -lgmp -lstdc++ -lm -lcuda -lz >&5
g++: error: mpicxx: No such file or directory
g++: error: unrecognized debug output level 'encode'
g++: error: arch=compute_80,code=sm_80: No such file or directory
g++: error: shared: No such file or directory
g++: error: shared: No such file or directory
g++: error: unrecognized command line option '-ccbin'
g++: error: unrecognized command line option '-cudart'
g++: error: unrecognized command line option '-Xcompiler'; did you mean '--compile'?
g++: error: unrecognized command line option '--expt-extended-lambda'
g++: error: unrecognized command line option '--expt-relaxed-constexpr'
g++: error: unrecognized command line option '-Xcompiler'; did you mean '--compile'?
g++: error: unrecognized command line option '-cudart'
g++: error: unrecognized command line option '-Xcompiler'; did you mean '--compile'?

If I manually set CXX=nvcc, then I get a huge number of errors, where nvcc doesn't recognise the CUDA functions and datatypes; the first couple are, for example:

/home/dp208/dp208/dc-benn2/prefix2/include/Grid/threads/Accelerator.h:109:8: error: 'cudaStream_t' does not name a type; did you mean 'CUstream_st'?
 extern cudaStream_t copyStream;
        ^~~~~~~~~~~~
        CUstream_st
/home/dp208/dp208/dc-benn2/prefix2/include/Grid/threads/Accelerator.h:106:28: error: '__host__' does not name a type; did you mean 'CUhostFn'?
 #define accelerator_inline __host__ __device__ inline

Manually setting CXXFLAGS="" doesn't make any difference, as configure picks up the additional CXXFLAGS from hadrons-config.
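
A quick way to see whether an inherited flag set can only work with nvcc is to scan it for the nvcc-specific options that g++ rejected in the log above. The following is a minimal sketch in POSIX sh; `needs_nvcc` is a hypothetical helper name, and the list of options is taken only from the errors in this issue, so it is not exhaustive.

```shell
#!/bin/sh
# Hypothetical helper: succeed (return 0) if the given flags contain options
# that only nvcc understands. The option list comes from the g++ errors quoted
# in this issue and is not a complete list of nvcc options.
needs_nvcc() {
    case " $* " in
        *" -ccbin "*|*" -gencode "*|*" -cudart "*|*" -Xcompiler "*|*" --expt-"*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Flags copied from the Grid configure line in this issue.
flags="-O3 -ccbin mpicxx -gencode arch=compute_80,code=sm_80 -std=c++14 -cudart shared"

# $flags is left unquoted on purpose so that it splits into separate words.
if needs_nvcc $flags; then
    echo "these flags need nvcc; g++ will reject them"
fi
```

To check a real installation, the flag string could be taken from the config script instead, for example `needs_nvcc $(hadrons-config --cxxflags)`. The `--cxxflags` option is an assumption here; check the script's help output for the options it actually provides.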

Am I missing a step or flags that would allow me to configure to compile for GPU, or is this not currently supported?

Many thanks

aportelli commented 2 years ago

Hi @edbennett, sorry, forwarding the nvcc flags is a bit tedious. This is fully supported and we are using it in production. On Tursa I use:

../../configure --with-grid=${HOME}/local/grid --prefix=${HOME}/local/grid CC=gcc CXX='nvcc -x cu'

where you need to change the paths to match your setup. I'll try to improve this in the future, but would also welcome PRs 😉.
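
Some background on why `-x cu` helps: nvcc chooses how to treat an input file from its suffix, so a `.cpp` file is passed to the host compiler as plain C++ unless `-x cu` forces it to be compiled as CUDA source. That is the likely reason for the `cudaStream_t` and `__host__` errors when only `CXX=nvcc` is set. Applied to the application configure step from the original report, the same idea might look like the sketch below. The paths are the ones used earlier in this issue, and the combination has not been verified here.

```shell
# Sketch only: the application configure line from the report, with the
# compiler settings suggested above. Adjust the paths to your own install.
../configure \
    --with-grid=${HOME}/prefix2 \
    --with-hadrons=${HOME}/prefix2 \
    CC=gcc \
    CXX='nvcc -x cu'
```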

edbennett commented 2 years ago

Thanks Antonin, that has let me progress. Worth noting, in case anyone else stumbles across this issue, that this severely breaks the CXXLD step: it tries to interpret the binary as source code, and then outputs garbage to stdout and stderr seemingly indefinitely. As a workaround, I ran make -n, copied the CXXLD line, removed the -x cu argument, and re-ran it by hand; it then at least attempted to link.
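
The manual relink can be scripted. The following is a minimal sketch in POSIX sh that drops every `-x cu` pair from a command line; `strip_x_cu` is a hypothetical helper name, and the example link line is made up for illustration rather than copied from a real build.

```shell
#!/bin/sh
# Hypothetical helper: print the arguments with every "-x cu" pair removed.
# Arguments are re-joined with single spaces, so this is only suitable for
# command lines whose arguments contain no spaces or quoting.
strip_x_cu() {
    out=""
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "-x" ] && [ "${2:-}" = "cu" ]; then
            shift 2          # skip both "-x" and "cu"
            continue
        fi
        out="${out:+$out }$1"
        shift
    done
    printf '%s\n' "$out"
}

# Made-up link line, standing in for the CXXLD line shown by `make -n`.
strip_x_cu nvcc -x cu -o my-app main.o -lHadrons -lGrid
```

The printed line can then be run by hand in place of the failing link command. Because automake-generated Makefiles normally define `CXXLD = $(CXX)`, overriding it on the command line (`make CXXLD=nvcc`) may achieve the same result without editing anything, though that route was not tried in this thread.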

If I have some spare time I'll see if I can make the process any smoother, but can't offer a timeline for that.

aportelli commented 2 years ago

@edbennett no worries, and thanks for the info. Is it OK to close the issue?

edbennett commented 2 years ago

👍

aportelli / Hadrons, issue #84