beagle-dev / beagle-lib

general purpose library for evaluating the likelihood of sequence evolution on trees
MIT License

Compile ERROR: Value 'compute_13' is not defined for option 'gpu-architecture' #82

Open jba462 opened 9 years ago

jba462 commented 9 years ago

I am trying to install the BEAGLE library on Ubuntu 14.04. I have installed CUDA 7 and am running an NVIDIA GeForce GTX 650 graphics card.

The configuration step works, but the build fails during installation. It seems that compute_13 is no longer supported in CUDA 7, yet it is the hard-coded architecture option in the makefile. I tried changing the -arch setting to sm_30 in the makefile under 'libhmsbeagle/GPU/kernels/'; the installation then completed, but the 'make check' step failed.

How can I get around this?

I have included the output from the attempted compilation below:

~$ ./autogen.sh

libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `.config'.
libtoolize: copying file `.config/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:60: installing '.config/compile'
configure.ac:66: installing '.config/config.guess'
configure.ac:66: installing '.config/config.sub'
configure.ac:58: installing '.config/install-sh'
configure.ac:58: installing '.config/missing'
Makefile.am: installing './INSTALL'
examples/complextest/Makefile.am: installing '.config/depcomp'
parallel-tests: installing '.config/test-driver'

~$ ./configure --prefix=$HOME

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking whether ln -s works... yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... no
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for library containing lt_dlinit... -lltdl
checking if gcc accepts -dumpversion option... yes
checking gcc version... 4.8
checking for /usr/include/CL... yes
checking for /usr/local/cuda/... yes
checking for /usr/local/cuda/include... yes
checking for /usr/local/cuda/lib... no
checking for nvcc... /usr/local/cuda-7.0/bin/nvcc
checking for x86 cpuid output... unknown
checking for x86 cpuid 0x00000001 output... 306c3:7100800:7ffafbff:bfebfbff
checking whether mmx is supported... yes
checking whether sse is supported... yes
checking whether sse2 is supported... yes
checking whether sse3 is supported... yes
checking whether ssse3 is supported... yes
checking whether sse4.1 is supported... yes
checking whether sse4.2 is supported... yes
checking whether avx is supported... yes
checking whether C compiler accepts -mmmx... yes
checking whether C compiler accepts -msse... yes
checking whether C compiler accepts -msse2... yes
checking whether C compiler accepts -msse3... yes
checking whether C compiler accepts -mssse3... yes
checking whether C compiler accepts -msse4.1... yes
checking whether C compiler accepts -msse4.2... yes
checking whether C compiler accepts -mavx... yes
checking cpuid.h usability... yes
checking cpuid.h presence... yes
checking for cpuid.h... yes
checking for javac... javac
checking for javac... /usr/bin/javac
checking symlink for /usr/bin/javac... /etc/alternatives/javac
checking symlink for /etc/alternatives/javac... /usr/lib/jvm/java-7-openjdk-amd64/bin/javac
checking for doxygen... no
configure: WARNING: doxygen not found - will not generate any doxygen documentation
checking for perl... /usr/bin/perl
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating hmsbeagle-1.pc
config.status: creating libhmsbeagle/Makefile
config.status: creating libhmsbeagle/GPU/Makefile
config.status: creating libhmsbeagle/GPU/kernels/Makefile
config.status: creating libhmsbeagle/CPU/Makefile
config.status: creating libhmsbeagle/plugin/Makefile
config.status: creating libhmsbeagle/JNI/Makefile
config.status: creating examples/Makefile
config.status: creating examples/tinytest/Makefile
config.status: creating examples/complextest/Makefile
config.status: creating examples/oddstatetest/Makefile
config.status: creating examples/fourtaxon/Makefile
config.status: creating examples/genomictest/Makefile
config.status: creating examples/matrixtest/Makefile
config.status: creating libhmsbeagle/config.h
config.status: executing depfiles commands
config.status: executing libtool commands

~$ make install

Making install in libhmsbeagle
make[1]: Entering directory `/home/joseph/bin/beagle-lib-master/libhmsbeagle'
Making install in GPU
make[2]: Entering directory `/home/joseph/bin/beagle-lib-master/libhmsbeagle/GPU'
Making install in kernels
make[3]: Entering directory `/home/joseph/bin/beagle-lib-master/libhmsbeagle/GPU/kernels'
echo "// auto-generated header file with CUDA kernels PTX code" > BeagleCUDA_kernels.h
/usr/local/cuda-7.0/bin/nvcc -o BeagleCUDA_kernels.ptx -ptx -DCUDA -DSTATE_COUNT=4 \
    ./kernels4.cu -O3 -DHAVE_CONFIG_H -I/home/joseph/bin/beagle-lib-master -I/home/joseph/bin/beagle-lib-master
echo "#define KERNELS_STRING_SP_4 \"" | sed 's/$/\\n\\/' >> BeagleCUDA_kernels.h
cat BeagleCUDA_kernels.ptx | sed 's/\"/\\"/g' | sed 's/$/\\n\\/' >> BeagleCUDA_kernels.h
echo "\"" >> BeagleCUDA_kernels.h
for s in 16 32 48 64 80 128 192; do \
    echo "Making state count = $s" ; \
    (/usr/local/cuda-7.0/bin/nvcc -o BeagleCUDA_kernels.ptx -ptx -DCUDA -DSTATE_COUNT=$s \
    ./kernelsX.cu -O3 -DHAVE_CONFIG_H -I/home/joseph/bin/beagle-lib-master -I/home/joseph/bin/beagle-lib-master) || exit; \
    echo "#define KERNELS_STRING_SP_$s \"" | sed 's/$/\\n\\/' >> BeagleCUDA_kernels.h; \
    cat BeagleCUDA_kernels.ptx | sed 's/\"/\\"/g' | sed 's/$/\\n\\/' >> BeagleCUDA_kernels.h; \
    echo "\"" >> BeagleCUDA_kernels.h; \
done
Making state count = 16
Making state count = 32
Making state count = 48
Making state count = 64
Making state count = 80
Making state count = 128
Making state count = 192
/usr/local/cuda-7.0/bin/nvcc -o BeagleCUDA_kernels.ptx -ptx -arch compute_13 -DCUDA -DSTATE_COUNT=4 -DDOUBLE_PRECISION \
    ./kernels4.cu -O3 -DHAVE_CONFIG_H -I/home/joseph/bin/beagle-lib-master -I/home/joseph/bin/beagle-lib-master
nvcc fatal : Value 'compute_13' is not defined for option 'gpu-architecture'
make[3]: *** [BeagleCUDA_kernels.h] Error 1
make[3]: Leaving directory `/home/joseph/bin/beagle-lib-master/libhmsbeagle/GPU/kernels'
make[2]: *** [install-recursive] Error 1
make[2]: Leaving directory `/home/joseph/bin/beagle-lib-master/libhmsbeagle/GPU'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/home/joseph/bin/beagle-lib-master/libhmsbeagle'
make: *** [install-recursive] Error 1

josephwb commented 9 years ago

I am also having this problem with CUDA 7.0. nvcc info:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Mon_Feb_16_22:59:02_CST_2015
Cuda compilation tools, release 7.0, V7.0.27

It seems that CUDA 7 does not support compute_13:

nvcc --help
...
--gpu-architecture <arch>                  (-arch)                           
        Specify the name of the class of NVIDIA 'virtual' GPU architecture for which
        the CUDA input files must be compiled.
        With the exception as described for the shorthand below, the architecture
        specified with this option must be a 'virtual' architecture (such as compute_20).
        Normally, this option alone does not trigger assembly of the generated PTX
        for a 'real' architecture (that is the role of nvcc option '--gpu-code',
        see below); rather, its purpose is to control preprocessing and compilation
        of the input to PTX.
        For convenience, in case of simple nvcc compilations, the following shorthand
        is supported.  If no value for option '--gpu-code' is specified, then the
        value of this option defaults to the value of '--gpu-architecture'.  In this
        situation, as only exception to the description above, the value specified
        for '--gpu-architecture' may be a 'real' architecture (such as a sm_20),
        in which case nvcc uses the specified 'real' architecture and its closest
        'virtual' architecture as effective architecture values.  For example, 'nvcc
        --gpu-architecture=sm_20' is equivalent to 'nvcc --gpu-architecture=compute_20
        --gpu-code=sm_20,compute_20'.
 -->    Allowed values for this option:  'compute_20','compute_30','compute_32',
        'compute_35','compute_37','compute_50','compute_52','compute_53','sm_20',
        'sm_21','sm_30','sm_32','sm_35','sm_37','sm_50','sm_52','sm_53'.
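
The allowed-values list above can also be checked mechanically before editing any makefile. The following is a minimal sketch, not part of beagle-lib: `arch_supported` is a hypothetical helper name, and it assumes that `nvcc --help` prints each allowed architecture in single quotes, as it does in the CUDA 7.0 output quoted here. Other CUDA releases may format the list differently.

```shell
# Hypothetical helper (not part of beagle-lib or CUDA): succeeds if the given
# architecture name appears, single-quoted, in nvcc help text read from stdin.
# Assumption: allowed values are printed as 'compute_20','sm_20', ... as above.
arch_supported() {
    # $1: architecture name, e.g. compute_13 or sm_30
    grep -qF -- "'$1'"
}

# Typical use, assuming nvcc is on the PATH:
#   nvcc --help | arch_supported compute_13 || echo "compute_13 is not accepted"
```

Matching on the quoted form avoids false hits on prose mentions such as "(such as compute_20)" in the option description.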

As far as I can see, compute_13 is hardcoded in beagle-lib/libhmsbeagle/GPU/kernels/Makefile.[am/in]. After running configure, you can manually edit beagle-lib/libhmsbeagle/GPU/kernels/Makefile, changing compute_13 to something else (compute_20?). Everything then compiles without complaint, but genomictest fails during make check.

It is not clear to me yet which compute_xy is appropriate.

msuchard commented 9 years ago

Removing -arch compute_13 from the 2 lines in Makefile.am referenced above, then starting over from ./autogen.sh successfully builds and passes make check on my Ubuntu 14.04 + CUDA 7 + K40 system.
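
For anyone who prefers to script that edit, here is a minimal sketch of the removal step. `strip_compute_13` is a hypothetical helper name, and the sketch assumes the flag appears literally as `-arch compute_13`, as in the build log earlier in this thread. Check the result by hand before rebuilding.

```shell
# Hypothetical helper: delete every literal "-arch compute_13" (and the spaces
# before it) from the given file, rewriting the file in place.
strip_compute_13() {
    # $1: path to Makefile.am (or to a generated Makefile)
    tmp=$(mktemp) || return 1
    # Write to a temp file first, then copy back so file permissions are kept.
    sed 's/ *-arch compute_13//g' "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Typical use from the top of the source tree, then rebuild from scratch:
#   strip_compute_13 libhmsbeagle/GPU/kernels/Makefile.am
#   ./autogen.sh && ./configure --prefix=$HOME && make && make check
```

Editing Makefile.am rather than the generated Makefile means the change survives a later rerun of ./autogen.sh and ./configure.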

The original intent of the compute_13 flags was to permit compilation of the double-precision kernels. Much older GPUs will certainly still need the flag even though it now throws an error with CUDA 7. I'll discuss end-of-lifing pre-compute_20 hardware with @ayresdl and @rambaut.

josephwb commented 9 years ago

Thanks @msuchard. I tried that as well, but am still failing make check. Maybe it is because I have an older card?

lspci | grep -i nvidia
05:00.0 VGA compatible controller: NVIDIA Corporation GF106GL [Quadro 2000] (rev a1)
05:00.1 Audio device: NVIDIA Corporation GF106 High Definition Audio Controller (rev a1)

msuchard commented 9 years ago

The Quadro 2000 has compute capability 2.1 [https://developer.nvidia.com/cuda-legacy-gpus], so your compilation should be fine without specifying -arch (which defaults to 2.0). I am not sure why make check fails. But, more importantly, does BEAST or MrBayes using BEAGLE work for you? Given your card, you should have both single- and double-precision support.
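
On the earlier question of which compute_xy to pick: one common rule of thumb is to take the highest virtual architecture that does not exceed the card's compute capability. The sketch below applies that rule to the CUDA 7.0 list quoted earlier in this thread. `pick_virtual_arch` is a hypothetical helper, the list is hard-coded as an assumption and will differ for other CUDA releases, and the input is assumed to be a simple major.minor number.

```shell
# Hypothetical helper: print the highest virtual architecture from the CUDA 7.0
# allowed-values list (hard-coded below, an assumption) that does not exceed
# the given compute capability. Prints nothing and fails if none qualifies.
pick_virtual_arch() {
    # $1: compute capability in major.minor form, e.g. 2.1
    want=$(printf '%s' "$1" | tr -d '.')
    best=""
    for cc in 20 30 32 35 37 50 52 53; do
        if [ "$cc" -le "$want" ]; then
            best="compute_$cc"
        fi
    done
    [ -n "$best" ] && printf '%s\n' "$best"
}

pick_virtual_arch 2.1   # prints compute_20 (the Quadro 2000 case above)
```

A capability below 2.0, such as 1.3, yields no match, which mirrors the nvcc error that opened this issue.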