torch / torch7

http://torch.ch

Error Installing Torch on Cuda 10 for Ubuntu 16.04 x86_64 #1180

Open spiralswimmer opened 5 years ago

spiralswimmer commented 5 years ago

GPU: Nvidia Tesla P100

Software: Ubuntu 16.04, Arch: x86_64, CUDA: 10.0

I get this error while installing Torch:


Building on 1 cores
-- Found Torch7 in /home/varun/torch/install
-- Removing -DNDEBUG from compile flags
-- TH_LIBRARIES: TH
-- MAGMA not found. Compiling without MAGMA support
-- Autodetected CUDA architecture(s): 6.0
-- got cuda version 10.0
-- Found CUDA with FP16 support, compiling with torch.CudaHalfTensor
-- CUDA_NVCC_FLAGS: -gencode;arch=compute_60,code=sm_60;-DCUDA_HAS_FP16=1
-- THC_SO_VERSION: 0
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_device_LIBRARY (ADVANCED)
    linked by target "THC" in directory /home/varun/torch/extra/cutorch/lib/THC

FImhaouran commented 5 years ago

You may want to have a look at: https://stackoverflow.com/questions/52501760/cmake-error-while-installing-torch-in-ubuntu

spiralswimmer commented 5 years ago

Some progress but got stuck again here:


> [  1%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o
> /home/varun/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:
>             function "operator==(const __half &, const __half &)"
>             function "operator==(half, half)"
>             operand types are: half == half
> /home/varun/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(414): error: more than one operator "==" matches these operands:
>             function "operator==(const __half &, const __half &)"
>             function "operator==(half, half)"
>             operand types are: half == half
> 2 errors detected in the compilation of "/tmp/tmpxft_00007797_00000000-6_THCTensorMath.cpp1.ii".
> CMake Error at THC_generated_THCTensorMath.cu.o.cmake:267 (message):
>   Error generating file
>   /home/varun/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorMath.cu.o
> lib/THC/CMakeFiles/THC.dir/build.make:2765: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o' failed
> make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o] Error 1
> CMakeFiles/Makefile2:172: recipe for target 'lib/THC/CMakeFiles/THC.dir/all' failed
> make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
> Makefile:127: recipe for target 'all' failed
> make: *** [all] Error 2
> 
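
The ambiguity above arises because both the CUDA headers and cutorch define `operator==` for `half`. A workaround commonly reported for this error is to disable the operators from the CUDA headers before rebuilding. The sketch below assumes, without confirmation in this thread, that the cutorch build picks up extra nvcc flags from the `TORCH_NVCC_FLAGS` environment variable and that the `__CUDA_NO_HALF_OPERATORS__` macro is honored by the installed CUDA headers.

```shell
# Assumption: the cutorch build reads extra nvcc flags from TORCH_NVCC_FLAGS.
# Assumption: __CUDA_NO_HALF_OPERATORS__ disables the half operators that the
# CUDA headers define, leaving only the ones from cutorch.
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

# Show the value so it is clear what the next build will see.
echo "TORCH_NVCC_FLAGS=$TORCH_NVCC_FLAGS"

# Afterwards, rerun the Torch install from a clean build directory
# in the same shell session so the variable is still set.
```

The variable has to be exported in the shell that runs the build, otherwise the compiler never sees the flag.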
ahtonen commented 5 years ago

I have a similar problem, but while building Torch for the ppc64 architecture. In the end it also fails to find the same set of libraries. I have CUDA 9.2, but in the log below you can see confusing output from the auto-detection. I read somewhere that CUDA versions this recent are not supported. @spiralswimmer, have you already tried downgrading your CUDA?

optim 1.0.5-0 is now built and installed in /root/torch/install/ (license: BSD)

Found CUDA on your machine. Installing CUDA packages
Warning: unmatched variable LUALIB

jopts=$(getconf _NPROCESSORS_CONF)

echo "Building on $jopts cores"
cmake -E make_directory build && cd build && cmake .. -DLUALIB= -DCMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS} -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/root/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/root/torch/install/lib/luarocks/rocks/cutorch/scm-1" && make -j$jopts install

Building on 160 cores
-- Found Torch7 in /root/torch/install
-- Removing -DNDEBUG from compile flags
-- TH_LIBRARIES: TH
-- MAGMA not found. Compiling without MAGMA support
-- Autodetected CUDA architecture(s): 6.0 6.0
-- got cuda version 9.2
-- Found CUDA with FP16 support, compiling with torch.CudaHalfTensor
-- CUDA_NVCC_FLAGS: -gencode;arch=compute_60,code=sm_60;-DCUDA_HAS_FP16=1
-- THC_SO_VERSION: 0
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_LIBRARY (ADVANCED)
    linked by target "THC" in directory /root/torch/extra/cutorch/lib/THC
CUDA_cublas_device_LIBRARY (ADVANCED)
    linked by target "THC" in directory /root/torch/extra/cutorch/lib/THC
CUDA_curand_LIBRARY (ADVANCED)
    linked by target "THC" in directory /root/torch/extra/cutorch/lib/THC

-- Configuring incomplete, errors occurred!
See also "/root/torch/extra/cutorch/build/CMakeFiles/CMakeOutput.log".

Error: Build error: Failed building.
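
To see at a glance which libraries CMake failed to locate, the variable names can be pulled out of a saved configure log. This is a minimal sketch that assumes the log has the same layout as the output above; the function name `list_notfound_vars` and the log file argument are hypothetical.

```shell
# Print the CMake variables listed after a "set to NOTFOUND" error.
# $1: path to a file containing the captured cmake output (hypothetical).
list_notfound_vars() {
  awk '
    /set to NOTFOUND/             { inblock = 1; next }  # start of the error block
    inblock && /^Please set them/ { next }               # skip the advice line
    inblock && /^$/               { inblock = 0 }        # a blank line ends the block
    inblock && /^[A-Za-z_]/       { print $1 }           # variable lines are unindented
  ' "$1"
}
```

For example, `list_notfound_vars build.log` would print one variable name per line, skipping the indented "linked by target" lines.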

elvis1423 commented 5 years ago

@spiralswimmer For the CUDA_cublas_device_LIBRARY NOTFOUND issue, please refer to https://github.com/torch/cutorch/issues/834. @ahtonen I solved these NOTFOUND issues by setting LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64 and making sure the lib64 directory contains the libcublas.so, libcurand.so, and libcusparse.so files (on the AMD64 architecture).
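
The check described in this comment can be sketched as a short script. It assumes the toolkit lives under `/usr/local/cuda-9.2` (adjust `CUDA_LIB_DIR` for your system); the helper name `missing_cuda_libs` is hypothetical, and whether setting `LD_LIBRARY_PATH` is enough for CMake to find the libraries may depend on the setup.

```shell
# Assumption: CUDA 9.2 is installed under /usr/local/cuda-9.2.
CUDA_LIB_DIR="${CUDA_LIB_DIR:-/usr/local/cuda-9.2/lib64}"

# Print the name of every expected library that is missing from directory $1.
missing_cuda_libs() {
  for lib in libcublas.so libcurand.so libcusparse.so; do
    [ -e "$1/$lib" ] || echo "$lib"
  done
}

missing=$(missing_cuda_libs "$CUDA_LIB_DIR")
if [ -z "$missing" ]; then
  # Prepend the CUDA directory, keeping any existing search path.
  export LD_LIBRARY_PATH="$CUDA_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  echo "All libraries found; LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
else
  echo "Missing from $CUDA_LIB_DIR:" $missing
fi
```

If any library is reported missing, the path is left unchanged so that the gap can be fixed first.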

sravyaysk commented 5 years ago

This may offer a solution: https://github.com/nagadomi/waifu2x/issues/253#issuecomment-445448928

tajalagawani commented 4 years ago

./install-deps
./clean.sh
./update.sh