torch / distro

Torch installation in a self-contained folder
BSD 3-Clause "New" or "Revised" License

Failed installation when running './install.sh' #239

Open patrickacole opened 7 years ago

patrickacole commented 7 years ago

The following are the error messages received after running './install.sh'

1 error detected in the compilation of "/tmp/tmpxft_000005a7_00000000-6_THCTensorMathPairwise.cpp1.ii".
CMake Error at THC_generated_THCTensorMathPairwise.cu.o.cmake:267 (message):
  Error generating file
  /home/vipaacc/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorMathPairwise.cu.o

lib/THC/CMakeFiles/THC.dir/build.make:4216: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/home/vipaacc/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

/home/vipaacc/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(414): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

2 errors detected in the compilation of "/tmp/tmpxft_00000605_00000000-6_THCTensorMath.cpp1.ii".
CMake Error at THC_generated_THCTensorMath.cu.o.cmake:267 (message):
  Error generating file
  /home/vipaacc/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorMath.cu.o

lib/THC/CMakeFiles/THC.dir/build.make:2735: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o] Error 1
CMakeFiles/Makefile2:172: recipe for target 'lib/THC/CMakeFiles/THC.dir/all' failed
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Error: Build error: Failed building.

msiraj83 commented 7 years ago

I am facing the same error. Can anybody help?

Thanks

busrakb commented 7 years ago

@msiraj83 Did you solve this problem? I am getting the same error.

pkuwwt commented 7 years ago

./clean.sh
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
./install.sh

busrakb commented 7 years ago

Thank you so much, @pkuwwt

matanhs commented 6 years ago

did not solve the problem for me, also cuda

This was finally solved for me by editing install.sh: I commented out the install lines for cutorch and cunn, then ran the installation script. After it finished, I went into the individual folders under torch/extra, used git pull to get recent versions of cutorch and cunn, and finally installed them manually with luarocks.
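
The manual steps described above could look roughly like the sketch below. It assumes the cutorch and cunn lines in install.sh have already been commented out and install.sh has finished. TORCH_DIR, DRY_RUN, run and rebuild_cuda_rock are hypothetical names used only for this example. The luarocks location (install/bin under the torch folder) and the rockspec file names are assumptions about the default layout, so adjust them if yours differ. A later comment in this thread reports also needing the TORCH_NVCC_FLAGS export right before the luarocks step, so it is included here.

```shell
#!/bin/sh
# Sketch only: rebuild cutorch and cunn by hand after the main install.
# TORCH_DIR and DRY_RUN are hypothetical knobs for this example.
TORCH_DIR="${TORCH_DIR:-$HOME/torch}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Flag from earlier in the thread; exported before the luarocks builds.
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rebuild_cuda_rock NAME: update one package under extra/ and build it.
rebuild_cuda_rock() {
    run cd "$TORCH_DIR/extra/$1" &&
    run git pull &&
    # Assumed default location of the luarocks that install.sh sets up.
    run "$TORCH_DIR/install/bin/luarocks" make "rocks/$1-scm-1.rockspec"
}

for pkg in cutorch cunn; do
    rebuild_cuda_rock "$pkg" || break
done
```

With the default DRY_RUN=1 the script only prints what it would do. Set DRY_RUN=0 once the printed paths look right for your checkout.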

nk0307 commented 6 years ago

Same problem. Have you guys solved it?

JSkalskiSBG commented 6 years ago

@nk0307 pkuwwt's post fixed it for me:

./clean.sh
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
./install.sh

On Google Groups, someone posted that downgrading CUDA from 9.1 to 8.0 also fixes this problem.

subzerofun commented 6 years ago

@JSkalskiSBG

Thanks for the fix, but I did get stuck at the same point again... here is what i did:

and building cutorch stopped again at 21% ... this is my third attempt now. I've also tried running the install like this (from https://github.com/torch/cutorch/issues/797#issuecomment-364602210 @thompa2):

TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" ./install.sh

I use Cuda compilation tools, release 9.0, V9.0.175, cudnn 7 with a GTX 1080 Ti on macOS 10.12. PyTorch and Tensorflow are working great with GPU acc., so i doubt there are issues with my CUDA install or drivers.


@matanhs Will try your fix to install cutorch and cunn after the main torch install. Didn't you run into the same problem when you installed them manually? Which CUDA version are you using? Hope i don't have to revert back to CUDA 8.0 – now that i have set up PyTorch and TF with 9.0 ...


Just copying my errors for people searching on Google (so that the exact output gets indexed) – maybe someone will post an alternative fix after my comment.

warning: 'THCudaHalfTensor_medianall' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_medianall(THCState * state, THCudaHalfTensor * self); 
                ^
9 warnings generated.
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

Error: 4242 f💩ckups generated. Critical damage to human neural net. Shutting down brain 😖.

matanhs commented 6 years ago

It's been a while and I don't remember exactly the process I went through, but the release/master branch should allow you to install without an issue.

subzerofun commented 6 years ago

@matanhs I have tried the latest branch numerous times in the last few hours and also the alternative method you have suggested, but for some reason the flag TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" gets completely ignored. I get the same warning/error messages with or without it.

When i try to compile cutorch & cunn these two lines are contradicting each other:

-- Found CUDA with FP16 support, compiling with torch.CudaHalfTensor
-- CUDA_NVCC_FLAGS: -D__CUDA_NO_HALF_OPERATORS__;-D__CUDA_NO_HALF_OPERATORS__;-gencode;arch=compute_61,code=sm_61;-DCUDA_HAS_FP16=1

compiling with torch.CudaHalfTensor

and then

-D__CUDA_NO_HALF_OPERATORS__

i don't get this. It should either compile with "half tensors" or without them. Don't know how deep i have to dig to disable the torch.CudaHalfTensor ...

Looks like i have to revert back to CUDA 8, because 9.1 isn't working either from what i've read.

I'm afraid the scripts that parse your GPU for CUDA specs can't deal with cards that have a different architecture. Some script probably detects one feature for my GTX 1080 Ti and then it tries to compile it on my GTX 780. Have also tried installing with CUDA_VISIBLE_DEVICES=0 so that it only detects the 1080 Ti, but that also doesn't work. Don't know what to try anymore... Installing Tensorflow, although it's not even officially supported for Mac since version 1.2, was a breeze compared to this.
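
On the two "contradicting" lines above: judging from the compiler errors at the top of this thread, the define probably does not switch off half tensors at all. The ambiguity is between operator==(const __half &, const __half &) and operator==(half, half), which reads like an overload from the CUDA headers clashing with one from cutorch. -D__CUDA_NO_HALF_OPERATORS__ appears to hide only the header overloads, so torch.CudaHalfTensor would still be compiled and both lines can be true at once. To check whether the define actually reaches nvcc, a small sketch that scans saved build output follows. has_nvcc_flag is a hypothetical helper name, and the sample line is the one quoted above.

```shell
#!/bin/sh
# has_nvcc_flag FLAG: reads build output on stdin and succeeds if any
# "CUDA_NVCC_FLAGS:" line printed by CMake contains FLAG.
has_nvcc_flag() {
    grep -F -- 'CUDA_NVCC_FLAGS:' | grep -qF -- "$1"
}

# Sample input: the CMake line quoted earlier in this comment.
log_line='-- CUDA_NVCC_FLAGS: -D__CUDA_NO_HALF_OPERATORS__;-D__CUDA_NO_HALF_OPERATORS__;-gencode;arch=compute_61,code=sm_61;-DCUDA_HAS_FP16=1'

if printf '%s\n' "$log_line" | has_nvcc_flag '-D__CUDA_NO_HALF_OPERATORS__'; then
    echo "flag reached nvcc"
else
    echo "flag missing"
fi
```

For a real build, pipe the saved output of the install into has_nvcc_flag instead of the sample line. If the flag shows up there and the build still fails, the remaining errors are likely something other than the operator ambiguity.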

biojazzard commented 6 years ago

Same Here with OSX 10.13 Cuda 9.1... Any news?

matanhs commented 6 years ago

Check your nvcc version; it might be old. Personally, I just use the Docker development image from NVIDIA with CUDA 9 + cuDNN 7 and then install Torch from source, which works for me without an issue (also running a 1080 Ti).
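
A quick way to do the version check suggested above is to pull the release number out of the nvcc --version output. This is only a sketch. cuda_release is a hypothetical helper name, and it assumes the output contains a "release X.Y" line like the ones posted in this thread.

```shell
#!/bin/sh
# cuda_release: reads `nvcc --version` output on stdin and prints the
# release number (for example 9.1) from the "release X.Y" line.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH" >&2
fi
```

Comparing this number with the CUDA version your driver and cuDNN were installed for can help spot a stale nvcc earlier on the PATH.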

biojazzard commented 6 years ago

My machine: MBP mid 2012 | OSX 10.13 latest | GT750M |

My gcc:

$ gcc -v
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Found CUDA installation: /usr/local/cuda, version unknown

My nvcc:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Tue_Dec_19_21:36:29_CST_2017
Cuda compilation tools, release 9.1, V9.1.128

I have spent hours and hours on this... I have tried everything. I have tried hacks to solve this "half" issue. Nothing. I give up.

uthynauta commented 6 years ago

Post from @pkuwwt worked for me, thanks.

Harry-675 commented 6 years ago

Doesn't work for me, any other suggestions?

ot6nyu commented 6 years ago

"this was finally solved for me by editing install.sh- commenting out the install lines for cutorch and cunn then run the installation script. After it is done went in the individual folders under torch/extra and used git pull to get recent versions of cutorch and cunn finally installed them manually with luarocks"

@matanhs This was the only method that worked for me, but to be specific, I needed to run export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" right before installing the updated cutorch and cunn with luarocks.

yuyifan1991 commented 5 years ago

I have the following problem. Does anyone know how to solve it?

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_device_LIBRARY (ADVANCED)
    linked by target "THC" in directory /root/torch/extra/cutorch/lib/THC