hughperkins / cltorch

An OpenCL backend for torch.

Error installing cltorch on OS X 10.11 #72

Closed wpf5511 closed 8 years ago

wpf5511 commented 8 years ago

When installing cltorch, make fails on OS X 10.11:

[ 30%] Completed 'clBLAS-external'
[ 30%] Built target clBLAS-external
Scanning dependencies of target THCl
[ 40%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClStorage.cpp.o
[ 36%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClTensor.cpp.o
[ 40%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClStorageCopy.cpp.o
[ 40%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClGeneral.cpp.o
[ 40%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClTensorCopy.cpp.o
[ 42%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClTensorMath.cpp.o
[ 44%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClTensorMathPointwise.cpp.o
[ 46%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClReduceApplyUtils.cpp.o
[ 48%] Building CXX object src/lib/CMakeFiles/THCl.dir/THClApply.cpp.o
/Users/wangpf/torch-cl/opencl/cltorch/src/lib/THClTensorMath.cpp:268:19: error: use of undeclared identifier 'THInf'
          (float) THInf, self->storage->wrapper)) {
                  ^
/Users/wangpf/torch-cl/opencl/cltorch/src/lib/THClTensorMath.cpp:282:20: error: use of undeclared identifier 'THInf'
          (float) -THInf, self->storage->wrapper)) {
                   ^
/Users/wangpf/torch-cl/opencl/cltorch/src/lib/THClTensorMath.cpp:290:23: error: use of undeclared identifier 'THInf'
  float val = (float) THInf;
                      ^
/Users/wangpf/torch-cl/opencl/cltorch/src/lib/THClTensorMath.cpp:298:19: error: use of undeclared identifier 'THInf'
          (float) THInf, scratch->wrapper)) {
                  ^
/Users/wangpf/torch-cl/opencl/cltorch/src/lib/THClTensorMath.cpp:312:24: error: use of undeclared identifier 'THInf'
  float val = (float) -THInf;
                       ^
/Users/wangpf/torch-cl/opencl/cltorch/src/lib/THClTensorMath.cpp:320:20: error: use of undeclared identifier 'THInf'
          (float) -THInf, scratch->wrapper)) {
                   ^
6 errors generated.
make[2]: *** [src/lib/CMakeFiles/THCl.dir/THClTensorMath.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [src/lib/CMakeFiles/THCl.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

My clang version is:

Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.4.0
Thread model: posix

hughperkins commented 8 years ago

Hi. Please can you confirm that you are installing using the torch-cl distro: https://github.com/hughperkins/torch-cl

wpf5511 commented 8 years ago

I had previously installed torch, so I looked at torch-cl on your GitHub. I found that the install-deps scripts are the same, and that the difference in install.sh is the OpenCL part, because stock torch doesn't include OpenCL:

cd ${THIS_DIR}/opencl/cltorch && $PREFIX/bin/luarocks make rocks/cltorch-scm-1.rockspec
cd ${THIS_DIR}/opencl/clnn && $PREFIX/bin/luarocks make rocks/clnn-scm-1.rockspec

So I extracted the two shell commands above and executed them, but got the same error as with luarocks install cltorch.

hughperkins commented 8 years ago

The versions of torch and nn in distro-cl are different. If you install the exact same versions of torch and nn as in distro-cl, then cltorch will build OK. The appropriate versions are:

torch: https://github.com/hughperkins/torch7/tree/distro-cl
nn: https://github.com/hughperkins/nn/tree/distro-cl

wpf5511 commented 8 years ago

OK, I followed the torch-cl guide. The installation didn't report any errors, but when I run luajit -l cltorch -e 'cltorch.test()' and luajit -l clnn -e 'clnn.test()', they report errors such as this:

luajit: module 'cltorch' not found:
    no field package.preload['cltorch']
    no file '/Users/wangpf/.luarocks/share/lua/5.1/cltorch.lua'
    no file '/Users/wangpf/.luarocks/share/lua/5.1/cltorch/init.lua'
    no file '/Users/wangpf/torch-cl/install/share/lua/5.1/cltorch.lua'
    no file '/Users/wangpf/torch-cl/install/share/lua/5.1/cltorch/init.lua'
    no file '/Users/wangpf/torch/install/share/lua/5.1/cltorch.lua'
    no file '/Users/wangpf/torch/install/share/lua/5.1/cltorch/init.lua'
    no file './cltorch.lua'
    no file '/Users/wangpf/torch/install/share/luajit-2.1.0-beta1/cltorch.lua'
    no file '/usr/local/share/lua/5.1/cltorch.lua'
    no file '/usr/local/share/lua/5.1/cltorch/init.lua'
    no file '/Users/wangpf/torch-cl/install/lib/cltorch.dylib'
    no file '/Users/wangpf/.luarocks/lib/lua/5.1/cltorch.so'
    no file '/Users/wangpf/torch-cl/install/lib/lua/5.1/cltorch.so'

So it seems that cltorch still has not been installed.
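
The "no file" entries in the error above come from Lua's module searcher, which substitutes the module name for each ? in the templates of package.path and package.cpath and reports every candidate it fails to find. A minimal shell sketch of that lookup may help confirm whether a file such as cltorch/init.lua exists under the install prefix. The helper name find_lua_module and the example paths are assumptions made for illustration; they are not part of torch-cl or luarocks:

```shell
# Hypothetical helper, not part of torch-cl or luarocks: mimic how Lua's
# require resolves a module name against a ';'-separated template list
# such as package.path or package.cpath.
find_lua_module() {
  local name rest tmpl candidate
  # Lua turns dots in the module name into directory separators.
  name=$(printf '%s' "$1" | tr '.' '/')
  rest="$2;"
  while [ -n "$rest" ]; do
    tmpl="${rest%%;*}"     # next template, up to the first ';'
    rest="${rest#*;}"      # remainder of the list
    [ -n "$tmpl" ] || continue
    # Substitute the module name for every '?' in the template.
    candidate=$(printf '%s' "$tmpl" | sed "s|?|$name|g")
    if [ -e "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
    printf "no file '%s'\n" "$candidate" >&2
  done
  return 1
}

# Example call (illustrative paths):
# find_lua_module cltorch "$HOME/torch-cl/install/share/lua/5.1/?.lua;$HOME/torch-cl/install/share/lua/5.1/?/init.lua"
```

With torch-activate sourced, the real search lists can be printed with luajit -e 'print(package.path); print(package.cpath)' and passed to the helper.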

hughperkins commented 8 years ago

Yes, you are correct that cltorch seems not to have been installed. Can you try the following please, and provide the full output? (eg via https://gist.github.com/ )

source ~/torch-cl/install/bin/torch-activate
cd ~/torch-cl/opencl/cltorch
luarocks make rocks/cltorch-scm-1.rockspec

And also perhaps full output for:

cd ~/torch-cl/opencl/cltorch
source ~/torch-cl/install/bin/torch-activate
./install.sh

wpf5511 commented 8 years ago

Amazing!!! After I ran the commands again

source ~/torch-cl/install/bin/torch-activate
cd ~/torch-cl/opencl/cltorch
luarocks make rocks/cltorch-scm-1.rockspec

and also

cd ~/torch-cl/opencl/clnn
luarocks make rocks/clnn-scm-1.rockspec

the commands work!!!

luajit -l cltorch -e 'cltorch.test()'
luajit -l clnn -e 'clnn.test()'

Thank you for the help!!!

hughperkins commented 8 years ago

Cool :-)

hughperkins commented 8 years ago

Oh, you're right. There's a bug in my install script. I will fix it. Thank you very much for pointing it out :-)

hughperkins commented 8 years ago

(actually, it shows up on travis too https://travis-ci.org/hughperkins/distro-cl/builds/128689798 I just didn't actually look at the travis results :-P )

hughperkins commented 8 years ago

Hopefully should be fixed in the future now :-)

hughperkins commented 8 years ago

(changed travis notification settings so I should find out sooner in the future :-) https://github.com/hughperkins/distro-cl/commit/a7804388f1e02a13c8485d5f02812e6d80e89174 )

hughperkins commented 8 years ago

(Hmmm, actually, I still hadn't fixed it. Fixed now though :-) https://travis-ci.org/hughperkins/distro-cl/builds/129667418 )

tylerlindell commented 8 years ago

I just tried this and couldn't get the following two tests to work:

luajit -l cltorch -e 'cltorch.test()'
luajit -l clnn -e 'clnn.test()'

Following along with this thread I did

source ~/torch-cl/install/bin/torch-activate
cd ~/torch-cl/opencl/cltorch
luarocks make rocks/cltorch-scm-1.rockspec

I tried the following

cd ~/torch-cl/opencl/cltorch
source ~/torch-cl/install/bin/torch-activate
./install.sh

and it says

bash: ./install.sh: No such file or directory

hughperkins commented 8 years ago

Hi Tyler,

Can you provide the output of the following please?

cd ~/torch-cl
git log -n 3 --oneline
git status
git submodule
./install.sh

(that will likely produce a ton of output, so you could paste into a https://gist.github.com )

tylerlindell commented 8 years ago

git log -n 3 --oneline

9da3e6e remember to add test script for cutorch-cltorch clobbering
96bda1a prevent cutorch clobbering cltorch, if d afterwards
fafd384 code highlighting in cltorch readme, and links to cutorch-rtc

git status

On branch distro-cl
Your branch is up-to-date with 'origin/distro-cl'.
nothing to commit, working directory clean

git submodule

40d6fa037e52dca9eab161ab8f1e5496a52c3c1e exe/env (heads/master)
3a64814592231e6557f4cacbd2e42cd0caec7939 exe/luajit-rocks (remotes/origin/threads-39-g3a64814)
c077ce7b429a1e4d9fc8c96dd7115b06ea8bbe46 exe/qtlua (heads/master)
f2d2f1ad40d5dc695d75c4ef2edd36142d319e49 exe/trepl (heads/master)
e3ac739d86b295bbfa057cd1e82d9c6faec7edba extra/FindCUDA (v3.5-1)
aa0f434b6673b8713c54154bc9f8d8f095be20df extra/argcheck (1.0.0-0-50-gaa0f434)
e71fce7f248be8a4937a1de76dae0883fe4fc454 extra/audio (heads/master)
f1cfa7ca2d379ac9c336147a59b45fbf2039ffbf extra/cudnn (f1cfa7c)
8164557ad1745e04e452f5b47f08c5f6854c5862 extra/cunn (iguana-7-g8164557)
48eb977a8d4ab0ddfcd8ead6f72b6b17ad77326b extra/cunnx (remotes/origin/state-13-g48eb977)
e38e54f421c672998f7a86e93d30158dc57afce9 extra/cutorch (remotes/origin/distro-cl)
ce8883364201b0741d3d51a0f305f457778a6671 extra/fftw3 (heads/master)
6ab76730aecc6fcaa4c2abf4b4b921c5bf0c902a extra/graph (0.1-53-g6ab7673)
80638b6d93466170bbd9e085f57ea305cd37b34d extra/graphicsmagick (80638b6)
eb052fb2e9b641564dd68b1e27f64af4f593f128 extra/iTorch (eb052fb)
d59326b2d718e1a140b9b396ffe0a557b2d93fe0 extra/lua-cjson (2.1.0-2-gd59326b)
44672a3a7bdfc81980abfc95160fb396ca6eb2cd extra/luaffifb (44672a3)
9cbb967b55c5cd052a8a42e0315a51d5c68c04d7 extra/luafilesystem (v_1_6_3-30-g9cbb967)
dc7a020a3b735d7f4491da163bf39f78cde73f94 extra/nn (iguana-3-gdc7a020)
ccc9627a95972eca32915100ceddddcfe6e87f43 extra/nngraph (0.1-118-gccc9627)
08706fa21c6df745756538d1cd47dd8c1a8c20b1 extra/nnx (0.1-0-204-g08706fa)
16d149338af9efc910528641c5240c5641aeb8db extra/penlight (1.3.2-80-g16d1493)
50659fbeca83d667240b197298a0462c7ec0ad21 extra/sdl2 (heads/master)
94290c5297c25aaf76e93a1a9ff050da600af1df extra/signal (heads/master)
9e31123db00c8c8c437a3cba744576456de38731 extra/threads (remotes/origin/lua53-1-g9e31123)
29705fa97588071d25f289760c795d5ddc36e9a1 opencl/clnn (iguana-71-g29705fa)
695ca5f8265d0351900ca0340b458439c9ec621d opencl/cltorch (iguana-96-g695ca5f)
2939ae702ba4b76ff73794db52e63e532e1e3687 pkg/cwrap (remotes/origin/koraykv-license-1-6-g2939ae7)
1b36900e1bfa6ee7f48db52c577bdeb7d9e85909 pkg/dok (heads/master)
c15fdaae50b74f087ec7eb12ff2ec33f4b227415 pkg/gnuplot (c15fdaa)
5daa4afc01a8ce67eed6a44aacbbe5be68568bbd pkg/image (1.0.1-0-233-g5daa4af)
76db06f9895e6a94cb3df372fcd771cbcad4a599 pkg/optim (1.0.3-0-167-g76db06f)
68d579a2d3b1b0bb03a11637632e6e699b14ad80 pkg/paths (heads/master)
ba5b5a143482857f80237181d5fde0a3ba20477b pkg/qttorch (heads/master)
7539726a1706afd23f0a072642b14af4fc01edf6 pkg/sundown (7539726)
77f10a2b95f30a08e9a439532c508632b7893f79 pkg/sys (1.1-0-11-g77f10a2)
05b0ac49b487bba8deed45925a35b7689516a859 pkg/torch (remotes/origin/distro-cl)
3c1d3c9aaa4f7c0c2ad84ae54a154eee596019c0 pkg/xlua (1.0-0-29-g3c1d3c9)

./install.sh output here

hughperkins commented 8 years ago

Ok, that all looks good. Can you provide also the output of:

source ~/torch-cl/install/bin/torch-activate
luajit -l cltorch -e 'cltorch.test()'

tylerlindell commented 8 years ago

Thank you for your help on this! I really appreciate it. Here is the output of that:

luajit: ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: cannot load '/Users/tylerlindell/torch-cl/install/lib/lua/5.1/libtorch.so'
stack traceback:
    [C]: in function 'require'
    ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: in main chunk
    [C]: in function 'require'
    ...rlindell/torch-cl/install/share/lua/5.1/cltorch/init.lua:1: in main chunk
    [C]: at 0x010fadbdb0
    [C]: at 0x010fa5fce0

hughperkins commented 8 years ago

Ok. Can you provide the full output please? Think it's missing a line or two?

tylerlindell commented 8 years ago

Tylers-MacBook-Pro:torch-cl tylerlindell$ luajit -l cltorch -e 'cltorch.test()'
luajit: ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: cannot load '/Users/tylerlindell/torch-cl/install/lib/lua/5.1/libtorch.so'
stack traceback:
    [C]: in function 'require'
    ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: in main chunk
    [C]: in function 'require'
    ...rlindell/torch-cl/install/share/lua/5.1/cltorch/init.lua:1: in main chunk
    [C]: at 0x0101f14db0
    [C]: at 0x0101e98ce0
Tylers-MacBook-Pro:torch-cl tylerlindell$ 

hughperkins commented 8 years ago

Oh, it's a Mac. Hmmm. It's probably related to some combination of rpath and/or System Integrity Protection. As for System Integrity Protection, which blocks loading dynamic objects from certain paths, one cheap hack that sometimes works is to copy the dynamic objects into ~/lib, e.g. something like:

mkdir ~/lib
cp ~/torch-cl/install/lib/* ~/lib

... and try again. Otherwise we'll probably need to do some rpath diagnostic stuff. For example something like:

otool -L ~/torch-cl/install/lib/lua/5.1/libcltorch.so
otool -l ~/torch-cl/install/lib/lua/5.1/libcltorch.so | grep RPATH -A2

... might give some initial hints

tylerlindell commented 8 years ago

I was able to get it all copied over with

mkdir ~/lib
cp -rf ~/torch-cl/install/lib/ ~/lib

then ran

source ~/torch-cl/install/bin/torch-activate
luajit -l cltorch -e 'cltorch.test()'

and got the same error

Tylers-MacBook-Pro:torch-cl tylerlindell$ luajit -l cltorch -e 'cltorch.test()'
luajit: ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: cannot load '/Users/tylerlindell/torch-cl/install/lib/lua/5.1/libtorch.so'
stack traceback:
    [C]: in function 'require'
    ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: in main chunk
    [C]: in function 'require'
    ...rlindell/torch-cl/install/share/lua/5.1/cltorch/init.lua:1: in main chunk
    [C]: at 0x010955fdb0
    [C]: at 0x01094e3ce0
Tylers-MacBook-Pro:torch-cl tylerlindell$ 

I then ran

otool -L ~/torch-cl/install/lib/lua/5.1/libcltorch.so

Output

/Users/tylerlindell/torch-cl/install/lib/lua/5.1/libcltorch.so:
    @rpath/libluaT.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libTHCl.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libEasyCL.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libclBLAS.2.dylib (compatibility version 2.0.0, current version 2.11.0)
    @rpath/libTH.dylib (compatibility version 0.0.0, current version 0.0.0)
    libmkl_intel_lp64.dylib (compatibility version 0.0.0, current version 0.0.0)
    libmkl_intel_thread.dylib (compatibility version 0.0.0, current version 0.0.0)
    libmkl_core.dylib (compatibility version 0.0.0, current version 0.0.0)
    libiomp5.dylib (compatibility version 5.0.0, current version 5.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
    @rpath/libclew.1.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.1.0)

Followed that up with

otool -l ~/torch-cl/install/lib/lua/5.1/libcltorch.so | grep RPATH -A2

Output

          cmd LC_RPATH
      cmdsize 40
         path @executable_path/../lib (offset 12)
--
          cmd LC_RPATH
      cmdsize 56
         path /Users/tylerlindell/torch-cl/install/lib (offset 12)

I noticed that everything was still trying to access the files in ~/torch-cl/install/lib/ instead of ~/lib, so I updated ~/torch-cl/install/bin/torch-activate, changing LUA_CPATH to point at ~/lib.

From there I went through and tested everything again the same way and saw the same issues.

hughperkins commented 8 years ago

I wonder if it's to do with those libmkl libraries? They're probably in your anaconda directory?

hughperkins commented 8 years ago

(maybe libiomp5 too)
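
In the otool -L output posted earlier, the libmkl and libiomp5 entries are the only dependencies listed by bare file name, with neither an absolute path nor an @rpath prefix. The sketch below is a hypothetical awk filter (the name bare_dylib_deps is made up for this example) that picks such entries out of otool -L output. Whether those entries are the actual cause of the load failure is an assumption, not something the thread confirms:

```shell
# Hypothetical filter: read `otool -L` output on stdin and print the
# dependencies that are neither absolute paths nor @-prefixed
# (@rpath, @loader_path, @executable_path).
bare_dylib_deps() {
  # NR > 1 skips the first line, which names the inspected file itself.
  awk 'NR > 1 && NF > 0 && $1 !~ "^[/@]" { print $1 }'
}

# Example call on macOS:
# otool -L ~/torch-cl/install/lib/lua/5.1/libcltorch.so | bare_dylib_deps
```

Run against the listing above, this would print the three libmkl libraries and libiomp5.dylib.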

hughperkins commented 8 years ago

Can you try copying those three libmkl libraries and the libiomp5 library into ~/lib (they're probably in your anaconda/lib directory), and try again?

hughperkins commented 8 years ago

Actually, I just realized: you don't say you are using anaconda. Are you using anaconda and/or do you have anaconda installed?

(Edit: just to be clear, having anaconda installed doesn't make installation easier, only harder, so if you don't have anaconda installed, don't suddenly install it :-D )

hughperkins commented 8 years ago

Oh... you have anaconda in a directory anaconda3:

/Users/tylerlindell/anaconda3

The script removes anything with anaconda2 in it, but not anaconda3. I might update the script to handle anaconda3, and then everything should work better, hopefully.
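
The kind of filtering described here can be sketched as a small shell function. This is an illustration only, not the code in the distro-cl install script. It assumes the filtering applies to a colon-separated list such as PATH, and the name strip_path_entries and the sample paths are made up for the example:

```shell
# Hypothetical helper (not taken from the distro-cl install script):
# drop every entry of a colon-separated list that contains a substring.
strip_path_entries() {
  local needle="$1" rest="$2:" entry out=""
  while [ -n "$rest" ]; do
    entry="${rest%%:*}"    # next entry, up to the first ':'
    rest="${rest#*:}"      # remainder of the list
    case "$entry" in
      *"$needle"*) ;;                     # skip entries mentioning the needle
      *) out="${out:+$out:}$entry" ;;     # keep everything else, in order
    esac
  done
  printf '%s\n' "$out"
}

# Matching on "anaconda" covers anaconda2 and anaconda3 alike.
strip_path_entries anaconda "/usr/bin:/Users/me/anaconda3/bin:/bin"
# prints /usr/bin:/bin
```

A build script could then run something like PATH=$(strip_path_entries anaconda "$PATH") before invoking cmake, so that libraries from an anaconda directory are not picked up.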

hughperkins commented 8 years ago

Updated script in https://github.com/hughperkins/distro-cl/commit/93abdc9cdcce8d5a1e9ac6ca513f2ca8cdba7a93 , to ignore anaconda3 in the path and so on. Can you try the following please?

cd ~/torch-cl
git pull
rm -Rf build
./install.sh
luajit -l cltorch -e 'cltorch.test()'

tylerlindell commented 8 years ago

After following those instructions here is the output of luajit -l cltorch -e 'cltorch.test()'

luajit: ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: cannot load '/Users/tylerlindell/lib/lua/5.1/libtorch.so'
stack traceback:
    [C]: in function 'require'
    ...lerlindell/torch-cl/install/share/lua/5.1/torch/init.lua:13: in main chunk
    [C]: in function 'require'
    ...rlindell/torch-cl/install/share/lua/5.1/cltorch/init.lua:1: in main chunk
    [C]: at 0x010844edb0
    [C]: at 0x01083d2ce0

I have not moved those libmkl libraries yet

hughperkins commented 8 years ago

Oh heh! But .. I think we're close. Probably. Maybe. Can you try the following please?

cd ~/torch-cl
rm -Rf ~/torch-cl/pkg/torch/build
./install.sh
source ~/torch-cl/install/bin/torch-activate
luajit -l cltorch -e 'cltorch.test()'

If that doesn't work, maybe just reclone, and reinstall:

cd ~
mv torch-cl torch-cl.old
git clone --recursive https://github.com/hughperkins/distro-cl ~/torch-cl
cd torch-cl
./install.sh
luajit -l cltorch -e 'cltorch.test()'

If that doesn't work, we should continue digging a bit on rpath :-P

tylerlindell commented 8 years ago

The first one worked! Thank you very much for your help on this.

hughperkins commented 8 years ago

Cool! That's good news :-) Thank you for your patience :-)

alex3s commented 7 years ago

Hi,

~/torch-cl/opencl/cltorch/src/clMathLibraries/clBLAS/src/library/blas/xgemm.cc:173:12: error: ‘__thread’ before ‘static’

Reversing the order of ‘__thread’ and ‘static’ solved the problem.