Hmmm. If it segfaults on load, that's pretty early. Not much has happened by then. Can you confirm the output from clinfo and gpuinfo please? clinfo should come with your OpenCL system. gpuinfo... on my system it's installed in ~/torch/install/bin, so you should just be able to run it.
I didn't have clinfo, so I used the one from http://graphics.stanford.edu/~yoel/notes/clInfo.c. gpuinfo is also missing, and I can't find any suitable replacement.
Output of clinfo:
Found 1 platform(s).
platform[0x7fff0000]: profile: FULL_PROFILE
platform[0x7fff0000]: version: OpenCL 1.2 (Feb 27 2015 01:29:10)
platform[0x7fff0000]: name: Apple
platform[0x7fff0000]: vendor: Apple
platform[0x7fff0000]: Found 2 device(s).
Device number 1
device[0xffffffff]: NAME: Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz
device[0xffffffff]: VENDOR: Intel
device[0xffffffff]: PROFILE: FULL_PROFILE
device[0xffffffff]: VERSION: OpenCL 1.2
device[0xffffffff]: DRIVER_VERSION: 1.1
device[0xffffffff]: Type: CPU
device[0xffffffff]: EXECUTION_CAPABILITIES: Kernel Native
device[0xffffffff]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2)
device[0xffffffff]: CL_DEVICE_LOCAL_MEM_TYPE: Global (2)
device[0xffffffff]: SINGLE_FP_CONFIG: 0xbf
device[0xffffffff]: QUEUE_PROPERTIES: 0x2
device[0xffffffff]: VENDOR_ID: 4294967295
device[0xffffffff]: MAX_COMPUTE_UNITS: 4
device[0xffffffff]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0xffffffff]: MAX_WORK_GROUP_SIZE: 1024
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2
device[0xffffffff]: MAX_CLOCK_FREQUENCY: 1800
device[0xffffffff]: ADDRESS_BITS: 64
device[0xffffffff]: MAX_MEM_ALLOC_SIZE: 1073741824
device[0xffffffff]: IMAGE_SUPPORT: 1
device[0xffffffff]: MAX_READ_IMAGE_ARGS: 128
device[0xffffffff]: MAX_WRITE_IMAGE_ARGS: 8
device[0xffffffff]: IMAGE2D_MAX_WIDTH: 8192
device[0xffffffff]: IMAGE2D_MAX_HEIGHT: 8192
device[0xffffffff]: IMAGE3D_MAX_WIDTH: 2048
device[0xffffffff]: IMAGE3D_MAX_HEIGHT: 2048
device[0xffffffff]: IMAGE3D_MAX_DEPTH: 2048
device[0xffffffff]: MAX_SAMPLERS: 16
device[0xffffffff]: MAX_PARAMETER_SIZE: 4096
device[0xffffffff]: MEM_BASE_ADDR_ALIGN: 1024
device[0xffffffff]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0xffffffff]: GLOBAL_MEM_CACHELINE_SIZE: 3145728
device[0xffffffff]: GLOBAL_MEM_CACHE_SIZE: 64
device[0xffffffff]: GLOBAL_MEM_SIZE: 4294967296
device[0xffffffff]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0xffffffff]: MAX_CONSTANT_ARGS: 8
device[0xffffffff]: LOCAL_MEM_SIZE: 32768
device[0xffffffff]: ERROR_CORRECTION_SUPPORT: 0
device[0xffffffff]: PROFILING_TIMER_RESOLUTION: 1
device[0xffffffff]: ENDIAN_LITTLE: 1
device[0xffffffff]: AVAILABLE: 1
device[0xffffffff]: COMPILER_AVAILABLE: 1
Device number 2
device[0x1024400]: NAME: HD Graphics 4000
device[0x1024400]: VENDOR: Intel
device[0x1024400]: PROFILE: FULL_PROFILE
device[0x1024400]: VERSION: OpenCL 1.2
device[0x1024400]: DRIVER_VERSION: 1.2(Mar 27 2015 01:47:22)
device[0x1024400]: Type: GPU
device[0x1024400]: EXECUTION_CAPABILITIES: Kernel
device[0x1024400]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0x1024400]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0x1024400]: SINGLE_FP_CONFIG: 0xbe
device[0x1024400]: QUEUE_PROPERTIES: 0x2
device[0x1024400]: VENDOR_ID: 16925696
device[0x1024400]: MAX_COMPUTE_UNITS: 16
device[0x1024400]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0x1024400]: MAX_WORK_GROUP_SIZE: 512
device[0x1024400]: PREFERRED_VECTOR_WIDTH_CHAR: 1
device[0x1024400]: PREFERRED_VECTOR_WIDTH_SHORT: 1
device[0x1024400]: PREFERRED_VECTOR_WIDTH_INT: 1
device[0x1024400]: PREFERRED_VECTOR_WIDTH_LONG: 1
device[0x1024400]: PREFERRED_VECTOR_WIDTH_FLOAT: 1
device[0x1024400]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0
device[0x1024400]: MAX_CLOCK_FREQUENCY: 1150
device[0x1024400]: ADDRESS_BITS: 64
device[0x1024400]: MAX_MEM_ALLOC_SIZE: 268435456
device[0x1024400]: IMAGE_SUPPORT: 1
device[0x1024400]: MAX_READ_IMAGE_ARGS: 128
device[0x1024400]: MAX_WRITE_IMAGE_ARGS: 8
device[0x1024400]: IMAGE2D_MAX_WIDTH: 16384
device[0x1024400]: IMAGE2D_MAX_HEIGHT: 16384
device[0x1024400]: IMAGE3D_MAX_WIDTH: 2048
device[0x1024400]: IMAGE3D_MAX_HEIGHT: 2048
device[0x1024400]: IMAGE3D_MAX_DEPTH: 2048
device[0x1024400]: MAX_SAMPLERS: 16
device[0x1024400]: MAX_PARAMETER_SIZE: 1024
device[0x1024400]: MEM_BASE_ADDR_ALIGN: 1024
device[0x1024400]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0x1024400]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0x1024400]: GLOBAL_MEM_CACHE_SIZE: 0
device[0x1024400]: GLOBAL_MEM_SIZE: 1073741824
device[0x1024400]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0x1024400]: MAX_CONSTANT_ARGS: 8
device[0x1024400]: LOCAL_MEM_SIZE: 65536
device[0x1024400]: ERROR_CORRECTION_SUPPORT: 0
device[0x1024400]: PROFILING_TIMER_RESOLUTION: 80
device[0x1024400]: ENDIAN_LITTLE: 1
device[0x1024400]: AVAILABLE: 1
device[0x1024400]: COMPILER_AVAILABLE: 1
Hmmm, looks pretty convincing...
I've created a debug version that prints out some small debugging statements. Can you try installing and running this please? Something like:
git clone --recursive https://github.com/hughperkins/cltorch.git -b debug cltorch-debug
cd cltorch-debug
luarocks make rocks/cltorch-scm-1.rockspec
th -l cltorch
On my computer, I get the output:
user@pear:~/git/cltorch$ th -l cltorch
init.lua START
init.lua loading libcltorch...
luaopen_libcltorch START
luaopen_libcltorch storage init
luaopen_libcltorch tensor init
luaopen_libcltorch tensormath init
luaopen_libcltorch tensoroperator init
luaopen_libcltorch END
... libcltorch loaded
loading Tensor.lua...
loading Random.lua...
init.lua END
Depending on where it crashes, more or less of this will be missing.
Well, pretty early:
$ th -l cltorch
init.lua START
init.lua loading libcltorch...
luaopen_libcltorch START
Segmentation fault: 11
Is there any way to control execution flow, i.e. to debug Lua scripts?
You can do:
gdb luajit
run
require 'cltorch'
backtrace
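(On OS X, lldb typically stands in for gdb; assuming luajit is on your PATH, a roughly equivalent session would be:
lldb luajit
run
require 'cltorch'
bt
i.e. the same flow, with bt as lldb's backtrace command.)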
@szagoruyko, thank you.
th> require 'cltorch'
init.lua START
init.lua loading libcltorch...
luaopen_libcltorch START
Process 60809 stopped
* thread #1: tid = 0x203dbe, 0x0000000002311068 libEasyCL.dylib`lua_createtable(L=0x0000000000079378, narray=0, nrec=0) + 24 at lapi.c:580, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
frame #0: 0x0000000002311068 libEasyCL.dylib`lua_createtable(L=0x0000000000079378, narray=0, nrec=0) + 24 at lapi.c:580
577
578 LUA_API void lua_createtable (lua_State *L, int narray, int nrec) {
579 lua_lock(L);
-> 580 luaC_checkGC(L);
581 sethvalue(L, L->top, luaH_new(L, narray, nrec));
582 api_incr_top(L);
583 lua_unlock(L);
Cool. Can you type 'bt', and paste the output?
I've got the same segfault, it fails here https://github.com/hughperkins/cltorch/blob/master/init.cpp#L157
Ok. That line is not the actual root cause, right? Something in the Lua, earlier? https://github.com/hughperkins/cltorch/blob/master/init.lua ? Or something odd about the way the library is built?
(like, eg cutorch has the same first line in fact: https://github.com/torch/cutorch/blob/master/init.c#L763 )
Hmmm, maybe it's a mismatch between the Lua library linked against at build time and the one linked against at runtime? E.g. different versions?
It's getting lua from EasyCL. Maybe that's not a good sign?
@szagoruyko I guess that when we build EasyCL for Torch, we should persuade it to link against libluajit.so, rather than against its own internal Lua?
user@pear:~/git/cltorch$ grep newtable EasyCL/thirdparty/lua-5.1.5/src/*.h
EasyCL/thirdparty/lua-5.1.5/src/lua.h:#define lua_newtable(L) lua_createtable(L, 0, 0)
user@pear:~/git/cltorch$ nm ~/torch/install/lib/libluajit.so | grep createtable
0000000000068290 T lua_createtable
user@pear:~/git/cltorch$ nm ~/torch/install/lib/libEasyCL.so | grep createtable
000000000002bff0 T lua_createtable
Yes, that's probably the issue.
Any idea where the link_libraries(libluajit.so) statement is? Hunting around in ~/torch/install/share/cmake/torch, but didn't find it yet.
Ok. Can you try the following please:
git clone --recursive https://github.com/hughperkins/cltorch.git -b new-clew cltorch-nolua
cd cltorch-nolua
luarocks make rocks/cltorch-scm-1.rockspec
Oh wait, this branch combines two 'fixes', and the first fix doesn't quite work. I'll let you know once this branch is ready to try.
Note: the other issue in this branch, the missing OpenCL include files, should be fixed now. So the only open issue is the Lua-building bit. Per the other thread, using the new-clew branch gives an error about missing Lua imports?
Hi guys, do you mind seeing if you are able to build the following repo?
git clone https://github.com/hughperkins/prototyping-lualink.git
cd prototyping-lualink
./run.sh
I don't have a Mac, so I'm unsure how to build Lua modules in a cross-platform way. This works for me on Ubuntu 14.04 64-bit.
/Users/milanlajtos/Downloads/prototyping-lualink/mylib.cpp:5:12: fatal error: 'lua.h' file not found
#include "lua.h"
^
1 error generated.
make[2]: *** [CMakeFiles/mylib.dir/mylib.cpp.o] Error 1
make[1]: *** [CMakeFiles/mylib.dir/all] Error 2
make: *** [all] Error 2
luajit: test.lua:1: module 'mylib' not found:
no field package.preload['mylib']
no file './mylib.lua'
no file '/Users/milanlajtos/torch/install/share/luajit-2.1.0-alpha/mylib.lua'
no file '/usr/local/share/lua/5.1/mylib.lua'
no file '/usr/local/share/lua/5.1/mylib/init.lua'
no file '/Users/milanlajtos/torch/install/share/lua/5.1/mylib.lua'
no file '/Users/milanlajtos/torch/install/share/lua/5.1/mylib/init.lua'
no file 'build/libmylib.so'
stack traceback:
[C]: in function 'require'
test.lua:1: in main chunk
[C]: at 0x0107188630
Changing the path in CMakeLists.txt resulted in this:
CMake Warning (dev):
Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake
--help-policy CMP0042" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
MACOSX_RPATH is not specified for the following targets:
mylib
This warning is for project developers. Use -Wno-dev to suppress it.
-- Generating done
-- Build files have been written to: /Users/milanlajtos/Downloads/prototyping-lualink/build
Scanning dependencies of target mylib
[100%] Building CXX object CMakeFiles/mylib.dir/mylib.cpp.o
Linking CXX shared library libmylib.dylib
Undefined symbols for architecture x86_64:
"_luaL_register", referenced from:
_luaopen_mylib in mylib.cpp.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libmylib.dylib] Error 1
make[1]: *** [CMakeFiles/mylib.dir/all] Error 2
make: *** [all] Error 2
luajit: test.lua:1: module 'mylib' not found:
no field package.preload['mylib']
no file './mylib.lua'
no file '/Users/milanlajtos/torch/install/share/luajit-2.1.0-alpha/mylib.lua'
no file '/usr/local/share/lua/5.1/mylib.lua'
no file '/usr/local/share/lua/5.1/mylib/init.lua'
no file '/Users/milanlajtos/torch/install/share/lua/5.1/mylib.lua'
no file '/Users/milanlajtos/torch/install/share/lua/5.1/mylib/init.lua'
no file 'build/libmylib.so'
stack traceback:
[C]: in function 'require'
test.lua:1: in main chunk
[C]: at 0x010197b630
Hi. Thanks! Can you git pull, and retry please?
(By the way, a relevant page I'm using to help figure this out: http://lua-users.org/wiki/BuildingModules )
Hmmm, looks like what I might need is what's in TorchPackage.cmake, ie https://github.com/torch/torch7/blob/master/cmake/TorchPackage.cmake#L20 :
IF(APPLE)
SET_TARGET_PROPERTIES(${package} PROPERTIES
LINK_FLAGS "-undefined dynamic_lookup")
ENDIF()
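As a rough sketch, applying that to the prototyping-lualink mylib target (assuming it is declared as a plain shared library; this is not the repo's actual CMakeLists.txt) would look something like:
ADD_LIBRARY(mylib SHARED mylib.cpp)
IF(APPLE)
  # leave the lua_* / luaL_* symbols unresolved at link time;
  # they get bound from the luajit process when the module is require'd
  SET_TARGET_PROPERTIES(mylib PROPERTIES
    LINK_FLAGS "-undefined dynamic_lookup")
ENDIF()
(The lua.h include path still needs to be on the compiler's include path, as per the earlier fatal error; -undefined dynamic_lookup only affects the link step, which is why the _luaL_register error should go away without linking any Lua library.)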
Hi, updated to use the SET_TARGET_PROPERTIES, as per TorchPackage.cmake. Can you git pull, and send the output of running it please?
(Alternatively, I've gone ahead and updated EasyCL etc. with this linker option, so you could just reinstall cltorch from the new-clew branch, and try the prototype only if the main install still fails.)
(Edited to specify that you should use the new-clew branch.)
(Sorry, I was away.)
I've tried the new-clew branch and the build failed with this:
Linking CXX shared library libTHCl.dylib
Undefined symbols for architecture x86_64:
"___clewCreateBuffer", referenced from:
CLKernel* CLKernel::input<TensorInfoCl>(int, TensorInfoCl const*) in THClKernels.cpp.o
CLKernel* CLKernel::inout<TensorInfoCl const>(int, TensorInfoCl const*) in THClKernels.cpp.o
CLKernel* CLKernel::output<TensorInfoCl const>(int, TensorInfoCl const*) in THClKernels.cpp.o
CLKernel* CLKernel::input<TensorInfoCl>(int, TensorInfoCl const*) in THClGather.cpp.o
CLKernel* CLKernel::input<TensorInfoCl>(int, TensorInfoCl const*) in THClScatter.cpp.o
"___clewReleaseEvent", referenced from:
_THClBlas_gemv in THClBlas.cpp.o
_THClBlas_ger in THClBlas.cpp.o
_THClBlas_gemm in THClBlas.cpp.o
"___clewSetKernelArg", referenced from:
CLKernel* CLKernel::input<TensorInfoCl>(int, TensorInfoCl const*) in THClKernels.cpp.o
CLKernel* CLKernel::inout<TensorInfoCl const>(int, TensorInfoCl const*) in THClKernels.cpp.o
CLKernel* CLKernel::output<TensorInfoCl const>(int, TensorInfoCl const*) in THClKernels.cpp.o
CLKernel* CLKernel::input<TensorInfoCl>(int, TensorInfoCl const*) in THClGather.cpp.o
CLKernel* CLKernel::input<TensorInfoCl>(int, TensorInfoCl const*) in THClScatter.cpp.o
"___clewWaitForEvents", referenced from:
_THClBlas_dot in THClBlas.cpp.o
_THClBlas_gemv in THClBlas.cpp.o
_THClBlas_ger in THClBlas.cpp.o
_THClBlas_gemm in THClBlas.cpp.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [lib/THCl/libTHCl.dylib] Error 1
make[1]: *** [lib/THCl/CMakeFiles/THCl.dir/all] Error 2
make: *** [all] Error 2
Error: Build error: Failed building.
I also pulled the changes for prototyping-lualink and everything went as expected:
Scanning dependencies of target mylib
[100%] Building CXX object CMakeFiles/mylib.dir/mylib.cpp.o
Linking CXX shared library libmylib.dylib
[100%] Built target mylib
hello!
hello!
Hello! Same error here. Will try to look more into it as well!
Hi. Ok, good news that the lua-linking problem is fixed :-) The clew linking is a bit curious. I would have thought it would be sufficient to link with EasyCL, which already contains these symbols, at least as undefined references, but maybe on Mac, undefined symbols from one .so are not considered 'good enough' for linking with another?
On Ubuntu I have the following results from nm, on the relevant .so's:
user@pear:~/torch/install/lib$ nm libEasyCL.so | grep clew
U __clewBuildProgram
U __clewCreateBuffer
U __clewCreateCommandQueue
U __clewCreateContext
U __clewCreateKernel
U __clewCreateProgramWithSource
U __clewEnqueueCopyBuffer
U __clewEnqueueNDRangeKernel
U __clewEnqueueReadBuffer
U __clewEnqueueWriteBuffer
U __clewFinish
U __clewGetDeviceIDs
U __clewGetDeviceInfo
U __clewGetPlatformIDs
U __clewGetPlatformInfo
U __clewGetProgramBuildInfo
U clewInit
U __clewReleaseCommandQueue
U __clewReleaseContext
U __clewReleaseEvent
U __clewReleaseKernel
U __clewReleaseMemObject
U __clewReleaseProgram
U __clewSetKernelArg
U __clewWaitForEvents
user@pear:~/torch/install/lib$ nm libTHCl.so | grep clew
U __clewCreateBuffer
U __clewReleaseEvent
U __clewSetKernelArg
U __clewWaitForEvents
i.e., __clewSetKernelArg is in both, and undefined in both. At link time, for libTHCl.dylib, __clewSetKernelArg comes from libEasyCL.dylib. At runtime, __clewSetKernelArg will come from clew:
user@pear:~/torch/install/lib$ ldd libEasyCL.so
linux-vdso.so.1 => (0x00007fff5a1c2000)
libclew.so.1.0.0 => /home/user/torch/install/lib/libclew.so.1.0.0 (0x00007f68468aa000)
...
user@pear:~/torch/install/lib$ nm libclew.so | grep clew | grep "Arg"
00000000002051d8 B __clewSetKernelArg
clew is a proxy for the OpenCL library, so we don't need to bind to it either during build or during loading at runtime; we just bind at the time we actually want to use it. Actually, using clew is somewhat optional if we are creating a module that will only ever be used in the presence of an OpenCL-enabled device, as is the case for cltorch, but anyway, I use it everywhere and it works well for me, so until now I've seen no reason not to use it :-)
I'm going to guess that on a Mac we have two options: explicitly link libTHCl against libclew as well, or add the -undefined dynamic_lookup linker option, as for Lua. I'm currently tentatively leaning towards adding in the -undefined dynamic_lookup linker option, which will hopefully squelch all the other platform-specific errors I've been seeing on Mac builds.
Ok, so I've updated libTHCl.so to link also with libclew.so:
user@pear:~/git/cltorch$ ldd ~/torch/install/lib/libTHCl.so
...
libEasyCL.so => /home/user/torch/install/lib/../lib/libEasyCL.so (0x00007ffbb5015000)
libclew.so.1.0.0 => /home/user/torch/install/lib/../lib/libclew.so.1.0.0 (0x00007ffbb4e0f000)
...
Can you pull down the latest version from the new-clew branch (i.e. git pull), and try again?
(Hmmm, I contradicted myself, and did it the opposite way I said I would do :-P Anyway...)
new-clew works for me:
th> cltorch.getDeviceProperties(1)
{
deviceType : "CPU"
maxClockFrequency : 2300
deviceName : "Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz"
maxMemAllocSizeMB : 4096
globalMemCachelineSizeKB : 6144
deviceVersion : "OpenCL 1.2 "
localMemSizeKB : 32
openClCVersion : "OpenCL C 1.2 "
maxWorkGroupSize : 1024
globalMemSizeMB : 16384
platformVendor : "Apple"
maxComputeUnits : 8
}
[0.0006s]
th> cltorch.getDeviceProperties(2)
{
deviceType : "GPU"
maxClockFrequency : 1200
deviceName : "HD Graphics 4000"
maxMemAllocSizeMB : 256
globalMemCachelineSizeKB : 0
deviceVersion : "OpenCL 1.2 "
localMemSizeKB : 64
openClCVersion : "OpenCL C 1.2 "
maxWorkGroupSize : 512
globalMemSizeMB : 1024
platformVendor : "Apple"
maxComputeUnits : 16
}
[0.0002s]
th> cltorch.getDeviceProperties(3)
{
deviceType : "GPU"
maxClockFrequency : 774
deviceName : "GeForce GT 650M"
maxMemAllocSizeMB : 128
globalMemCachelineSizeKB : 0
deviceVersion : "OpenCL 1.2 "
localMemSizeKB : 48
openClCVersion : "OpenCL C 1.2 "
maxWorkGroupSize : 1024
globalMemSizeMB : 512
platformVendor : "Apple"
maxComputeUnits : 2
}
Pulled the changes; the build was successful, but loading failed:
th> require 'cltorch'
...s/milanlajtos/torch/install/share/lua/5.1/trepl/init.lua:319: ...milanlajtos/torch/install/share/lua/5.1/cltorch/init.lua:19: cannot load '/Users/milanlajtos/torch/install/lib/lua/5.1/libcltorch.so'
stack traceback:
[C]: in function 'error'
...s/milanlajtos/torch/install/share/lua/5.1/trepl/init.lua:319: in function 'f'
[string "local f = function() return require 'cltorch'..."]:1: in main chunk
[C]: in function 'xpcall'
...s/milanlajtos/torch/install/share/lua/5.1/trepl/init.lua:583: in function 'repl'
...jtos/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
[C]: at 0x0107ae4630
@szagoruyko Ok, cool :-) @mlajtos hmmm, ok, pondering...
@mlajtos: can you confirm that other libraries load for you, e.g. require 'nn'?
Wait, fish is not loading some paths properly; bash is okay. Fixed! Thank you very much, @hughperkins! :)
Ok, cool :-) So, no more segfault, it builds ok, and at least loads ok now?
It loads, and it seems to work as it should:
th> c = torch.ClTensor{7,4,5}
Using Apple platform: Apple
Using device: Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz
[0.0040s]
th> c
7
4
5
[torch.ClTensor of size 3]
:)
Cool :-)
@hughperkins how do you run the tests? It would be nice to have cltorch.test(), as in cutorch.
I do:
./run-unit-tensor.sh
... but you're right that I should... oh... I see... I don't have to migrate my tests to another testing API, just add in a cltorch.test() function? Seems doable. Will ponder this.
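For reference, a minimal sketch of what such an entry point might look like, built on torch.Tester; the test body and the ClTensor:double() conversion are assumptions on my part, not cltorch's actual test code:
-- hedged sketch only: a cltorch.test() entry point built on torch.Tester,
-- imagined as living at the end of cltorch's init.lua (so the cltorch table exists)
local function test()
   local tester = torch.Tester()
   local tests = {}
   -- assumes ClTensor:double() copies data back to the host, as with cutorch
   function tests.clTensorRoundTrip()
      local a = torch.Tensor{7, 4, 5}
      local c = torch.ClTensor{7, 4, 5}
      tester:assertTensorEq(a, c:double(), 1e-6, 'ClTensor values should round-trip')
   end
   tester:add(tests)
   tester:run()
end
cltorch.test = test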
Ok. Merged to master, and it seems that installing via luarocks install cltorch works ok, at least on my particular platform => closing.
run-unit-tensor.sh does not exist, so I tried run-test-storage.sh and got the following:
$ ./run-test-storage.sh
./run-test-storage.sh: line 5: /Users/milanlajtos/torch/activate: No such file or directory
However, I managed to use cltorch with Karpathy's char-rnn. I got a ton of errors, and it was twice as slow as the CPU, but it worked.
Again, thank you for your effort. You rock!
@mlajtos yes, char-rnn runs on opencl, but is not super speedy yet. One step at a time :-)
(By the way, what errors do you get when you run char-rnn? (For me it runs clean, just not very speedy) )
cltorch and clnn build fine now, but running char-rnn doesn't work at all. The output is the following:
Exowide:char-rnn mnemonis$ th train.lua -data_dir data/tinyshakespeare/ -opencl 1
registering spatialconvolutionmm
using OpenCL on GPU 0...
loading data files...
cutting off end of data so that the batches/sequences divide evenly
reshaping tensor...
data load done. Number of data batches in train: 423, val: 23, test: 0
vocab size: 65
creating an LSTM with 2 layers
Using Apple platform: Apple
Using device: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
statefultimer v0.6
THClApply.cl build log:
@Ambext Ah, thanks. I can probably fix this (I mean, I can... and probably should), but... it looks like you are running this on the CPU? I mean, not on the CPU's integrated graphics GPU, but on the actual 'cpu' bit of the CPU. cltorch doesn't target that scenario at all. I mean, it probably could, but I think it's better to get things running ok-ish on the 'gpu' part of CPUs first. Can you try choosing a different device, using the -gpuid option?
(Note: you can view which device is which by doing th -l cltorch, and then cltorch.getDeviceProperties(1), cltorch.getDeviceProperties(2), etc. I think you need to subtract one from this number when you pass it to char-rnn's train.lua.)
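For example, a quick way to list them all from the th prompt; this assumes cltorch.getDeviceCount() is available alongside getDeviceProperties(), which I believe it is:
require 'cltorch'
-- print index, type and name for each OpenCL device
for i = 1, cltorch.getDeviceCount() do
   local props = cltorch.getDeviceProperties(i)
   print(i, props.deviceType, props.deviceName)
end
-- char-rnn's -gpuid appears to be zero-based, so pass i - 1 for the device you pick,
-- e.g. th train.lua -data_dir data/tinyshakespeare/ -opencl 1 -gpuid 1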
(Edit: basically, in the bit where it says Using device: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz, I would expect it to say instead something like Using device: HD Graphics 4000, as per szagoruyko's example above.)
(Hmmm, perhaps I should make getDeviceProperties and setDevice expose only GPU-type devices, including integrated graphics, but not the cpu-as-cpu bit. There seems to be no obvious reason for exposing unusable devices, in fact.)
Referring to char-rnn: "If you'd like to use OpenCL GPU computing, you'll first need to install the cltorch and clnn packages, and then use the option -opencl 1 during training."
Any -gpuid value (besides -gpuid -1) yields a cutorch / cunn not installed error. Any -opencl X with X >= 2 just executes with no error or message (I assume it just falls back on the CPU). -opencl 1 returns the error mentioned above.
First, thank you for fixing the build for OS X.
Now, loading cltorch on Mac fails with a segfault:
I don't know how to get the stack trace.