hughperkins / cltorch

An OpenCL backend for torch.

char-rnn: model_utils.lua:76: bad argument #1 to 'set' #7

Closed Ambext closed 8 years ago

Ambext commented 9 years ago

Coming from issue #5

Ambext commented 9 years ago

Exowide:~ mnemonis$ th -l cltorch -e "require 'cltorch.unit_tensor'; tester = torch.Tester(); tester.countasserts = 0; cltorch.tests.tensor.test_apply()"
Using Apple platform: Apple
Using device: AMD Radeon R9 M295X Compute Engine
statefultimer v0.6
left
 2.7412  2.5659  3.3092  2.2399
 2.7899  2.9085  3.0640  3.0911
 2.6912  2.2286  2.8776  3.4564
 2.5604  2.2797  2.3517  3.0109
 3.2104  2.4476  3.1951  2.8707
 3.1436  2.4851  2.2896  3.3255
[torch.DoubleTensor of size 6x4]

right
 0.0071 -0.1354  0.3848 -0.4443
 0.0441  0.1303  0.2353  0.2528
-0.0321 -0.4562  0.1084  0.4666
-0.1401 -0.4031 -0.3312  0.2004
 0.3268 -0.2403  0.3175  0.1035
 0.2859 -0.2062 -0.3930  0.3942
[torch.DoubleTensor of size 6x4]

diff
 2.7341  2.7013  2.9243  2.6842
 2.7457  2.7781  2.8287  2.8384
 2.7233  2.6848  2.7692  2.9898
 2.7005  2.6828  2.6829  2.8105
 2.8837  2.6879  2.8776  2.7672
 2.8577  2.6913  2.6826  2.9313
[torch.DoubleTensor of size 6x4]

left
 2.7412  2.5659  3.3092  2.2399
 2.7899  2.9085  3.0640  3.0911
 2.6912  2.2286  2.8776  3.4564
 2.5604  2.2797  2.3517  3.0109
 3.2104  2.4476  3.1951  2.8707
 3.1436  2.4851  2.2896  3.3255
[torch.DoubleTensor of size 6x4]

right
 0.0071 -0.1354  0.3848 -0.4443
 0.0441  0.1303  0.2353  0.2528
-0.0321 -0.4562  0.1084  0.4666
-0.1401 -0.4031 -0.3312  0.2004
 0.3268 -0.2403  0.3175  0.1035
 0.2859 -0.2062 -0.3930  0.3942
[torch.DoubleTensor of size 6x4]

diff
 2.7341  2.7013  2.9243  2.6842
 2.7457  2.7781  2.8287  2.8384
 2.7233  2.6848  2.7692  2.9898
 2.7005  2.6828  2.6829  2.8105
 2.8837  2.6879  2.8776  2.7672
 2.8577  2.6913  2.6826  2.9313
[torch.DoubleTensor of size 6x4]

Ambext commented 9 years ago

Apologies but I will only be online for some more minutes and then disconnect for quite a while, maybe even until Tuesday afternoon AEST Sydney Time.

hughperkins commented 9 years ago

:-) Thank you for all your hard work. Ah, AEST is Australia? So we are almost in the same timezone actually. (I'm in Singapore/Hong Kong/Beijing time).

Ambext commented 9 years ago

my pleasure. I wish I could grant you remote access so you could iterate more rapidly, as in all honesty I am only typing your commands...

Maybe creating a user with restricted rights and allowing remote access?

AEST = Australian Eastern Standard Time. My timezone is also defined by two kids under three...

Off for a bit. Cheers.

hughperkins commented 9 years ago

Apparently docker is quite secure, as long as you don't grant me root access. I don't have any experience in creating a docker container though. I think it's a bit like creating a chroot. I think the only tricky bit would be, you'd have to install the driver inside, and connect the inside to the GPU somehow.

hughperkins commented 9 years ago

I guess if it was me, I wouldn't trust docker very much, I'd probably create a 'throwaway' operating system, by eg installing to a usb key, and disabling the hard drives in bios, something like that, and then just grant root.

hughperkins commented 9 years ago

(well, I might not grant root, but note that if you install to usb, it writes a lot of temp stuff, unless you take pains to stop that, eg by assigning tmpfs to /var/tmp, and creating a soft link from /home/user/.cache to another tmpfs area, stuff like that, both of which need root really.)

hughperkins commented 9 years ago

Hmmm, I wonder how much the issues are related to the specific graphics card, and how much to the driver and operating system. Probably a bit of both. So ... docker I guess :-(

hughperkins commented 9 years ago

(hmmm, or maybe just a normal restricted user perhaps?)

hughperkins commented 9 years ago

So, what I reckon is going on is that probably I'm copying data from the gpu to the host, without having first waited for the calculations to finish. I do this for two reasons:

  1. firstly, I'm trying to make char-rnn run fast on certain gpus, and I notice that adding in sync calls carries a very high overhead, eg can increase runtime easily by 200%. So I removed them all :-P Now, on my gpu apparently that works fine, and on other people's too, but not on yours
  2. I should probably read through the manual really, to find the exact conditions under which syncing is/isn't required. But I haven't had time :-P

The easiest way forward for me personally would be to try adding in sync calls until the tests pass on your gpu.

Having said that, what we should probably try is simply putting sync calls everywhere, and check the tests pass.

That's easy to do. The functionality is already present. Simply call cltorch.setAddFinish(1), and then a sync call will be added basically after every gpu kernel launch. It will be very slow :-P but hopefully the tests will pass. To do this, I think you just need to do:

th -l cltorch -e 'cltorch.setAddFinish(1); cltorch.test()'
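To compare a run with forced syncs against a run without them, both command lines can be generated from one helper. This is only a sketch: `cltorch_test_cmd` is a hypothetical helper name, not part of cltorch. It just prints the `th` command so it can be inspected before running, and it assumes the Lua snippet contains no single quotes.

```shell
# Hypothetical helper (not part of cltorch): print the `th` command for a test call.
# With sync=1 the call is prefixed with cltorch.setAddFinish(1), so a sync is
# added after every kernel launch, as described above.
cltorch_test_cmd() {
  local sync="$1"       # "1" to force syncs, anything else to leave them off
  local test_call="$2"  # Lua expression to run, e.g. cltorch.test()
  local prefix=""
  if [ "$sync" = "1" ]; then
    prefix="cltorch.setAddFinish(1); "
  fi
  printf "th -l cltorch -e '%s%s'\n" "$prefix" "$test_call"
}
```

Running `eval "$(cltorch_test_cmd 1 'cltorch.test()')"` executes the synced variant; passing 0 instead gives the plain command, so the two logs can be compared side by side.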

Ambext commented 9 years ago

Hey Hugh, all good thanks. The test didn't pass without errors. Here is a link to the complete log https://www.cubbyusercontent.com/pli/150710+Sync+test+Terminal+Saved+Output.txt/_ec365ae279db47248fd52466578036bf

hughperkins commented 9 years ago

Wow, it's totally broken, on your machine, even with the syncs in. That's interesting. Pondering...

hughperkins commented 9 years ago

Ok. Let's start with something simple (I'm going to work in 30 minutes by the way):

th -l cltorch -e "require 'cltorch.unit_tensor'; tester = torch.Tester(); tester.countasserts = 0; cltorch.setAddFinish(1); cltorch.tests.tensor.test_fills()"

This is just going to run a single kernel, that sets a tensor to the same value everywhere. You can see the test in the 'tests/unit_tensor.lua' file, at line 419:

function cltorch.tests.tensor.test_fills()
  C = torch.ClTensor(3,2)
  A = torch.FloatTensor(3,2)
  C:fill(1.345)
  A:fill(1.345)
  tester:asserteq(A, C:float())

  C = torch.ClTensor(3,2)
  A = torch.FloatTensor(3,2)
  C:fill(1.345)
  A:fill(1.345)
  C:zero()
  A:zero()
  tester:asserteq(A, C:float())

  A = torch.FloatTensor.zeros(torch.FloatTensor.new(), 3, 5)
  C = torch.ClTensor.zeros(torch.ClTensor.new(), 3, 5)
  tester:asserteq(A, C:float())

  A = torch.FloatTensor.ones(torch.FloatTensor.new(), 3, 5)
  C = torch.ClTensor.ones(torch.ClTensor.new(), 3, 5)
  tester:asserteq(A, C:float())

end

Ambext commented 9 years ago

:-) that works at least

Exowide:~ mnemonis$ th -l cltorch -e "require 'cltorch.unit_tensor'; tester = torch.Tester(); tester.countasserts = 0; cltorch.setAddFinish(1); cltorch.tests.tensor.test_fills()"
Using Apple platform: Apple
Using device: AMD Radeon R9 M295X Compute Engine
statefultimer v0.6
left
 1.3450  1.3450
 1.3450  1.3450
 1.3450  1.3450
[torch.FloatTensor of size 3x2]

right
-1.7014e+38 -1.7014e+38
-1.7014e+38 -1.7014e+38
-1.7014e+38 -1.7014e+38
[torch.FloatTensor of size 3x2]

diff
 1.7014e+38  1.7014e+38
 1.7014e+38  1.7014e+38
 1.7014e+38  1.7014e+38
[torch.FloatTensor of size 3x2]

left
 0  0
 0  0
 0  0
[torch.FloatTensor of size 3x2]

right
-4.2535e+37 -4.2535e+37
-4.2535e+37 -4.2535e+37
 0.0000e+00  0.0000e+00
[torch.FloatTensor of size 3x2]

diff
 4.2535e+37  4.2535e+37
 4.2535e+37  4.2535e+37
 0.0000e+00  0.0000e+00
[torch.FloatTensor of size 3x2]

left
 1  1  1  1  1
 1  1  1  1  1
 1  1  1  1  1
[torch.FloatTensor of size 3x5]

right
 0  0  0  0  0
 0  0  0  0  0
 0  0  0  0  0
[torch.FloatTensor of size 3x5]

diff
 1  1  1  1  1
 1  1  1  1  1
 1  1  1  1  1
[torch.FloatTensor of size 3x5]

hughperkins commented 9 years ago

Well, it runs, but do you see each bit that says 'left' and 'right'? Those should be identical in each pair. The bits that say 'diff' are the difference between left and right, and should be all zeros.
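The same check can be done by hand on values copied out of a log. The sketch below is not how torch.Tester compares tensors; `max_abs_diff` is a hypothetical helper that takes two space-separated lists of numbers and prints the largest absolute difference, which should be 0 for a passing left/right pair.

```shell
# Hypothetical helper: largest absolute element-wise difference between two
# space-separated lists of numbers. Prints "length mismatch" and fails if the
# lists differ in length.
max_abs_diff() {
  awk -v left="$1" -v right="$2" 'BEGIN {
    n = split(left, a, " ")
    m = split(right, b, " ")
    if (n != m) { print "length mismatch"; exit 1 }
    max = 0
    for (i = 1; i <= n; i++) {
      d = a[i] - b[i]
      if (d < 0) d = -d      # absolute value
      if (d > max) max = d
    }
    print max
  }'
}
```

For example, `max_abs_diff "1 1 1" "0 0 0"` prints 1, the same value that fills the last 'diff' block above.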

hughperkins commented 9 years ago

Maybe let's start with some EasyCL tests. Can you do the following please, which will install EasyCL and run the EasyCL unit tests? EasyCL is a wrapper around the OpenCL API, and handles wrapping OpenCL buffers and so on.

git clone --recursive https://github.com/hughperkins/EasyCL.git -b -no-internal-lua
cd EasyCL
mkdir build
cd build
ccmake ..
# press 'c' for configure
# fill in CMAKE_INSTALL_PREFIX with some absolute path, eg `/home/user/EasyCL/dist`, or whatever is convenient for you.  Best not to use a relative path
# change 'BUILD_TESTS' to 'ON'
# press 'c' for configure
# press 'g' for generate
make -j 4 install

At this point, you should probably have an executable easycl_unittests in your current directory. But it won't run, because it expects a bunch of opencl kernels - .cl files - in your current directory. So, do:

cp ../test/*.cl .

And then try:

./easycl_unittests

Ambext commented 9 years ago

I get a clone error:

Cloning into 'EasyCL'...
fatal: Remote branch -no-internal-lua not found in upstream origin

hughperkins commented 9 years ago

sorry, should be no-internal-lua-option

Ambext commented 9 years ago

ok

hughperkins commented 9 years ago

like, git clone --recursive https://github.com/hughperkins/EasyCL.git -b no-internal-lua-option

Ambext commented 9 years ago

I am not sure `cp ../test/*.cl .` went through. Discard previous.

Ambext commented 9 years ago

Exowide:build mnemonis$ ./easycl_unittests args: ./easycl_unittests --gtest_filter=-SLOW Note: Google Test filter = -SLOW [==========] Running 54 tests from 22 test cases. [----------] Global test environment set-up. [----------] 1 test from testscalars [ RUN ] testscalars.test1 found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine -156 -155 -154 -153 -152 3000 3001 3002 3003 3004 -2524653 -2524652 -2524651 -2524650 -2524649 1353523545 1353523546 1353523547 1353523548 1353523549 1.234 2.234 3.234 4.234 5.234 tests completed ok [ OK ] testscalars.test1 (116 ms) [----------] 1 test from testscalars (116 ms total)

[----------] 1 test from testintarray [ RUN ] testintarray.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine 5 8 26 14 17 7 10 34 16 19 tests completed ok [ OK ] testintarray.main (20 ms) [----------] 1 test from testintarray (20 ms total)

[----------] 3 tests from testfloatwrapper [ RUN ] testfloatwrapper.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testfloatwrapper.main (39 ms) [ RUN ] testfloatwrapper.singlecopytodevice Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testfloatwrapper.singlecopytodevice (2 ms) [ RUN ] testfloatwrapper.doublecopytodevice Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testfloatwrapper.doublecopytodevice (11 ms) [----------] 3 tests from testfloatwrapper (52 ms total)

[----------] 1 test from testclarray [ RUN ] testclarray.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testclarray.main (3 ms) [----------] 1 test from testclarray (4 ms total)

[----------] 1 test from testfloatwrapperconst [ RUN ] testfloatwrapperconst.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testfloatwrapperconst.main (3 ms) [----------] 1 test from testfloatwrapperconst (3 ms total)

[----------] 1 test from testintwrapper [ RUN ] testintwrapper.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testintwrapper.main (4 ms) [----------] 1 test from testintwrapper (4 ms total)

[----------] 1 test from test_scenario_te42kyfo [ RUN ] test_scenario_te42kyfo.main Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine finished [ OK ] test_scenario_te42kyfo.main (70 ms) [----------] 1 test from test_scenario_te42kyfo (70 ms total)

[----------] 1 test from testfloatarray [ RUN ] testfloatarray.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine 5 8 26 14 17 7 10 34 16 19 tests completed ok [ OK ] testfloatarray.main (18 ms) [----------] 1 test from testfloatarray (18 ms total)

[----------] 2 tests from testeasycl [ RUN ] testeasycl.main start found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testeasycl.main (3 ms) [ RUN ] testeasycl.power2helper [ OK ] testeasycl.power2helper (0 ms) [----------] 2 tests from testeasycl (3 ms total)

[----------] 1 test from testinout [ RUN ] testinout.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testinout.main (16 ms) [----------] 1 test from testinout (16 ms total)

[----------] 5 tests from testlocal [ RUN ] testlocal.globalreduce Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testlocal.globalreduce (772 ms) [ RUN ] testlocal.localreduce Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testlocal.localreduce (707 ms) [ RUN ] testlocal.reduceviascratch_multipleworkgroups Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine expected sum, calc'd via cpu, : 294906 [ OK ] testlocal.reduceviascratch_multipleworkgroups (5 ms) [ RUN ] testlocal.reduceviascratch_multipleworkgroups_ints Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine numworkgroups 256 workgroupsize 256 N 65536 [ OK ] testlocal.reduceviascratch_multipleworkgroups_ints (4 ms) [ RUN ] testlocal.reduce_multipleworkgroups_ints_noscratch Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine numworkgroups 256 workgroupsize 256 N 65536 [ OK ] testlocal.reduce_multipleworkgroups_ints_noscratch (4 ms) [----------] 5 tests from testlocal (1492 ms total)

[----------] 1 test from testdefines [ RUN ] testdefines.simple Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testdefines.simple (18 ms) [----------] 1 test from testdefines (18 ms total)

[----------] 1 test from testbuildlog [ RUN ] testbuildlog.main Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine testbuildlog.cl build log:

:4:4: error: use of undeclared identifier 'someerrorxyz' someerrorxyz; ^ kernel build error: kernel source: 1: kernel void foo( ) { 2: someerrorxyz; 3: } 4: 5: Something went wrong with clCreateKernel, code -45 testbuildlog.cl build log: :4:4: error: use of undeclared identifier 'someerrorxyz' someerrorxyz; ^ kernel source: 1: kernel void foo( ) { 2: someerrorxyz; 3: } 4: 5: Something went wrong with clCreateKernel, code -45 testbuildlog.cl build log: :4:4: error: use of undeclared identifier 'someerrorxyz' someerrorxyz; ^ [ OK ] testbuildlog.main (43 ms) [----------] 1 test from testbuildlog (43 ms total) [----------] 5 tests from testnewinstantiations [ RUN ] testnewinstantiations.createForFirstGpu Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testnewinstantiations.createForFirstGpu (4 ms) [ RUN ] testnewinstantiations.createForIndexedGpu Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testnewinstantiations.createForIndexedGpu (4 ms) [ RUN ] testnewinstantiations.createForIndexedDevice Using Apple platform: Apple Using device: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testnewinstantiations.createForIndexedDevice (66 ms) [ RUN ] testnewinstantiations.createForPlatformDeviceIndexes Using Apple platform: Apple Using device: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz [ OK ] testnewinstantiations.createForPlatformDeviceIndexes (1 ms) [ RUN ] testnewinstantiations.createForFirstGpuOtherwiseCpu Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testnewinstantiations.createForFirstGpuOtherwiseCpu (4 ms) [----------] 5 tests from testnewinstantiations (79 ms total) [----------] 1 test from testucharwrapper [ RUN ] testucharwrapper.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testucharwrapper.main (106 ms) 
[----------] 1 test from testucharwrapper (106 ms total) [----------] 2 tests from testkernelstore [ RUN ] testkernelstore.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testkernelstore.main (3 ms) [ RUN ] testkernelstore.cl_deletes found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine [ OK ] testkernelstore.cl_deletes (3 ms) [----------] 2 tests from testkernelstore (6 ms total) [----------] 1 test from testdirtywrapper [ RUN ] testdirtywrapper.main found opencl library Using Apple platform: Apple Using device: AMD Radeon R9 M295X Compute Engine tests completed ok [ OK ] testdirtywrapper.main (3 ms) [----------] 1 test from testdirtywrapper (3 ms total) [----------] 2 tests from testDeviceInfo [ RUN ] testDeviceInfo.basic platformVendor: Apple platformName: Apple deviceType: 2 globalMemSize: 17179869184 localMemSize: 32768 globalMemCachelineSize: 8388608 maxMemAllocSize: 4294967296 maxComputeUnits: 8 maxWorkGroupSize: 1024 maxWorkItemDimensions: 3 deviceName: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz openClCVersion: OpenCL C 1.2 deviceVersion: OpenCL 1.2 maxClockFrequency: 4000 platformVendor: Apple platformName: Apple deviceType: 4 globalMemSize: 4294967296 localMemSize: 32768 globalMemCachelineSize: 0 maxMemAllocSize: 1073741824 maxComputeUnits: 32 maxWorkGroupSize: 256 maxWorkItemDimensions: 3 deviceName: AMD Radeon R9 M295X Compute Engine openClCVersion: OpenCL C 1.2 deviceVersion: OpenCL 1.2 maxClockFrequency: 850 [ OK ] testDeviceInfo.basic (0 ms) [ RUN ] testDeviceInfo.gpus platformVendor: Apple platformName: Apple deviceType: 4 globalMemSize: 4294967296 localMemSize: 32768 globalMemCachelineSize: 0 maxMemAllocSize: 1073741824 maxComputeUnits: 32 maxWorkGroupSize: 256 maxWorkItemDimensions: 3 deviceName: AMD Radeon R9 M295X Compute Engine openClCVersion: OpenCL C 1.2 deviceVersion: OpenCL 1.2 maxClockFrequency: 850 [ OK ] testDeviceInfo.gpus (0 ms) 
[----------] 2 tests from testDeviceInfo (0 ms total)

[----------] 11 tests from testLuaTemplater
[ RUN ] testLuaTemplater.basicsubstitution1
dyld: lazy symbol binding failed: Symbol not found: _luaL_newstate
  Referenced from: /Users/mnemonis/torch/install/lib/libEasyCL.dylib
  Expected in: flat namespace

dyld: Symbol not found: _luaL_newstate
  Referenced from: /Users/mnemonis/torch/install/lib/libEasyCL.dylib
  Expected in: flat namespace

Trace/BPT trap: 5

hughperkins commented 9 years ago

Hmmm, can you do cd build; ccmake .., and send a screenshot please? Just want to check the switch 'PROVIDE_LUA_ENGINE' mainly.

hughperkins commented 9 years ago

And also do:

nm libEasyCL.dylib | grep newstate

(nm shows the symbols defined inside a shared object, ie the names of the functions inside.)
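For reading nm output at a glance, a small filter can label each line. This is a sketch under the assumption that the output uses the default layout seen in this thread: an address, a type letter and a name for symbols the file defines, and just `U name` for symbols it expects from another library. `classify_nm` is a hypothetical helper name.

```shell
# Hypothetical helper: read `nm` output on stdin and label each symbol.
# "address type name" lines are reported as defined (any type letter);
# "U name" lines are reported as undefined, ie needed from another library.
# Lines in any other shape are ignored.
classify_nm() {
  awk '
    NF == 2 && $1 == "U" { print "undefined", $2; next }
    NF == 3              { print "defined", $3 }
  '
}
```

Usage would look like `nm libEasyCL.dylib | grep newstate | classify_nm`.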

Ambext commented 9 years ago

PROVIDE_LUA_ENGINE is on

[screenshot: ccmake options screen]

Ambext commented 9 years ago

Exowide:build mnemonis$ nm libEasyCL.dylib | grep newstate
0000000000018bf0 T _luaL_newstate
000000000002d990 T _lua_newstate
Exowide:build mnemonis$

hughperkins commented 9 years ago

Hmmm, those symbols look pretty similar to mine, except for the leading underscores, but I think that underscores are compiler-dependent:

$ nm libEasyCL.so | grep newstate
0000000000033770 T luaL_newstate
000000000004b750 T lua_newstate

hughperkins commented 9 years ago

the switches look correct. I would think that provide_lua should be 'on', as it is.

hughperkins commented 9 years ago

Oh... it is the test that is failing to link with lua. hmmm... pondering..

hughperkins commented 9 years ago

Googling a bit on this topic doesn't throw up much. I found:

Following the first one, I've added the option -flat_namespace. Can you update from git, rebuild, rerun the unit tests, and see what happens?

hughperkins commented 9 years ago

(if this doesn't work, I guess we can try -single_module instead, per the second link)

hughperkins commented 9 years ago

Another thing we could possibly try is swapping lines 52 and 53 in the CMakeLists.txt file, so that the templater source comes first, and the linker notices it needs lua, and then voila, here's the lua, and the linker is happy. Now:

${lua_src}
    ${TEMPLATESRC} )

after swap:

    ${TEMPLATESRC}
${lua_src} )

hughperkins commented 9 years ago

(note that this build stuff is kind of the pointless stuff that I'm quite happy to deal with without interaction if you want; it's not teaching you anything about torch, or networks or anything, just some useless knowledge about linkers and stuff :-P )

hughperkins commented 9 years ago

Hmmm, random thought. One other option is, you set up a jenkins server, and point it at my repo, and I can simply click 'build' when I want, and it will suck down the latest code from my repo, and run it. You'd have logfiles in jenkins, and you'd be able to see the code history in github. It's a fairly transparent, audited option, and relatively easy to set up.

Ambext commented 9 years ago

I have just run git pull (1 PM AEST) and then ran the test.

Same error as before.

[----------] 1 test from testbuildlog
[ RUN ] testbuildlog.main
Using Apple platform: Apple
Using device: AMD Radeon R9 M295X Compute Engine
testbuildlog.cl build log:

:4:4: error: use of undeclared identifier 'someerrorxyz'
someerrorxyz;
^
kernel build error:
kernel source:
1: kernel void foo( ) {
2: someerrorxyz;
3: }
4:
5:
Something went wrong with clCreateKernel, code -45
testbuildlog.cl build log:
:4:4: error: use of undeclared identifier 'someerrorxyz'
someerrorxyz;
^
kernel source:
1: kernel void foo( ) {
2: someerrorxyz;
3: }
4:
5:
Something went wrong with clCreateKernel, code -45
testbuildlog.cl build log:
:4:4: error: use of undeclared identifier 'someerrorxyz'
someerrorxyz;
^

Swapping the lines in CMake yields the following error:

CMake Error: Error in cmake code at /path/Documents/Code_Ressources/EasyCL/CMakeLists.txt:55:
Parse error. Expected a command name, got unquoted argument with text "${lua_src}".
-- Configuring incomplete, errors occurred!
See also "/Users/mnemonis/Documents/Code_Ressources/EasyCL/build/CMakeFiles/CMakeOutput.log".
make: *** [cmake_check_build_system] Error 1

hughperkins commented 9 years ago

for swapping the lines, can you check that the ) is still on the second of the two lines?

hughperkins commented 9 years ago

By the way, please ignore the message about someerrorxyz. That's a pretend error. It's normal. It's there to test that errors are trapped ok, which they are :-) The important bit is the message about the _luaL_newstate symbol being missing.

Ambext commented 9 years ago

Good catch, sorry I missed it.

Still concludes with:

[----------] 11 tests from testLuaTemplater
[ RUN ] testLuaTemplater.basicsubstitution1
dyld: lazy symbol binding failed: Symbol not found: _luaL_newstate
  Referenced from: /Users/mnemonis/torch/install/lib/libEasyCL.dylib
  Expected in: flat namespace

dyld: Symbol not found: _luaL_newstate
  Referenced from: /Users/mnemonis/torch/install/lib/libEasyCL.dylib
  Expected in: flat namespace

hughperkins commented 9 years ago

Ah, ok. By the way, to focus just on this test we can write:

./easycl_unittests tests=testLuaTemplater.basicsubst*

This will filter the tests so that only the failing lua templater test will run.

Ambext commented 9 years ago

ok

I am not sure how to try the "(if this doesnt work, I guess we can try -single_module instead, per the second link)" idea

hughperkins commented 9 years ago

Hmmm, I don't understand why it doesn't find the symbol, to be honest. Normally the way linking works is that each shared object (and a dylib is a shared object, I guess) contains definitions of functions, but also lists functions it needs to get from other shared objects.

So, for example, EasyCL needs symbols from clew, which we can see like this:

$ nm libEasyCL.so | grep clew
                 U __clewBuildProgram
                 U __clewCreateBuffer
                 U __clewCreateCommandQueue
                 U __clewCreateContext
                 U __clewCreateKernel

All those lines with a 'U' are not present in EasyCL. EasyCL will need to load them from some other library at runtime. Which library? From clew:

$ nm ../dist/lib/libclew.so | grep Create
00000000002052a0 B __clewCreateBuffer
00000000002052c0 B __clewCreateCommandQueue
00000000002052e8 B __clewCreateContext
00000000002052e0 B __clewCreateContextFromType

You can see that each of these symbols in libclew has a number on the left, and a 'B' instead of a 'U'. So, libclew contains these symbols, and libEasyCL will link with libclew at runtime. How does libEasyCL know to link with libclew at runtime? It's actually baked into the shared object, and you can view this using ldd:

$ ldd libEasyCL.so 
    linux-vdso.so.1 =>  (0x00007fffcffcd000)
    libclew.so.1.0.0 => /home/user/git/EasyCL/dist/lib/libclew.so.1.0.0 (0x00007f2812acb000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f28127a4000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f281249e000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f2812288000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2811ec2000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f2811cbe000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f2812f3d000)

On the third line, you can see it is searching for libclew, and so it will find the symbols at runtime.
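Note that `ldd` is a Linux tool; on macOS the equivalent listing of linked libraries comes from `otool -L`. A minimal sketch that picks one or the other follows. `deps_tool` is a hypothetical helper name, and its optional argument exists only so the choice can be checked on a machine that is not a Mac.

```shell
# Hypothetical helper: print the command that lists a binary's shared-library
# dependencies on this platform.
# $1 (optional): kernel name as printed by `uname -s`; defaults to the current system.
deps_tool() {
  case "${1:-$(uname -s)}" in
    Darwin) echo "otool -L" ;;  # macOS
    *)      echo "ldd" ;;       # Linux and anything else
  esac
}
```

On the Mac in this thread that would be used as `$(deps_tool) libEasyCL.dylib`, and likewise on `easycl_unittests`.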

Now, in our case, we have two relevant files: libEasyCL, and easycl_unittests:

If we use nm on libEasyCL, we can see that newstate is defined:

$ nm libEasyCL.so | grep newstate
0000000000033770 T luaL_newstate
000000000004b750 T lua_newstate

It's totally bizarre then that, at runtime, the linker says "well, I don't know where a symbol called luaL_newstate is. can't find it. sorry. gave up :-(".
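One detail the logs do allow checking: dyld names the file it actually loaded after "Referenced from:". In the output earlier in this thread that path is /Users/mnemonis/torch/install/lib/libEasyCL.dylib, while nm was run on libEasyCL.dylib in the build directory; the thread does not establish whether those two files are the same build. A sketch for pulling the path out of the error text, so that nm can be run on that exact file (`dyld_referenced_lib` is a hypothetical helper, and it assumes the path contains no spaces):

```shell
# Hypothetical helper: read dyld error text on stdin and print each distinct
# library path that follows "Referenced from:". Assumes paths without spaces.
dyld_referenced_lib() {
  grep -o 'Referenced from: [^ ]*' | sed 's/^Referenced from: //' | sort -u
}
```

dyld writes to stderr, so usage would be along the lines of `./easycl_unittests 2>&1 | dyld_referenced_lib`, followed by `nm` on the printed path piped through `grep newstate`.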

hughperkins commented 9 years ago

So, the key seems to be something to do with namespaces, hence trying the -flat_namespace option. This is defined in line 67 of CMakeLists.txt:

  IF(APPLE)
    SET_TARGET_PROPERTIES(EasyCL PROPERTIES LINK_FLAGS "-flat_namespace")
  ENDIF(APPLE)

hughperkins commented 9 years ago

I've updated it to read '-single_module'. Can you pull down the updates, rebuild, rerun, and see if this makes any difference?

hughperkins commented 9 years ago

Looking at the makefile for lua, it seems that on macosx, they add a compile-time define '-DLUA_USE_LINUX'. I've updated the CMakeLists.txt to add this option. Not sure if it makes any difference?

Ambext commented 9 years ago

I have just recloned the folder and done the whole thing again and... still got this:

dyld: lazy symbol binding failed: Symbol not found: _luaL_newstate
  Referenced from: /Users/mnemonis/torch/install/lib/libEasyCL.dylib
  Expected in: flat namespace

dyld: Symbol not found: _luaL_newstate
  Referenced from: /Users/mnemonis/torch/install/lib/libEasyCL.dylib
  Expected in: flat namespace

Can we recheck with the -flat_namespace option? I wonder if I did the pull / rebuild right at the time.

hughperkins commented 9 years ago

Hmmm, sorry, I really don't know why it doesn't work. I need to play for a bit, try different random things, but I don't have a Mac. To be honest, the linking when using cltorch is clearly working ok, so we could just work from the cltorch side I suppose. I'm thinking though that maybe it could make sense to try using https://www.teamviewer.com/en/index.aspx perhaps? Then you can see what I'm doing, but don't have to type everything?

Ambext commented 9 years ago

I am sorry for being such a pain on my side... OK for TeamViewer. Let's use direct mail to set it up.

Can we still retry the -flat_namespace option, I think I didn't rebuild correctly before running the test again.

hughperkins commented 9 years ago

Ok. You should just be able to make the modification directly in line 67 of CMakeLists.txt:

    SET_TARGET_PROPERTIES(EasyCL PROPERTIES LINK_FLAGS "-flat_namespace")

... and then rebuild/reinstall and so on.

Ambext commented 9 years ago

ok. Wasn't sure that was the only location. Will get things done tomorrow, today was off :-)

hughperkins commented 9 years ago

Yes, it's just a linker option.