Closed. hery closed this issue 9 years ago.
Here are the dedicated GPU specs:
th> props = cltorch.getDeviceProperties(2)
th> props
{
deviceType : "GPU"
localMemSizeKB : 32
globalMemSizeMB : 2048
deviceVersion : "OpenCL 1.2 "
platformVendor : "Apple"
deviceName : "AMD Radeon R9 M370X Compute Engine"
maxComputeUnits : 10
globalMemCachelineSizeKB : 0
openClCVersion : "OpenCL C 1.2 "
maxClockFrequency : 800
maxMemAllocSizeMB : 512
maxWorkGroupSize : 256
}
Hi hery, thank you for the bug report. Don't suppose... do you mind running luarocks install cltorch to update to the latest cltorch, and then providing the output of th -l cltorch -e 'cltorch.about()' to confirm? I'm not saying that will fix the issue, but since that's easy to do, it's probably good to do first, just to check.
(By the way, on my own computer, the following script:
require 'cltorch'
local function eval(expression)
loadstring('res=' .. expression)()
print(expression, res)
end
eval('cltorch.getDevice()')
-- Switch to dedicated GPU. Everything breaks if we uncomment those lines.
cltorch.setDevice(2)
cltorch.synchronize()
cltorch.finish() -- not sure this line is needed
print('Current device: ', cltorch.getDevice()) -- this prints out, but then hangs.
-- Things print out properly on the integrated GPU (device(1))
C = torch.ClTensor{{3,2,4},{9,7,5}}
print(C:t())
print(C:transpose(1,2))
eval('cltorch.getDeviceCount()')
eval('cltorch.getDevice()')
b = torch.ClTensor({3,5,2})
eval('b')
cltorch.setDevice(1)
eval('cltorch.getDevice()')
a = torch.ClTensor({2,4,7})
eval('a')
eval('a:add(2)')
cltorch.setDevice(2)
eval('b:add(2)')
produces the following output:
$ th testhery.lua
cltorch.getDevice() 1
Using NVIDIA Corporation platform: NVIDIA CUDA
Using device: GeForce 940M
Current device: 2
3 9
2 7
4 5
[torch.ClTensor of size 3x2]
3 9
2 7
4 5
[torch.ClTensor of size 3x2]
cltorch.getDeviceCount() 2
cltorch.getDevice() 2
b 3
5
2
[torch.ClTensor of size 3]
cltorch.getDevice() 1
Using Intel platform: Intel Gen OCL Driver
Using device: Intel(R) HD Graphics BroadWell U-Processor GT2
a 2
4
7
[torch.ClTensor of size 3]
a:add(2) 4
6
9
[torch.ClTensor of size 3]
b:add(2) 5
7
4
[torch.ClTensor of size 3]
.... hence I cannot reproduce the problem on my own machine, and hence this is going to need some digging :-/ Which is why we should check the obvious things first :-)
Hi Hugh, thanks for taking the time to look into this!
Here's the output of the about() command after updating cltorch.
pandaman$ th -l cltorch -e 'cltorch.about()'
cltorch. OpenCL backend for Torch
Built from commit 3e6d445
More info, doc: https://github.com/hughperkins/cltorch
Ok, and problem is still there?
Yes, still there!
Hmmm... that's odd... it doesn't do very much at that point, just creates a command queue and calls clFinish()
on it. It's a mystery. I think I might try seeing if you can run the EasyCL unit tests on it. There are two challenges for this:
Hmmm, whilst I'm writing this, one idea occurs to me: if you go into /etc/OpenCL/vendors, you should see two '.icd' files. If you rename the one for your first GPU to have a '_' suffix, it will no longer be available to OpenCL, so you won't need to change device. If you do this, do things work ok? Meaning: is the problem related to changing devices, or is it a problem more to do with something about the device itself?
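(On Linux, that renaming would look something like the following sketch; the actual .icd filenames depend on the installed drivers, so the names here are placeholders:)

```shell
# Sketch only: .icd filenames vary by driver; these names are placeholders.
cd /etc/OpenCL/vendors
ls                               # e.g. intel.icd  amdocl64.icd
sudo mv intel.icd intel.icd_     # hide the first GPU's driver from OpenCL
# ... re-run the cltorch script ...
sudo mv intel.icd_ intel.icd     # restore it afterwards
```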
I don't have an /etc/OpenCL directory. (running OS X 10.10)
If running EasyCL unit tests can help, I can look into that. Or I can play with the source and see where it breaks, if that's an option.
Hmmm.... ok... it breaks at such an early stage, we could probably start by just running some really simple 'hello world' type program, like https://developer.apple.com/library/mac/samplecode/OpenCL_Hello_World_Example/Listings/hello_c.html , and then go from there. You'd want to change the clGetDeviceIDs line, and the clCreateContext line, to get it to choose your second GPU.
However, if you can get the EasyCL tests building/compiling, then that would rock. Actually, we could just disable all the Lua stuff in the EasyCL tests, and build it like that. I used to have an option to do that, I took it away, I might add it back...
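For reference, the device-selection change to Apple's hello_c sample might look roughly like this. A sketch only, assuming both GPUs sit on the first platform; most error handling is omitted, and a hang like the one described would surface at the clFinish() call:

```c
// Sketch: enumerate GPU devices and build a context/queue on the second
// one, roughly the change needed in Apple's hello_c sample. Assumes at
// least two GPU devices on the first platform; real code should check
// every return value.
#include <stdio.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

int main(void) {
    cl_platform_id platform;
    cl_device_id devices[4];
    cl_uint ndev = 0;
    cl_int err;

    clGetPlatformIDs(1, &platform, NULL);
    err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 4, devices, &ndev);
    if (err != CL_SUCCESS || ndev < 2) {
        fprintf(stderr, "need two GPU devices (found %u)\n", ndev);
        return 1;
    }
    cl_device_id dev = devices[1];   // second GPU, cf. device 2 in cltorch

    char name[256];
    clGetDeviceInfo(dev, CL_DEVICE_NAME, sizeof(name), name, NULL);
    printf("Using device: %s\n", name);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, dev, 0, &err);
    clFinish(queue);                 // a wedged device would hang here
    printf("clFinish completed\n");

    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}
```

Build on OS X with something like cc hello.c -framework OpenCL; on Linux, cc hello.c -lOpenCL.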
Looking into the hello world program now. I like how complicated it is to select the offline GPU.
https://developer.apple.com/library/mac/technotes/tn2335/_index.html
Ok. You mean, if you're using the GPU to drive your display, you cannot select it for use with OpenCL?
I'm not sure, that could be the issue. This article refers to the Mac Pro, but I'm on a MacBook Pro. Let me try to run the cltorch script without an external display.
Great, it didn't freeze without the external monitor. This could definitely be the issue. But it didn't work either, haha. Here's the output:
pandaman$ th -l hello
Current device: 1
Using Apple platform: Apple
Using device: AMD Radeon R9 M370X Compute Engine
Current device: 2
0.0000 -2.0000
0.0000 -2.0000
0.0000 0.0000
[torch.ClTensor of size 3x2]
Abort trap: 6
Looks like the Tensor printed properly, but not its transpose? Also, I'm using gfxCardStatus to see which GPU is in use, and it doesn't seem to switch to the dedicated GPU when I run the script.
Hmm never mind, the dedicated GPU can't seem to work. The tests run fine on the integrated GPU, but it aborts when I run them on the dedicated GPU.
th> cltorch.getDevice()
2
[0.0001s]
th> cltorch.test()
running tests...
aftter requiring cltorch.unit_storage
Running 1 tests
| ==> test_basic
Using Apple platform: Apple
Using device: AMD Radeon R9 M370X Compute Engine
_ ==> Done Completed 11 asserts in 1 tests with 0 errors
--------------------------------------------------------------------------------
aftter requiring cltorch.unit_tensor
Running 91 tests
|__________________________________________________________________________________________ ==> outplace_div
left
1.1765 0.5882 -0.2941
0.9118 0.3529 1.4412
[torch.FloatTensor of size 2x3]
right
0.0000e+00 3.6893e+19 0.0000e+00
3.6893e+19 3.4513e-31 0.0000e+00
[torch.FloatTensor of size 2x3]
*|_________________________________________________________________________________________ ==> test_addcmul
Abort trap: 6
Now when I run the tests after manually switching to the dedicated GPU using gfxCardStatus, it freezes. This correlates with having an external monitor plugged in, because using an external monitor forces the computer to use the dedicated GPU.
So I'm pretty certain that it never runs any code on the dedicated GPU, which is confirmed by both gfxCardStatus and the Activity Monitor (see screenshot below, which shows the integrated GPU is in use).
I guess I'll start looking into the EasyCL tests as we discussed, will keep you updated.
Ok. Yes, it sounds like it's quite a low-level issue, i.e. drivers etc. Did you manage to get the helloworld.c running ok on the dedicated GPU? (Edit: I mean the C-program hello world, rather than the Lua version.)
I ran the hello world program on the integrated GPU, but I didn't get it to run on the dedicated GPU yet.
Ok. Until the helloworld.c program from the Mac website runs ok on the dedicated GPU, I don't think cltorch is going to get very far. Seems like some kind of low-level driver/configuration problem, right?
I agree, and so does this article, which says the GPU drivers on OS X are broken. No luck here; I can't use CUDA either, since it's an AMD GPU. I'm going to set up an Arch Linux dual boot on my machine; we may have more luck there.
It's apparently an AMD problem: on an old Mac with an NVIDIA GPU, I can run cltorch on the dedicated GPU whether the display is on the integrated or the dedicated one, no problem.
@hery, ok, so, do you mind if I close this issue? Seems like it is not related to cltorch itself right?
@szagoruyko Good info. Thanks!
@hughperkins Yea go ahead, thanks!
Hello! I'm having this issue where my computer freezes completely (besides the mouse) when running cltorch on a dedicated GPU, using a simple script to test my environment. The lines that are commented out are the ones that make everything freeze. Any clue what could be going on, or what I can do to get more descriptive logs? I am loading the script with require 'hello' in a th prompt, but it hangs even when running the commands individually.