Closed mw66 closed 7 years ago
Hi Mingwu,
Normally this is because the OpenCL drivers for that GPU have not been installed. Can you provide the output of clinfo, please? (You might need to run sudo apt-get install clinfo first.)
Indeed I need to run with 'sudo su', and then it can see 2 GPUs.
running require cltorch...
... require cltorch done
num devices: 2
device properties, device 1
deviceType GPU
localMemSizeKB 48
globalMemSizeMB 4095
deviceVersion OpenCL 1.2 CUDA
platformVendor NVIDIA Corporation
deviceName GeForce GTX 960
maxComputeUnits 8
globalMemCachelineSizeKB 0
openClCVersion OpenCL C 1.2
maxClockFrequency 1367
maxMemAllocSizeMB 1023
maxWorkGroupSize 1024
device properties, device 2
deviceType GPU
localMemSizeKB 32
globalMemSizeMB 4052
deviceVersion OpenCL 2.0 AMD-APP (1598.5)
platformVendor Advanced Micro Devices, Inc.
deviceName Hawaii
maxComputeUnits 40
globalMemCachelineSizeKB 0
openClCVersion OpenCL C 2.0
maxClockFrequency 947
maxMemAllocSizeMB 2867
maxWorkGroupSize 256
Using NVIDIA Corporation , OpenCL platform: NVIDIA CUDA
Using OpenCL device: GeForce GTX 960
c1 7 -4 5
[torch.ClTensor of size 3]
7 4 5
[torch.ClTensor of size 3]
Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Hawaii
c1 7 -4 5
[torch.ClTensor of size 3]
7 4 5
[torch.ClTensor of size 3]
Now my question is: why can I only run OpenCL on the R9 290 as 'su', but not as an ordinary user?
Is there any way I can fix this?
Thanks.
BTW, I'm 'ssh -X' into that machine, and running with 'sudo'.
Does it matter?
Yes, probably. GPU drivers in general tend to prefer being used from a desktop environment that is itself running on that GPU. As far as I know, AMD GPUs are no less picky in this respect. Unfortunately I don't have an AMD GPU to test with, and previous threads on the AMD list, e.g. https://github.com/amd/OpenCL-caffe/issues/13, never seemed to get resolved. I'm not really sure how to solve this, to be honest.
For NVIDIA, on Linux, it tends to be sufficient to just run with sudo.
For NVIDIA, on Windows, I find I have to use VNC to connect to the desktop (e.g. rdesktop doesn't work correctly; it inserts some other driver into the video stack somehow).
Maybe you could try VNC? The best thing would be to check with someone in support, but there are already a lot of AMD people in the thread I just linked, so ...?
Guess I have to use 'sudo su'.
The next question is how do I tell cltorch which GPU device to run?
> Guess I have to use 'sudo su'.
Cool. If that works, that's excellent info :-)
> The next question is how do I tell cltorch which GPU device to run?
In theory it should be like:
cltorch.setDevice(2)
... for the second gpu, or:
cltorch.setDevice(1)
... for the first one
Just a small comment, as I was also facing this problem.
First, running scripts with sudo on Linux is kind of unsafe.
I didn't test with NVIDIA or AMD graphics cards, but I had the same problem with Intel HD integrated graphics. The trick was to add the user to the "video" group:
sudo usermod -a -G video $LOGNAME
and then to close the SSH session and reconnect. After that, clinfo displays your GPU with no need for sudo.
Although this issue is a little old, I would like to add some notes. Hope this can help others.
In my case /dev/kfd is root:render instead of root:video:
crw-rw---- 1 root render 237, 0 Feb 26 15:15 /dev/kfd
So you would need to use sudo usermod -a -G render $LOGNAME instead.
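Since the owning group differs between machines (video in one comment, render in another), it may help to check which group actually owns the device nodes before running usermod. The following is only a rough sketch: check_node is a hypothetical helper name, the /dev/dri/* paths are an assumption not mentioned in this thread, and stat -c assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: print the group that owns a device node and
# whether the current user already belongs to that group.
check_node() {
  node="$1"
  if [ ! -e "$node" ]; then
    echo "$node: not present"
    return 0
  fi
  grp=$(stat -c '%G' "$node")   # owning group name (GNU stat)
  user=$(id -un)
  # id -nG lists the current process's group names, space-separated.
  if id -nG | tr ' ' '\n' | grep -qxF -- "$grp"; then
    echo "$node: group $grp (member)"
  else
    echo "$node: group $grp (not a member; try: sudo usermod -a -G $grp $user)"
  fi
  return 0
}

# /dev/kfd comes from the comment above; the /dev/dri nodes are an assumption.
for n in /dev/kfd /dev/dri/card* /dev/dri/renderD*; do
  check_node "$n"
done
```

As noted earlier in the thread, a new group membership only takes effect after logging out and reconnecting, so the "member" line will not appear until the SSH session is restarted.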
$ lspci
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii PRO [Radeon R9 290]
02:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)
$ luajit -l cltorch ./src/test/test-device.lua
running require cltorch...
... require cltorch done
num devices: 1
device properties, device 1
deviceType GPU
localMemSizeKB 48
globalMemSizeMB 4095
deviceVersion OpenCL 1.2 CUDA
platformVendor NVIDIA Corporation
deviceName GeForce GTX 960
maxComputeUnits 8
globalMemCachelineSizeKB 0
openClCVersion OpenCL C 1.2
maxClockFrequency 1367
maxMemAllocSizeMB 1023
maxWorkGroupSize 1024
Using NVIDIA Corporation , OpenCL platform: NVIDIA CUDA
Using OpenCL device: GeForce GTX 960
c1 7 -4 5
[torch.ClTensor of size 3]
7 4 5
[torch.ClTensor of size 3]
Thanks.