antonmks / Alenka

GPU database engine

ModernGPU initialisation fails in Alenka due to 'cudaErrorInvalidDeviceFunction' error #67

Closed Randolph42 closed 10 years ago

Randolph42 commented 10 years ago

Changes in Alenka in the last month have caused problems with the modernGPU initialisation (on Linux).

Although it's not very clear from the modernGPU code, the failure seems to occur in the method DeviceGroup::GetByOrdinal(int ordinal).

Unfortunately, the only symptom is the run-time error:

NOT COMPILED WITH COMPATIBLE PTX VERSION FOR DEVICE 0
This CUDA executable was not compiled with support for device 0 (sm_35)

I can't fathom what change in Alenka is causing this.

I have tried this with drivers 337.25, 3319.46 and 3319.32, and with the GTX 650 Ti and GTX 780 Ti GPUs.

My system is:
CUDA 6 (also tried 5.5)
ModernGPU from the 10th April 2014 (latest)
Linux dev 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
i7, 16GB RAM
GTX 780 Ti or GTX 650 Ti


kgdad commented 10 years ago

Randolph,

I haven't seen these issues with my setup:
CUDA 5.5
Latest ModernGPU
Linux 3.8.0-35-generic
GeForce GTX 660 Ti or GeForce GT 640

Rob

Randolph42 commented 10 years ago

Hmm, that suggests the difference is the Linux version. Is anyone else running on CentOS 6.4 or equivalent?


Randolph42 commented 10 years ago

Update: I've reinstalled MGPU and the GPU driver, compiled a fresh download from scratch, and applied all CentOS updates. No change.

Does anybody have the latest version of Alenka running on CentOS?

kgdad commented 10 years ago

I think all of our systems have Ubuntu installed. I believe we have some CentOS machines, but they do not have a compatible GPU.

Do you have multiple GPUs installed on your system? If so, do you have the CUDA_VISIBLE_DEVICES environment variable set to point to a specific GPU that perhaps isn't compatible with sm_35? Maybe you can try compiling Alenka with support for something lower like sm_20 to see if that helps.
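The suggestion above depends on which code images the binary actually embeds, so a small host-only sketch of the general CUDA compatibility rule may help. It assumes the documented behaviour that a native cubin runs only on devices with the same major compute capability and an equal or higher minor, while embedded PTX can be JIT-compiled for any device of equal or higher capability. The `CodeImage` struct and both function names are hypothetical and are not part of Alenka, ModernGPU, or the CUDA API.

```cpp
#include <vector>

// Hypothetical description of one code image embedded in a CUDA executable.
struct CodeImage {
    int cc_major;  // compute capability major, e.g. 2 for sm_20
    int cc_minor;  // compute capability minor, e.g. 0 for sm_20
    bool is_ptx;   // true: PTX (JIT-compilable), false: native cubin
};

// Returns true if this single image can be used on the given device.
bool image_runs_on(const CodeImage& img, int dev_major, int dev_minor) {
    const int img_cc = img.cc_major * 10 + img.cc_minor;
    const int dev_cc = dev_major * 10 + dev_minor;
    if (img.is_ptx) {
        // PTX can be compiled at load time for any equal or newer device.
        return dev_cc >= img_cc;
    }
    // A cubin is tied to its major version; only the minor may be newer.
    return img.cc_major == dev_major && dev_minor >= img.cc_minor;
}

// Returns true if at least one embedded image is usable on the device.
bool binary_supports_device(const std::vector<CodeImage>& images,
                            int dev_major, int dev_minor) {
    for (const CodeImage& img : images) {
        if (image_runs_on(img, dev_major, dev_minor)) {
            return true;
        }
    }
    return false;
}
```

Under this rule, a binary holding only an sm_20 cubin is rejected by an sm_35 card, whereas one that also carries compute_20 PTX is accepted. Whether a given Alenka build embeds PTX depends on the nvcc flags in its Makefile, which this thread does not show.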

Randolph42 commented 10 years ago

No, only one GPU at a time. I've tried using sm_20; it still recognises sm_35 or sm_30 during initialisation, depending on the GPU.

What is the earliest kernel you are running with Alenka?

R


kgdad commented 10 years ago

Strange. We are running successfully with the following configurations:

Machine 1:
OS: 3.8.0-35
GPU: GeForce GT 640 and/or GeForce GTX 660 Ti
Nvidia Driver Version: 319.37

Machine 2:
OS: 3.11.0-20
GPU: GeForce GTX Titan Black
Nvidia Driver Version: 331.62

Do you know what the compute capability of your card is?

Randolph42 commented 10 years ago

GTX 780 Ti = sm_35
GTX 650 Ti = sm_30

The difference is the OS. I'm running 2.6.32, which is the latest kernel on CentOS 6.5.

What flavours are your kernels? Ubuntu?

Red Hat 7.0 was released yesterday; CentOS will take a few more weeks.

Hence my question: is anybody running on earlier Linux kernels (pre-3.x)?
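To compare the configurations reported in this thread, it may help to reduce each kernel release string (as printed by `uname -r`) to its numeric version. The following is a minimal sketch only: `KernelVersion`, `parse_kernel_release` and `is_pre_3x` are hypothetical names, and the thread does not confirm that the kernel version is the root cause.

```cpp
#include <cstdio>
#include <string>

// Hypothetical holder for the numeric part of a kernel release string.
struct KernelVersion {
    int major_ver = 0;
    int minor_ver = 0;
    int patch_ver = 0;
    bool valid = false;  // false if the string did not start with "a.b.c"
};

// Parses the leading "major.minor.patch" of a release string such as
// "2.6.32-358.el6.x86_64" or "3.8.0-35-generic"; the suffix is ignored.
KernelVersion parse_kernel_release(const std::string& release) {
    KernelVersion v;
    const int fields = std::sscanf(release.c_str(), "%d.%d.%d",
                                   &v.major_ver, &v.minor_ver, &v.patch_ver);
    v.valid = (fields == 3);
    return v;
}

// True for kernels older than 3.0, i.e. the 2.6 series discussed here.
bool is_pre_3x(const KernelVersion& v) {
    return v.valid && v.major_ver < 3;
}
```

With the strings quoted in this thread, `2.6.32-358.el6.x86_64` is classified as pre-3.x, while `3.8.0-35-generic` and `3.11.0-20` are not.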


kgdad commented 10 years ago

Yes, we are using Ubuntu.

Randolph42 commented 10 years ago

I (upgraded?) to Ubuntu to get the later 3.11 kernel. This seems to have worked. As far as I can tell, Alenka no longer works with 2.6 Linux kernels.

Randolph42 commented 10 years ago

PS: can't wait for CentOS 7 to come out. I had a look at Oracle Linux 7, which is free, but it requires agreeing to sacrifice your firstborn etc... So I will have to use Ubuntu for a while.