tdulcet / Distributed-Computing-Scripts

🖧 Distributed Computing Scripts for GIMPS, BOINC and Folding@home
MIT License

GPU Exception on Colab #21

Closed BenjaminMichaelis closed 11 months ago

BenjaminMichaelis commented 11 months ago

I seem to have trouble running the GPU script on Colab. It looks like it may be an upstream issue in GpuOwl (a similar issue is here: https://github.com/preda/gpuowl/issues/223). The CPU script works great in the meantime! Any suggestions?

20231120 20:41:27  GpuOwl VERSION v7.2.1-8-g3567e66
20231120 20:41:27  GpuOwl VERSION v7.2.1-8-g3567e66
20231120 20:41:27  config: -user Axis4383 -cpu Colab-BProductions
20231120 20:41:27  config: -mprimeDir ../../mprime_gpu -log 100000 -unsafeMath -use ROE2 -device 0 -maxAlloc 972M -proof 9 
20231120 20:41:27  device 0, unique id ''
20231120 20:41:27 Colab-BProductions 120218741 FFT: 6.50M 1K:13:256 (17.64 bpw)
20231120 20:41:27 Colab-BProductions  Exception gpu_error:  clGetPlatformIDs(16, platforms, (unsigned *) &nPlatforms) at src/clwrap.cpp:71 getDeviceIDs
20231120 20:41:27 Colab-BProductions  Bye
^C
Gracefully exiting...
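
For anyone triaging a similar log, the failing call can be pulled out of the `Exception gpu_error` line with a few lines of Python. This is only a rough sketch: the regular expression is modelled on the single error line shown above, and the function name `find_gpu_error` is hypothetical, not part of GpuOwl or these scripts.

```python
import re

# Assumption: the error line has the shape seen in the log above, i.e.
#   "... Exception gpu_error:  <call>(<args>) at <file>:<line> <function>"
GPU_ERROR = re.compile(
    r"Exception gpu_error:\s+(?P<call>\w+)\(.*\)\s+at\s+"
    r"(?P<file>\S+):(?P<line>\d+)\s+(?P<func>\w+)"
)


def find_gpu_error(log_text):
    """Return details of the first gpu_error line in log_text, or None."""
    for line in log_text.splitlines():
        match = GPU_ERROR.search(line)
        if match:
            return {
                "call": match.group("call"),      # OpenCL call that failed
                "file": match.group("file"),      # GpuOwl source file
                "line": int(match.group("line")), # line number in that file
                "function": match.group("func"),  # enclosing GpuOwl function
            }
    return None
```

For the log above, the reported call is `clGetPlatformIDs`, which is the very first step of OpenCL platform discovery, so the failure happens before any GPU work starts.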

tdulcet commented 11 months ago

Yeah, it is not an issue with GpuOwl, but with OpenCL on Colab, which unfortunately is currently broken. See #18 for more information.

In the meantime, I would suggest temporarily using the previous version of our GPU notebook, which uses CUDALucas. See the note I recently added near the top of our Colab README: https://github.com/tdulcet/Distributed-Computing-Scripts/blob/eac3911950422ffa26ce50477904b2d7d074cf3d/google-colab/README.md?plain=1#L10
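
A quick way to confirm that the problem is on the OpenCL side rather than in GpuOwl is to check whether any OpenCL platform is visible at all before starting a run. The sketch below is an illustration, not something the notebooks do: it assumes the `clinfo` utility is installed in the runtime and that `clinfo -l` prints lines beginning with `Platform #` for each platform it finds. The helper names `has_platform` and `opencl_platforms_visible` are hypothetical.

```python
import shutil
import subprocess


def has_platform(listing):
    """True if the text looks like a `clinfo -l` listing with a platform.

    Assumption: each platform is listed on a line starting with "Platform #".
    """
    return any(
        line.lstrip().startswith("Platform #") for line in listing.splitlines()
    )


def opencl_platforms_visible(timeout=30):
    """Best-effort check for a usable OpenCL platform via `clinfo -l`."""
    clinfo = shutil.which("clinfo")
    if clinfo is None:
        # clinfo is not installed, so nothing can be confirmed either way.
        return False
    try:
        result = subprocess.run(
            [clinfo, "-l"], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and has_platform(result.stdout)


if __name__ == "__main__":
    if opencl_platforms_visible():
        print("An OpenCL platform is visible.")
    else:
        print("No OpenCL platform found; consider the CUDALucas notebook.")
```

If this reports no platform on a GPU runtime, GpuOwl would be expected to fail at `clGetPlatformIDs` as in the log, and the CUDALucas notebook is the workaround until OpenCL works again.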

I am glad to hear that our CPU notebook is working for you.

CC @Danc2050

BenjaminMichaelis commented 11 months ago

Ah, thank you! This explains it. I must have missed that issue.

I can close the issue now, unless you want me to keep it open for visibility.

tdulcet commented 11 months ago

No problem. I will fix the syntax of that note in my next commit, so that it shows up more prominently on the Colab README.

Hopefully Nvidia and/or Colab will fix this OpenCL issue soon...