snugel / cas-offinder

An ultrafast and versatile algorithm that searches for potential off-target sites of CRISPR/Cas-derived RNA-guided endonucleases.

Segmentation fault in GPU mode on Mac #2

Closed harijay closed 10 years ago

harijay commented 10 years ago

Hi, thank you for this excellent application. I was able to get it to work on Linux without a problem on CPU and GPU. However, when I tried your supplied binary and a self-compiled binary on Mac OS X 10.9.2 (64-bit) with an Iris 1024 OpenCL-supported GPU, the program gives a "Segmentation fault" as soon as it starts. On the Mac, the CPU mode works fine but takes a while.

Here is what happens when I run it:

Haris-MacBook-Pro:~ hari$ ./cas-offinder_mac testinput.txt G cas_gpu_official
Segmentation fault: 11

The debugger outputs:

(lldb) run
Process 7356 launched: './cas-offinder_mac' (x86_64)
Process 7356 stopped

I also compiled my own binary from source, and that too segfaults on the same MacBook. Apple's web page suggests that the MacBook Pro (Retina, 13-inch, Late 2013) with Intel Iris Graphics supports OpenCL 1.2. I was also able to run Apple's demo OpenCL application on the same hardware.

Any ideas on how to get GPU mode working on Mac? Thanks, Hari

snugel commented 10 years ago

Thank you for your report.

Recently we noticed that similar problems also occur on some Intel Graphics platforms (on Windows and Linux).

We are not fully sure, but we suspect that OpenCL memory management on Intel Graphics platforms is still immature for this kind of algorithm, which needs to allocate an unusually large amount of memory. We couldn't test it on OS X because we don't have any Mac with Intel Graphics, but we think that what you encountered is the same problem.

We are working on a workaround for the issue and hope to have it fixed soon.

Jeongbin

harijay commented 10 years ago

Thank you Jeongbin for your excellent program and timely help.

I have been trying to use your excellent program on a GPU instance on the Amazon EC2 compute cloud with an NVIDIA K520 GRID GPU, with NVIDIA drivers and OpenCL support installed.

I am seeing that for 2 guides it takes around 2 minutes to calculate the off-targets. For 400 guides it takes around 18 to 30 minutes on the single K520 GRID GPU.

Are you running the program on a GPU cluster, and if so, how many nodes are you using to run your calculations?

I am new to OpenCL and am surprised that calculating off-targets for 400 guides took so long, but then I may be doing something wrong.

Thank you for your help

Hari

snugel commented 10 years ago

Yes, we also tried using Cas-OFFinder in a cluster environment named Chundoong (http://chundoong.snu.ac.kr), run by a group at SNU, with 4x AMD Radeon HD 7970 installed per node.

On the Chundoong cluster, it took about 0.7 sec per target, which is about twice as slow as our local environment with 2x AMD Radeon HD 7870. I think that's because I/O time, which is usually slower than in a local environment, affects the total computation time.

Unfortunately, the K520 is not as good as the HD 7970. The K520 is actually aimed at cloud gaming and is not the best fit for this kind of computation. Here is a comparison table: https://compubench.com/compare.jsp?benchmark=clb11&config_0=17956951&config_1=11905561

But still, your calculation speed is much slower than I would expect.
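
To put rough numbers on that expectation, here is a back-of-the-envelope sketch in Python. It assumes run time scales linearly with the number of targets (an assumption, not something measured in this thread) and reuses the 0.7 s per target seen on Chundoong and the 18 to 30 minutes reported for 400 guides on the K520. All names are illustrative.

```python
# Rough estimate only: assumes run time grows linearly with the number of targets.
CLUSTER_SEC_PER_TARGET = 0.7          # reported above for the Chundoong cluster
GUIDES = 400                          # number of guides in the slow K520 run
OBSERVED_SEC = (18 * 60, 30 * 60)     # 18 to 30 minutes reported on the K520


def expected_seconds(guides: int, sec_per_target: float) -> float:
    """Projected wall time if every target costs the same."""
    return guides * sec_per_target


def slowdown(observed_sec: float, expected_sec: float) -> float:
    """How many times slower the observed run is than the projection."""
    return observed_sec / expected_sec


expected = expected_seconds(GUIDES, CLUSTER_SEC_PER_TARGET)
factors = [slowdown(o, expected) for o in OBSERVED_SEC]

print(f"projected: {expected:.0f} s")
print("slowdown: " + ", ".join(f"{f:.1f}x" for f in factors))
```

Under that assumption the projection is about 280 s, so the K520 run is roughly 4 to 6 times slower than even the cluster figure.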

Maybe your K520 does not report its maximum allocatable memory properly. Some NVIDIA cards report 1/4 of the maximum value, even though they can actually allocate much more. Please try setting the variable manually: comment out line 275 of main.cpp, and set the proper amount of memory (in bytes) at line 274, instead of '0'. The onboard RAM on the K520 is 4 GB per GPU, so you can safely set it to 3.5 GB and check the result again.
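
For reference, the byte counts involved can be worked out as below. This is only arithmetic, assuming "GB" here means 1024^3 bytes; the quarter figure illustrates the under-reporting described above and is not a value read from a real K520.

```python
GIB = 1024 ** 3  # bytes in one GiB (assumption: "GB" above means 2**30 bytes)


def gib_to_bytes(gib: float) -> int:
    """Convert a size in GiB to a whole number of bytes."""
    return int(gib * GIB)


onboard = gib_to_bytes(4.0)        # RAM per GPU on the K520, as stated above
manual_limit = gib_to_bytes(3.5)   # the "safe" value suggested above
quarter = onboard // 4             # what a card reporting 1/4 would claim

print(manual_limit)  # 3758096384
print(quarter)       # 1073741824
```

The first number is the value that would replace '0' in main.cpp.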

Jeongbin

snugel commented 10 years ago

Adding an important fact!

I just added a 'Release build' flag to CMakeLists.txt, which enables full optimization during compilation. I hadn't noticed it was missing because until now I've used my own build script instead of running CMake. Please download and replace CMakeLists.txt, and compile again.

Thank you again for your report.

Jeongbin

harijay commented 10 years ago

Thanks, Jeongbin.

I was using a self-compiled version (built with cmake off commit 1f4907... ) for the K520 test, so the missing "Release build" flag you just added probably contributed to part of the slowness.

I am also looking to see if Amazon has some ATI HD 7970 based machines.

I will recompile with the newest commit and let you know how fast that runs on both the K520 and an ATI HD 7970, if I find one.

Thank you so much for your help

Hari

harijay commented 10 years ago

Hi Jeongbin,

Sorry to bother you, but the self-built cas-offinder is not working. It keeps saying:

[ec2-user@ip-172-31-32-198 cas-offinder]$ ./cas-offinder
No OpenCL platforms found. Check OpenCL installation!

Here is what I did:

git clone https://github.com/snugel/cas-offinder.git
cd cas-offinder

The next step complains that it did not find "cl.hpp", which I fixed by adding the ext directory to the include path.

[ec2-user@ip-172-31-32-198 cas-offinder]$ cmake -DOPENCL_LIBRARY=/usr/lib64/libOpenCL.so -DOPENCL_INCLUDE_DIR=/usr/include .
-- The C compiler identification is GNU 4.8.2
-- The CXX compiler identification is GNU 4.8.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Looking for include file dirent.h
-- Looking for include file dirent.h - found
-- Looking for include file CL/cl.hpp
-- Looking for include file CL/cl.hpp - not found
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ec2-user/cas-offinder

[ec2-user@ip-172-31-32-198 cas-offinder]$ cmake -DOPENCL_LIBRARY=/usr/lib64/libOpenCL.so -DOPENCL_INCLUDE_DIR=/usr/include:./ext/ .
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ec2-user/cas-offinder

[ec2-user@ip-172-31-32-198 cas-offinder]$ make
Scanning dependencies of target cas-offinder
[100%] Building CXX object CMakeFiles/cas-offinder.dir/main.cpp.o
/home/ec2-user/cas-offinder/main.cpp: In member function ‘void Genos::initOpenCL(cl_device_type)’:
/home/ec2-user/cas-offinder/main.cpp:261:4: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
 "}";
 ^
Linking CXX executable cas-offinder
[100%] Built target cas-offinder

Now when I run cas-offinder, it says no OpenCL platforms were found:

[ec2-user@ip-172-31-32-198 cas-offinder]$ ./cas-offinder
No OpenCL platforms found. Check OpenCL installation!

Thank you for your continued help,
Hari

snugel commented 10 years ago

That is strange; I can't reproduce this problem on my side. Is the pre-built Cas-OFFinder, which was working properly before, still working now in the same environment?

Jeongbin

harijay commented 10 years ago

Hi Jeongbin, sorry to bother you with these issues. Instead of the usual startup message, I always got the message "No OpenCL platforms found. Check OpenCL installation!" with my compiled version of the most recent commit on the K520 GRID GPU on Amazon EC2.

I then checked out a previous commit (1f4907c) of cas-offinder, added the newest CMakeLists.txt to it, and recompiled. Despite these changes, the run times are still very slow (~100 s for two guides).

I then tried editing line 274 to specify the memory explicitly in bytes:

MAX_ALLOC_MEMORY.push_back(3758096384);
/*devices[i].getInfo(CL_DEVICE_MAX_MEM_ALLOC_SIZE, &MAX_ALLOC_MEMORY[i]);*/

But that did not improve the speed either.

Finally, I tried the official binary that you have on SourceForge on the same K520 machine. I get the following error (see below), which seems to imply that NVIDIA does not support OpenCL 1.2, only OpenCL 1.1. Could that be why everything is so different?

Thank you for your help

Hari

[ec2-user@ip-172-31-32-198 ~]$ ~/official_cas/cas-offinder-linux small_tmpin.txt G off1_out.txt
/home/ec2-user/official_cas/cas-offinder-linux: /usr/lib64/libOpenCL.so.1: no version information available (required by /home/ec2-user/official_cas/cas-offinder-linux)
/home/ec2-user/official_cas/cas-offinder-linux: /usr/lib64/libOpenCL.so.1: no version information available (required by /home/ec2-user/official_cas/cas-offinder-linux)
/home/ec2-user/official_cas/cas-offinder-linux: relocation error: /home/ec2-user/official_cas/cas-offinder-linux: symbol clRetainDevice, version OPENCL_1.2 not defined in file libOpenCL.so.1 with link time reference
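
The relocation error above says that the installed libOpenCL.so.1 does not provide clRetainDevice, an entry point introduced in OpenCL 1.2. One quick way to check a given library for that symbol is sketched below with Python's ctypes. The helper name is made up for illustration, and the check only looks at the symbol name, not the OPENCL_1.2 version tag that the dynamic linker also asks for.

```python
import ctypes
from typing import Optional


def exports_symbol(library: Optional[str], symbol: str) -> bool:
    """Return True if the library loads and exports symbol.

    Passing None inspects the symbols already loaded into this process (POSIX only).
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return False  # library missing or not loadable
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Library path taken from the error message above.
    path = "/usr/lib64/libOpenCL.so.1"
    for name in ("clGetPlatformIDs", "clRetainDevice"):
        print(name, exports_symbol(path, name))
```

On a library that only implements OpenCL 1.1, one would expect clGetPlatformIDs to be present and clRetainDevice to be missing.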

harijay commented 10 years ago

Great news, and thank you for all your help. I was able to get off-targets for 400 RGENs analyzed in 72 seconds on a machine similar to the one you described, with dual ATI Radeon 7870 GPUs.

I recompiled cas-offinder from the most recent GitHub commit, using the AMD-supplied "linux-amd-catalyst-14.4-rc-v1.0-apr17".

I also downloaded and installed the AMD-supplied "AMD-APP-SDK-v2.9-lnx64.tgz".

Finally, I built cas-offinder making sure I used the ATI-supplied *.so and include files. (I had some earlier libOpenCL.so files in /usr/lib64, but the timestamps suggested that they were probably from an NVIDIA driver I had used before.)

cmake -DOPENCL_LIBRARY=/usr/lib/libOpenCL.so -DOPENCL_INCLUDE_DIR=/opt/AMDAPP/include

After the compile, the search for 1 guide took a mere 50 seconds and 400 guides took 72 seconds. So it seems that the ATI drivers and OpenCL 1.2 support are more up to speed.
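
Those two timings suggest that most of the run is fixed work rather than per-guide search. A minimal sketch of that reasoning, assuming total time is a straight line (a fixed cost plus a constant cost per guide) through the two measurements; with only two data points this is an illustration, not a benchmark.

```python
from typing import Tuple


def fit_line(p1: Tuple[float, float], p2: Tuple[float, float]) -> Tuple[float, float]:
    """Fit t = fixed + per_guide * n through two (n, t) points."""
    (n1, t1), (n2, t2) = p1, p2
    per_guide = (t2 - t1) / (n2 - n1)
    fixed = t1 - per_guide * n1
    return fixed, per_guide


# (number of guides, seconds) as reported above for the dual Radeon 7870 machine
fixed, per_guide = fit_line((1, 50.0), (400, 72.0))

print(f"fixed cost: {fixed:.1f} s")
print(f"per guide:  {per_guide * 1000:.0f} ms")
```

Under this model the fixed cost comes out at about 50 s and each additional guide at about 55 ms.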

Thanks again for all your help Hari

snugel commented 10 years ago

That's great. I'll close this thread.

Jeongbin

harijay commented 9 years ago

Hi Sangsu Bae, I was wondering if you could recommend a good GPU alternative to the AMD 7870 for running cas-offinder. The AMD 7870s are getting hard to find. I have looked at the OpenCL benchmarking sites for alternative GPUs, but I don't know enough about OpenCL to tell which benchmark most closely models the operations in cas-offinder.

If you could recommend an alternative GPU to the 7870 that would be great.

Thanks Hari

On Sun Apr 27 2014 at 2:34:35 AM snugel notifications@github.com wrote:

Closed #2 https://github.com/snugel/cas-offinder/issues/2.

pjb7687 commented 9 years ago

On behalf of Sangsu Bae,

Any AMD GPU better than the 7870 would be okay. We are currently using an AMD R9 290X and it works great.

Jeongbin