clMathLibraries / clBLAS

a software library containing BLAS functions written in OpenCL
Apache License 2.0

Tuner crashes on Intel Iris on OS X #128

Open CNugteren opened 9 years ago

CNugteren commented 9 years ago

The tuner doesn't seem to work for level-3 BLAS routines on the Intel Iris GPU on OS X. This is the behaviour I see:

$ tune --gemm --float
GEMM is being tuned, progress:  0.02% Abort trap: 6
$ tune --syrk --float
SYRK is being tuned, progress:  0.20% Abort trap: 6
$ tune --gemv --float
GEMV is being tuned, progress:  2.34% (continues until completion)

This is the trace I obtain with lldb:

  * frame #0: 0x00007fff8cd68286 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff92b1d42f libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fff958bfb53 libsystem_c.dylib`abort + 129
    frame #3: 0x00007fff98db7b81 libGPUSupportMercury.dylib`gpusGenerateCrashLog + 173
    frame #4: 0x000012340044249f AppleIntelHD5000GraphicsGLDriver`gpusKillClient + 9
    frame #5: 0x00007fff98db8538 libGPUSupportMercury.dylib`gpusQueueSubmitDataBuffers + 170
    frame #6: 0x0000123400497909 AppleIntelHD5000GraphicsGLDriver`IntelCLCommandBuffer::getNew(GLDQueueRec*) + 33
    frame #7: 0x0000123400495817 AppleIntelHD5000GraphicsGLDriver`intelSubmitCLCommands(GLDQueueRec*, unsigned int) + 65
    frame #8: 0x00001234004ab799 AppleIntelHD5000GraphicsGLDriver`CHAL_INTEL::ChalContext::ChalFlush() + 83
    frame #9: 0x0000123400497bde AppleIntelHD5000GraphicsGLDriver`gldFlushQueue + 37
    frame #10: 0x00007fff953804f0 OpenCL`___lldb_unnamed_function82$$OpenCL + 46
    frame #11: 0x00007fff9539d276 OpenCL`clFlush + 170
    frame #12: 0x000000010000272c tune`runKernel + 540
    frame #13: 0x000000010000447f tune`runAllKernel + 1167
    frame #14: 0x000000010000660b tune`createFile + 3451
    frame #15: 0x0000000100007250 tune`main + 256
    frame #16: 0x00007fff8c1ff5c9 libdyld.dylib`start + 1

I tried with both the 2.2.0 release and the current development version (2.7.0). In both cases the library is compiled with LLVM and everything else (apart from the tuner) works fine. I am running on a MacBook Pro 13-inch (Late 2013) with an Intel Core i5 and an Intel Iris GPU on OS X 10.10.4.

A possibly related question: the tuner only creates an Iris.kdb file and doesn't tune for the CPU. I know that the Apple OpenCL drivers are weird for CPUs, but how does the tuner know not to tune for the CPU (OpenCL device 0) and to choose the GPU (device 1) instead?
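
To make the device-selection question concrete, here is a minimal sketch of one way a tool could end up on device 1: by filtering on device type rather than taking the first entry. It deliberately avoids the OpenCL API so it compiles anywhere. `dev_kind`, `dev_entry`, and `first_gpu_index` are hypothetical names made up for this illustration, and whether the clBLAS tuner actually selects devices this way is an assumption, not something confirmed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL device entry; not a clBLAS or OpenCL type. */
typedef enum { DEV_CPU, DEV_GPU } dev_kind;

typedef struct {
    const char *name;  /* human-readable device name */
    dev_kind    kind;  /* CPU or GPU */
} dev_entry;

/* Return the index of the first GPU in the list, or -1 if there is none.
 * With a list ordered {CPU, GPU}, this skips device 0 and yields 1. */
static int first_gpu_index(const dev_entry *devs, size_t count)
{
    assert(devs != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (devs[i].kind == DEV_GPU) {
            return (int)i;
        }
    }
    return -1;
}
```

On a machine that lists the CPU first and the Iris second, a filter like this would return index 1 and never touch the CPU, which would match an Iris.kdb being the only file created.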