ganyc717 / Darknet-On-OpenCL

Darknet On OpenCL
MIT License

OpenCL error -13 on Ubuntu 14.04 and beignet #6

Open OliverUrbann opened 6 years ago

OliverUrbann commented 6 years ago

I used the current clBLAS because the provided .so leads to many linker errors. darknet_cl now compiles, but executing

build/darknet_cl detect cfg/yolo.cfg yolo.weights data/dog.jpg

leads to some warnings and -13:

31 detection
mask_scale: Using default '1.000000'
Beignet: "unable to find good values for local_work_size[i], please provide\n" " local_work_size[] explicitly, you can find good values with\n" " trial-and-error method."
Loading weights from yolo.weights...Done!
im2col_kernels.cl build log:
stringInput.cl:36:18: warning: '/*' within block comment

activation_kernels.cl build log: stringInput.cl:21:12: warning: double precision constant requires cl_khr_fp64, casting to single precision

opencl execution error, code -13
-13
terminate called after throwing an instance of 'std::runtime_error'
what(): OpenCL error, code: -13
[1] 9856 abort (core dumped) build/darknet_cl detect cfg/yolo.cfg yolo.weights data/dog.jpg

ganyc717 commented 6 years ago

Hi @OliverUrbann Error code -13 means CL_MISALIGNED_SUB_BUFFER_OFFSET: when you create a sub-buffer from a GPU buffer, there are limitations on the offset alignment. This is a hardware limitation and depends on the GPU vendor. Currently I have no good way to resolve this issue, as it would increase the complexity of the data layout in memory. Anyway, I have only met this problem when running OpenCL on a CPU. Best Regards!

OliverUrbann commented 6 years ago

I tried it on an Atom J1900 with a z3700 GPU in an embedded system. We cannot use other GPUs, so the OpenCL port would be a great way to run YOLO on that system. Anyway, thx for the explanation.

mhahn0106 commented 6 years ago

@ganyc717 I managed to compile and run darknet_cl on an AMD Radeon GPU by adding some memory alignment code, but I am getting wrong detection results compared with the CPU version. I hope you can explain the design of CLArray. Can you leave some notes on it? To my understanding, it seems to make slices of a given large contiguous memory block. If we can focus on contiguity, how about using SVM (Shared Virtual Memory) from OpenCL 2.0?

ganyc717 commented 6 years ago

Hi @mhahn0106 CLArray is an ugly design, as I had no idea how to do random access into a cl_mem buffer in CL 1.2. I am not very familiar with the new OpenCL 2.0 APIs. Thank you for your idea; I will check the spec and follow up on it later. Best Regards!

ganyc717 commented 6 years ago

Hi @OliverUrbann @mhahn0106 I have created a new branch, enable-shared-virtual-memory, which uses shared virtual memory instead of cl_mem. I tested it on an Intel(R) Core(TM) i5-4570 CPU with OpenCL and did not hit CL_MISALIGNED_SUB_BUFFER_OFFSET again. I kept the CLArray design because the clBLAS APIs need cl_mem instead of SVM pointers. Best Regards!

mhahn0106 commented 6 years ago

Awesome. Looks good. Will test on my box and let you know.

OliverUrbann commented 6 years ago

That's great, thx a lot for your efforts.

However:

-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - found
-- Found OpenCL: /usr/lib/x86_64-linux-gnu/libOpenCL.so (found version "2.0")

So I have OpenCL 2.0, but I get:

gemm.cpp:(.text+0x77f): undefined reference to `clEnqueueSVMMap'
gemm.cpp:(.text+0x7c4): undefined reference to `clEnqueueSVMMap'
gemm.cpp:(.text+0x808): undefined reference to `clEnqueueSVMMap'
gemm.cpp:(.text+0x9d9): undefined reference to `clEnqueueSVMUnmap'
gemm.cpp:(.text+0x9fd): undefined reference to `clEnqueueSVMUnmap'
gemm.cpp:(.text+0xa21): undefined reference to `clEnqueueSVMUnmap'

(The linker output was originally in German; "Nicht definierter Verweis auf" means "undefined reference to".) I'm a bit confused. According to the Beignet documentation, OpenCL 2.0 is only supported on Skylake and up, so that would explain this, but not why OpenCL 2.0 is found.
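One possible way to narrow this down: the CMake lines look like a check for the CL_VERSION_2_0 macro in the headers, which by itself would not guarantee that the installed library or the device implements the 2.0 entry points such as clEnqueueSVMMap. A runtime check of the string returned for CL_DEVICE_VERSION is more telling; the specification gives it the form "OpenCL <major>.<minor> <vendor-specific information>". The sketch below only parses such a string and does not call OpenCL. The names `ClVersion`, `parse_cl_version` and `may_have_svm_api` are hypothetical, and the sample string in the usage note is an assumed value, not output from this system.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper type, not part of Darknet-On-OpenCL.
struct ClVersion {
    int major;
    int minor;
};

// Parses a CL_DEVICE_VERSION-style string of the form
// "OpenCL <major>.<minor> <vendor-specific information>".
// Returns {0, 0} when the string does not match that form.
inline ClVersion parse_cl_version(const std::string& text) {
    ClVersion v{0, 0};
    if (std::sscanf(text.c_str(), "OpenCL %d.%d", &v.major, &v.minor) != 2) {
        v = ClVersion{0, 0};  // discard any partially parsed value
    }
    return v;
}

// Simplification: the SVM entry points were introduced in OpenCL 2.0.
// A real check should also query the device's SVM capabilities.
inline bool may_have_svm_api(const ClVersion& v) {
    return v.major >= 2;
}
```

For example, `may_have_svm_api(parse_cl_version("OpenCL 1.2 beignet 1.3"))` would be false, which would be consistent with the missing symbols even though the headers advertise 2.0.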

soulslicer commented 6 years ago

@ganyc717

"CL_MISALIGNED_SUB_BUFFER_OFFSET, means when you create a sub buffer from a GPU buffer, there is limitations on the offset alignment."

Can you explain to me in more detail what this means? What do you mean by limitations on offset alignment? Can you provide a simple example?

ganyc717 commented 6 years ago

@soulslicer You have a cl_mem buffer and want to create a sub-buffer from it, but that does not mean an arbitrary offset is acceptable. You can refer to https://www.khronos.org/registry/OpenCL/sdk/1.1/docs/man/xhtml/clCreateSubBuffer.html
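As a rough illustration of the rule on that page: clCreateSubBuffer reports CL_MISALIGNED_SUB_BUFFER_OFFSET when the sub-buffer origin is not aligned to the device's CL_DEVICE_MEM_BASE_ADDR_ALIGN value, which the specification expresses in bits. The sketch below shows only the arithmetic and does not call OpenCL. The helper names are hypothetical and not part of this repository, and the 1024-bit alignment in the usage note is an assumed example; the real value has to be queried from the device.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, not part of Darknet-On-OpenCL.
// align_bits stands for the value a device reports for
// CL_DEVICE_MEM_BASE_ADDR_ALIGN, which is given in bits.

// True when a sub-buffer origin (in bytes) satisfies the alignment.
inline bool is_valid_sub_buffer_origin(std::size_t origin_bytes,
                                       std::size_t align_bits) {
    assert(align_bits >= 8 && align_bits % 8 == 0);
    const std::size_t align_bytes = align_bits / 8;
    return origin_bytes % align_bytes == 0;
}

// Rounds an origin up to the next offset that satisfies the alignment.
inline std::size_t round_up_origin(std::size_t origin_bytes,
                                   std::size_t align_bits) {
    assert(align_bits >= 8 && align_bits % 8 == 0);
    const std::size_t align_bytes = align_bits / 8;
    return (origin_bytes + align_bytes - 1) / align_bytes * align_bytes;
}
```

With an assumed alignment of 1024 bits (128 bytes), an origin of 100 bytes would be rejected, while `round_up_origin(100, 1024)` gives 128, which is acceptable. Padding every slice this way is what makes the data layout more complicated.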

chenxian9999 commented 6 years ago

@ganyc717 Thx a lot for your work. I ran branch master on Win10 with an AMD GPU and received error -13. Then I ran branch test-enable-shared-virtual-memory and received error -14 in void fill_gpu(int N, float ALPHA, CLArray X, int INCX). So what should I do next?

ganyc717 commented 6 years ago

@chenxian9999 I enabled shared virtual memory on different platforms; some work, but some fail with error -14. I will find out why this error occurs.
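For reading these numeric codes, a small lookup can help. The sketch below maps a few OpenCL error codes to the names defined in CL/cl.h; it is deliberately incomplete, and `cl_error_name` is a hypothetical helper, not a function from this repository or from OpenCL. Code -14 is CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST, which suggests that a command the failing call waited on had itself failed, so the root cause may lie in an earlier enqueue.

```cpp
#include <string>

// Hypothetical helper: names for a subset of the OpenCL error codes
// defined in CL/cl.h. Codes not listed are reported as "UNKNOWN".
inline std::string cl_error_name(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -12: return "CL_MAP_FAILURE";
        case -13: return "CL_MISALIGNED_SUB_BUFFER_OFFSET";
        case -14: return "CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST";
        default:  return "UNKNOWN";
    }
}
```

Printing `cl_error_name(code)` next to the number in the error handler would make reports like the ones in this thread easier to compare.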

chenxian9999 commented 6 years ago

Thx a lot. Looking forward to your improvement.

aaronsta1 commented 6 years ago

I too am getting error -13. Has there been a fix yet? I did get this one to compile, but it's an old build: https://github.com/myestro/darknet

im2col_kernels.cl build log:
/tmp/OCL2394T5.cl:36:18: warning: '/*' within block comment
//data_col_ptr = data_im_ptr[ii * width + jj];
^
1 warning generated.

opencl execution error, code -13
-13
terminate called after throwing an instance of 'std::runtime_error'
what(): OpenCL error, code: -13