naibaf7 / caffe

Caffe: a fast open framework for deep learning. With OpenCL and CUDA support.
http://caffe.berkeleyvision.org/

Could not find kernel fillbuffer_float #25

Closed psyhtest closed 8 years ago

psyhtest commented 8 years ago

Hi @naibaf7,

I've encountered the following issue when running the 00-classification.ipynb example on the opencl branch of BVLC/caffe:

I0330 20:51:48.994209 25161 common.cpp:475] OpenCL platform: ARM: OpenCL 1.1  does not work correctly.
ViennaCL: FATAL ERROR: Could not find kernel 'fillbuffer_float' from program ''
Number of kernels in program: 0
std::exception

I suspect this may be coming from src/caffe/greentea/greentea_math_functions.cpp:

void greentea_memset(const int_tp ctx_id, const uint_tp N, const int_tp alpha,
                     cl_mem X, const int_tp offX) {
  viennacl::ocl::context &ctx = viennacl::ocl::get_context(ctx_id);
  viennacl::ocl::program &program = (Caffe::Get().GetDevice(ctx_id, false))
      ->program();

  // OpenCL Version >= 1.2 approach
  // clEnqueueFillBuffer(ctx.get_queue().handle().get(),
  //  X, &alpha, sizeof(int_tp),
  //                     offX, N, 0, NULL, NULL);
  // OpenCL Version < 1.2 fallback
  typedef float Dtype;
  viennacl::ocl::kernel &oclk_fill = program.get_kernel(
      CL_KERNEL_SELECT("fillbuffer"));
  viennacl::ocl::enqueue(
      oclk_fill(static_cast<int_tp>(N), static_cast<unsigned char>(alpha),
                WrapHandle(X, &ctx), offX),
      ctx.get_queue());
}

I can't see CL_KERNEL_SELECT defined anywhere, so I assume it just vanishes, leaving "fillbuffer" as the kernel name. However, greentea/cl_kernels/fillbuffer.cl defines:

__kernel void TEMPLATE(fillbuffer,Dtype) ...
__kernel void TEMPLATE(fill,Dtype) ...

So should greentea_memset(...) above use "fillbuffer_float" rather than "fillbuffer" as the kernel name?

Many thanks!

naibaf7 commented 8 years ago

First, I notice the message "OpenCL platform: ARM: OpenCL 1.1 does not work correctly." This suggests that the platform wasn't able to compile one or more of the kernels. The same goes for the message "Number of kernels in program: 0".

Could you run the following test to figure out which kernels your OpenCL implementation struggles with? It needs to be compiled with "make runtest" first.

./build/test/test_all.testbin --gtest_filter=*OpenCLKernelCompileTest* 0

or

./build/test/test_all.testbin --gtest_filter=*OpenCLKernelCompileTest* 1

depending on which device to use (if multiple are present).

We basically need to find out where the compile fails to provide a platform fix for your OpenCL. Sadly, the code often compiles on most platforms but then fails on some specific versions.

naibaf7 commented 8 years ago

@psyhtest Let me know if the issue persists with the latest version.

psyhtest commented 8 years ago

@naibaf7 This issue must have been related to https://github.com/naibaf7/caffe/issues/25. With that fix, I no longer see Number of kernels in program: 0. I still get std::exception from the 00-classification.ipynb example but I'll investigate where that happens separately. Many thanks.