gpu / JOCL

Java bindings for OpenCL
http://www.jocl.org

opencl speed compared with c++ #20

Closed zcaudate closed 6 years ago

zcaudate commented 6 years ago

I'm curious to know if there are any benchmarks for operations in jocl as compared to using c++.

I've done my own benchmarking with the OpenCV ocl library and jocl on a custom image algorithm and found roughly an order of magnitude difference between jocl and native ocl:

jocl, 0250px: 2.467 ms
jocl, 0500px: 6.367 ms
jocl, 1000px: 24.060 ms
jocl, 2000px: 81.356 ms
jocl, 4000px: 287.928 ms
jocl, 8000px: 1031.693 ms

cv::ocl, 0250px: 0.475 ms
cv::ocl, 0500px: 0.783 ms
cv::ocl, 1000px: 1.632 ms
cv::ocl, 2000px: 4.555 ms
cv::ocl, 4000px: 15.846 ms
cv::ocl, 8000px: 121.899 ms

Is there any reason for this?

gpu commented 6 years ago

Sorry for the delay here. I'm not entirely sure where this difference might come from, but I would be curious about more details of the benchmark. Would it be possible to share some code (at least the JOCL-based part)? Or is the issue resolved now?

zcaudate commented 6 years ago

@gpu: this issue is related to workgroup size and is being looked at here: https://github.com/gpu/JOCL/issues/21

The cv::ocl version used a local work size of [8 8], while the jocl version used [1 1].
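For readers comparing the two settings, here is a minimal sketch in plain Java (no JOCL or OpenCL device needed) of how the local work size changes the number of work-groups in a 2D launch. The class and method names (`WorkSizeSketch`, `roundUpGlobal`, `workGroupCount`) are hypothetical and are not part of JOCL or of the benchmark above. It assumes the common convention of rounding the global size up to a multiple of the local size. In JOCL, the local size is the `local_work_size` array passed to `clEnqueueNDRangeKernel`.

```java
import java.util.Arrays;

// Hypothetical helper, not part of JOCL: illustrates work-group arithmetic only.
final class WorkSizeSketch {

    // Round each global dimension up to the next multiple of the local size,
    // so that every work-group is completely filled.
    static long[] roundUpGlobal(long[] size, long[] local) {
        long[] out = new long[size.length];
        for (int i = 0; i < size.length; i++) {
            out[i] = ((size[i] + local[i] - 1) / local[i]) * local[i];
        }
        return out;
    }

    // Number of work-groups for a global size that is already
    // a multiple of the local size in every dimension.
    static long workGroupCount(long[] global, long[] local) {
        long count = 1;
        for (int i = 0; i < global.length; i++) {
            count *= global[i] / local[i];
        }
        return count;
    }

    public static void main(String[] args) {
        long[] image = {1000, 1000};          // assumed image size in pixels
        long[][] locals = {{1, 1}, {8, 8}};   // the two local sizes from the thread
        for (long[] local : locals) {
            long[] global = roundUpGlobal(image, local);
            System.out.println("local " + Arrays.toString(local)
                    + " -> global " + Arrays.toString(global)
                    + ", work-groups: " + workGroupCount(global, local));
        }
    }
}
```

With a local size of [1 1], every work-group holds a single work-item, so the same image is split into 64 times as many work-groups as with [8 8]. That typically leaves most of each compute unit idle, which is one plausible reason for the gap in the timings, although the thread itself only says the issue is related to work-group size.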