Can't get the openCL acceleration to work

smjohnson-bi commented 5 years ago

I built a project with the latest V-HACD code and was trying to speed up hull generation by using the oclAcceleration option. However, the call to clEnqueueNDRangeKernel in VHACD.cpp returns CL_OUT_OF_RESOURCES.

I googled this problem, and a post suggested reducing the local work size parameter. I reduced it from 4096 to 128, and then the call succeeded (no larger value would work).

Running with this value, however, did not improve performance compared to running without GPU acceleration.
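For anyone hitting the same error: instead of hard-coding the local work size, one option is to clamp it to a limit reported by the OpenCL runtime. The following is a minimal sketch, not code from V-HACD. `ClampLocalWorkSize` and `RoundUpGlobalWorkSize` are hypothetical helper names, and the limit is assumed to come from a query such as `clGetKernelWorkGroupInfo` with `CL_KERNEL_WORK_GROUP_SIZE`, which is not shown here. Whether this resolves the CL_OUT_OF_RESOURCES error is untested.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of V-HACD or the OpenCL API).
// Returns the requested local work size, reduced if necessary so that it
// does not exceed the limit reported for the kernel on the chosen device.
// A limit of 0 is treated as "unknown" and falls back to a size of 1.
std::size_t ClampLocalWorkSize(std::size_t requested, std::size_t kernelLimit)
{
    if (kernelLimit == 0) {
        return 1;
    }
    return std::max<std::size_t>(1, std::min(requested, kernelLimit));
}

// Hypothetical helper. OpenCL 1.x requires the global work size to be a
// multiple of the local work size, so round the item count up to the next
// multiple. The kernel is then assumed to bounds-check its global id.
std::size_t RoundUpGlobalWorkSize(std::size_t workItems, std::size_t localSize)
{
    return ((workItems + localSize - 1) / localSize) * localSize;
}
```

With a requested size of 4096 and a reported limit of 128, `ClampLocalWorkSize` returns 128, and `RoundUpGlobalWorkSize(1000, 128)` returns 1024.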

tobyndax commented 5 years ago

I don't know what type of hardware you are running on. OpenCL could very well be running on the GPU part of your CPU, which doesn't always provide much greater computing performance than the CPU itself, depending on the task. Your running out of CL resources hints at this. Did you try reducing it by less, say to 2048?

smjonson commented 5 years ago

Nothing above 128 seemed to work, and I've tried 3 different machines. Are the platform id and device id parameters significant? Also, I compiled the code with OPENCL_FOUND and CL_VERSION_1_1 as preprocessor defines, but there are others used in the code, e.g. OCL_SOURCE_FROM_FILE, _OPENMP, USE_SSE. Which would you recommend?

tobyndax commented 5 years ago

Hi smjonson, I haven't looked into OpenCL with VHACD myself yet, so I don't have any specific knowledge for you (unless I sit down and investigate more closely). But for completeness (in case someone else has a look at the issue), I think you should post a copy of the output from clinfo.

Windows: Open cmd > clinfo

smjohnson-bi commented 5 years ago

OK, thanks. Here is my OpenCL configuration:

Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 WINDOWS
Platform Name: Intel(R) CPU Runtime for OpenCL(TM) Applications
Platform Vendor: Intel(R) Corporation
Platform Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_dx9_media_sharing cl_intel_dx9_media_sharing cl_khr_d3d11_sharing cl_khr_gl_sharing cl_khr_fp64 cl_khr_image2d_from_buffer cl_intel_vec_len_hint

Platform Name: Intel(R) CPU Runtime for OpenCL(TM) Applications
Number of devices: 1
Device Type: CL_DEVICE_TYPE_CPU
Device ID: 32902
Max compute units: 6
Max work items dimensions: 3
Max work items[0]: 8192
Max work items[1]: 8192
Max work items[2]: 8192
Max work group size: 8192
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Max clock frequency: 3700Mhz
Address bits: 14757395255531667488
Max memory allocation: 536838144
Image support: Yes
Max number of images read arguments: 480
Max number of images write arguments: 480
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 480
Max size of kernel argument: 3840
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
  Denorms: Yes
  Quiet NaNs: Yes
  Round to nearest even: Yes
  Round to zero: No
  Round to +ve and infinity: No
  IEEE754-2008 fused multiply-add: No
Cache type: Read/Write
Cache line size: 64
Cache size: 262144
Global memory size: 536838144
Constant buffer size: 131072
Max number of constant args: 480
Local memory type: Global
Local memory size: 32768
Error correction support: 0
Profiling timer resolution: 100
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
  Execute OpenCL kernels: Yes
  Execute native function: Yes
Queue properties:
  Out-of-Order: Yes
  Profiling : Yes
Platform ID: 00031F0C
Name: Intel(R) Core(TM) i5-9600K CPU @ 3.70GHz
Vendor: Intel(R) Corporation
Driver version: 18.1.0.0920
Profile: FULL_PROFILE
Version: OpenCL 2.1 (Build 0)
Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_dx9_media_sharing cl_intel_dx9_media_sharing cl_khr_d3d11_sharing cl_khr_gl_sharing cl_khr_fp64 cl_khr_image2d_from_buffer cl_intel_vec_len_hint

kmammou / v-hacd

Can't get the openCL acceleration to work #66