Number of platforms 1
Platform Name ARM Platform
Platform Vendor ARM
Platform Version OpenCL 1.1 v1.r6p0-02rel0.5f4d27da6fc56b891d186d95519b9e53
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint
Platform Extensions function suffix ARM
Platform Name ARM Platform
Number of devices 1
Device Name Mali-T760
Device Vendor ARM
Device Vendor ID 0x7500001
Device Version OpenCL 1.1 v1.r6p0-02rel0.5f4d27da6fc56b891d186d95519b9e53
Driver Version 1.1
Device OpenCL C Version OpenCL C 1.1 v1.r6p0-02rel0.5f4d27da6fc56b891d186d95519b9e53
Device Type GPU
Device Available Yes
Device Profile FULL_PROFILE
Max compute units 4
Max clock frequency 5MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Compiler Available Yes
Preferred work group size multiple 4
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 8 / 8 (cl_khr_fp16)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 2109706240 (1.965GiB)
Error Correction support No
Max memory allocation 527426560 (503MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 262144 (256KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max 2D image size 65536x65536 pixels
Max 3D image size 65536x65536x65536 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Global
Local memory size 32768 (32KiB)
Max constant buffer size 65536 (64KiB)
Max number of constant args 8
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) ARM Platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [ARM]
clCreateContext(NULL, ...) [default] Success [ARM]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name ARM Platform
Device Name Mali-T760
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name ARM Platform
Device Name Mali-T760
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name ARM Platform
Device Name Mali-T760
I am using an RK3288 with `ARM-v7l` and a `Mali-T760 MP4`. It is a 32-bit armhf platform, and the driver supports OpenCL 1.1.

1. The test `matrix_convert-test-opencl` throws an exception whenever `long` is involved. `clEnqueueNDRangeKernel` launches the kernel `assign_cpu` and returns `(0) CL_SUCCESS`. Then `clGetEventInfo` on `CL_EVENT_COMMAND_EXECUTION_STATUS` gives `(-59) CL_INVALID_OPERATION`. The next operation, `clEnqueueWriteBuffer`, returns `(-14) CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST`. Other tests without `long`, such as `convert double to unsigned int`, work fine. `vector_convert-test-opencl`, `matrix_row_int-opencl`, etc. have the same problem.

2. I tried changing the code like this:
template<> struct type_to_string<long> { static std::string apply() { return "int"; } };
Then I got an error from `matrix_convert-test-opencl`, which begins with:

I tested the sizes of the types:
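The measured sizes are not reproduced here. As a rough sketch of what such a check can look like, the C++11 snippet below prints the host size of a few integer types next to the OpenCL C integer type of the same width. OpenCL C fixes `int` at 4 bytes and `long` at 8 bytes on every device, while the host `long` depends on the platform ABI. The helper names `opencl_int_name_for_size`, `same_width_as_opencl_long`, `report`, and `report_integer_sizes` are made up for illustration and are not ViennaCL functions; 8-bit bytes are assumed.

```cpp
#include <cstddef>
#include <iostream>

// Hypothetical helper, not part of ViennaCL: returns the name of the signed
// OpenCL C integer type with the given width in bytes. OpenCL C fixes these
// widths on every device: char = 1, short = 2, int = 4, long = 8.
constexpr const char * opencl_int_name_for_size(std::size_t bytes)
{
  return bytes == 1 ? "char"
       : bytes == 2 ? "short"
       : bytes == 4 ? "int"
       : bytes == 8 ? "long"
       : "unknown";
}

// True if host type T has the same width as OpenCL C 'long' (8 bytes).
template<typename T>
constexpr bool same_width_as_opencl_long()
{
  return sizeof(T) == 8;
}

// Prints the host size of T next to the OpenCL C type of equal width.
template<typename T>
void report(const char * host_name)
{
  std::cout << host_name << ": " << sizeof(T) << " bytes on the host, "
            << "same width as OpenCL C '" << opencl_int_name_for_size(sizeof(T)) << "'\n";
}

// Call from main() on the target machine to list the sizes in question.
inline void report_integer_sizes()
{
  report<int>("int");
  report<long>("long");
  report<long long>("long long");
}
```

On any host where `long` is 4 bytes, `report<long>("long")` names `int` as the OpenCL C type of equal width, and `same_width_as_opencl_long<long>()` is false. That would be consistent with the suspected mismatch, but it does not by itself show where ViennaCL goes wrong.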
I think this is a 32-bit/64-bit mismatch problem. If so, how could I modify ViennaCL to work on a 32-bit machine? I cannot be sure, though, because `long` is also 32-bit in 64-bit MSVC, and I don't know whether these tests work on Windows.

Platform information:

Platform information from clinfo: