ysh329 / OpenCL-101

Learn OpenCL step by step.

Performance loss caused by input type of kernel function #5

Closed by ysh329 6 years ago

ysh329 commented 7 years ago

I found that there seems to be a performance loss when a different type is used as the input type of a kernel function. Define two kernel functions that accomplish the same thing:

function1

__kernel void global_bandwidth_vec2(const int heightA, const int widthA, __global const float *a, __global float *b) {
    const int idx = get_global_id(0);
    const int step = idx << 1;

    CL_ELEM_TYPE value = *((__global CL_ELEM_TYPE *)(a + step));
    *((__global CL_ELEM_TYPE *)(b + step)) = value;
}

function2

__kernel void global_bandwidth_vec2(const int heightA, const int widthA, __global const CL_ELEM_TYPE *a, __global CL_ELEM_TYPE *b) {
    const int idx = get_global_id(0);
    const int step = idx << 1;

    CL_ELEM_TYPE value = *((__global CL_ELEM_TYPE *)(a + step));
    *((__global CL_ELEM_TYPE *)(b + step)) = value;
}

These two functions are the same except for the type of the input pointers: function1 uses float, while function2 uses the macro CL_ELEM_TYPE (defined through clBuildProgram) instead.
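One thing that may be worth checking when comparing the two kernels is the address each work-item actually touches, since `a + step` is scaled by the pointee type. The plain C sketch below is only an illustration, not the code from this repository: it assumes CL_ELEM_TYPE is float2 (suggested by the kernel name and by `idx << 1`), and `elem_t`, `byte_offset_float_ptr`, `byte_offset_elem_ptr` and `print_offsets` are hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for OpenCL's float2; assumes CL_ELEM_TYPE == float2. */
typedef struct { float x, y; } elem_t;

/* function1: `a` is float*, so (a + step) advances by `step` floats. */
size_t byte_offset_float_ptr(int idx) {
    int step = idx << 1;
    return (size_t)step * sizeof(float);
}

/* function2: `a` is CL_ELEM_TYPE*, so (a + step) advances by `step` vectors. */
size_t byte_offset_elem_ptr(int idx) {
    int step = idx << 1;
    return (size_t)step * sizeof(elem_t);
}

/* Print the byte offset each variant would use for the first n work-items. */
void print_offsets(int n) {
    for (int idx = 0; idx < n; ++idx) {
        printf("idx=%d  function1: %zu bytes  function2: %zu bytes\n",
               idx, byte_offset_float_ptr(idx), byte_offset_elem_ptr(idx));
    }
}
```

Calling `print_offsets(4)` shows how the two variants diverge. Under the float2 assumption, function2 would stride twice as far per work-item as function1, so the two kernels would not read and write the same locations. That might account for part of the timing difference, though it has not been verified against this repository.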

ysh329 commented 7 years ago

I guess this performance loss is due to differences between the old CL/cl.h header and the current one.

ysh329 commented 6 years ago

Temporarily closed.