Suppose an application has two OP2-generated CUDA kernels. One performs a reduction of a double, the other performs a reduction of an integer:
__global__ void op_cuda_kernel1(const double * arg0, double *arg1, int set_size) { double arg1_l[1]; ... for (int d=0; d<1; d++) { op_reduction<OP_INC>(..., arg1_l[d]); } }
__global__ void op_cuda_kernel2(const double * arg0, int *arg1, int set_size) { int arg1_l[1]; ... for (int d=0; d<1; d++) { op_reduction<OP_INC>(..., arg1_l[d]); } }
After compiling one of these, attempting to compile the second kernel fails with a type error:
.../OP2-Common/op2/c/include/op_cuda_reduction.h(51): error: declaration is incompatible with previous "temp" (51): here detected during instantiation of "void op_reduction<reduction,T>(volatile T *, T) [with reduction=3, T=int]"
To reproduce, clone the MG-CFD-app-OP2 repository, uncomment the op_reduction calls in cuda/count_bad_vals_kernel.cu and cuda/calc_rms_kernel_kernel.cu, then compile 'mgcfd_cuda'.
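One plausible cause, assuming line 51 of op_cuda_reduction.h declares a dynamically sized shared array named temp whose element type depends on T (the header itself is not quoted here, so this is a guess): every instantiation of the template redeclares the same extern __shared__ symbol, and the int and double versions then disagree on its type. A common workaround is to declare the buffer once with a fixed byte type and cast it to T inside the template. The sketch below illustrates that idea only. The names block_sum, sum_kernel, sum_on_gpu and smem_raw are hypothetical and are not part of OP2, and the reduction is a simplified single-block sum rather than the real op_reduction.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for op_reduction (NOT the OP2 implementation).
// The dynamic shared buffer is declared as raw bytes, so its declared type
// is identical in every instantiation; each instantiation views it as T.
template <typename T>
__device__ void block_sum(volatile T *dst, T src) {
  extern __shared__ __align__(16) unsigned char smem_raw[];
  volatile T *temp = reinterpret_cast<volatile T *>(smem_raw);

  int tid = threadIdx.x;
  temp[tid] = src;
  __syncthreads();

  // Tree reduction; assumes blockDim.x is a power of two.
  for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
    if (tid < stride) temp[tid] = temp[tid] + temp[tid + stride];
    __syncthreads();
  }
  if (tid == 0) *dst = *dst + temp[0];
}

// Each thread contributes one input element (or zero past the end).
template <typename T>
__global__ void sum_kernel(const T *in, T *out, int n) {
  T local = (threadIdx.x < n) ? in[threadIdx.x] : T(0);
  block_sum<T>(out, local);
}

// Host helper: sums up to 128 values using a single block.
template <typename T>
T sum_on_gpu(const T *host_in, int n) {
  const int threads = 128;
  T *d_in = nullptr, *d_out = nullptr;
  T result = T(0);
  cudaMalloc(&d_in, n * sizeof(T));
  cudaMalloc(&d_out, sizeof(T));
  cudaMemcpy(d_in, host_in, n * sizeof(T), cudaMemcpyHostToDevice);
  cudaMemcpy(d_out, &result, sizeof(T), cudaMemcpyHostToDevice);
  // Third launch parameter: bytes of dynamic shared memory for this T.
  sum_kernel<T><<<1, threads, threads * sizeof(T)>>>(d_in, d_out, n);
  cudaMemcpy(&result, d_out, sizeof(T), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return result;
}
```

Calling sum_on_gpu<int> and sum_on_gpu<double> from the same translation unit instantiates block_sum for both types. If the buffer were instead declared as extern __shared__ volatile T temp[], that is the situation in which a conflicting-declaration error like the one above would be expected.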