OP-DSL / OP2-Common

OP2: open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs
https://op-dsl.github.io

op_cuda_reduction.h::op_reduction() only supports one type per application #155

Closed aowenson closed 5 years ago

aowenson commented 5 years ago

Suppose an application has two OP2-generated CUDA kernels. One performs a reduction of a double, the other performs a reduction of an integer:

    __global__ void op_cuda_kernel1(const double *arg0, double *arg1, int set_size) {
      double arg1_l[1];
      ...
      for (int d = 0; d < 1; d++) {
        op_reduction<OP_INC>(..., arg1_l[d]);
      }
    }

    __global__ void op_cuda_kernel2(const double *arg0, int *arg1, int set_size) {
      int arg1_l[1];
      ...
      for (int d = 0; d < 1; d++) {
        op_reduction<OP_INC>(..., arg1_l[d]);
      }
    }

After compiling one of these, attempting to compile the second kernel fails with a type error:

    .../OP2-Common/op2/c/include/op_cuda_reduction.h(51): error: declaration is incompatible with previous "temp"
    (51): here
              detected during instantiation of "void op_reduction<reduction,T>(volatile T *, T) [with reduction=3, T=int]"

To reproduce for yourself, pull the MG-CFD-app-OP2 repository, uncomment the op_reduction calls in cuda/count_bad_vals_kernel.cu and cuda/calc_rms_kernel_kernel.cu, then compile 'mgcfd_cuda'.
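
The error message points at a declaration named temp inside op_reduction whose type depends on T. If that is a dynamic shared-memory array declared extern __shared__ with element type T (an assumption here, since the header source is not quoted in this issue), then instantiating the template for both double and int in one translation unit yields two incompatible declarations of the same symbol. A common way around this in CUDA is to declare a single untyped shared buffer and cast it to T inside the template. The sketch below shows that pattern with hypothetical names (block_sum, sum_kernel, sum_on_device); it is not OP2's op_reduction and not necessarily the change that was eventually made.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for a templated block reduction (not OP2 code).
// Assumes blockDim.x is a power of two.
template <typename T>
__device__ void block_sum(T *out, T val) {
  // One untyped buffer: the declaration is identical in every instantiation,
  // so block_sum<double> and block_sum<int> can coexist in one file.
  // Writing `extern __shared__ T temp[];` here instead would redeclare
  // the same symbol with a different type for each T.
  extern __shared__ __align__(16) unsigned char smem_raw[];
  T *temp = reinterpret_cast<T *>(smem_raw);

  int tid = threadIdx.x;
  temp[tid] = val;
  __syncthreads();

  // Tree reduction in shared memory.
  for (int s = blockDim.x / 2; s > 0; s >>= 1) {
    if (tid < s) temp[tid] += temp[tid + s];
    __syncthreads();
  }
  if (tid == 0) *out = temp[0];
}

// Hypothetical kernel: each thread accumulates a strided partial sum,
// then the block combines the partials. Launched with a single block.
template <typename T>
__global__ void sum_kernel(const T *in, T *out, int n) {
  T local = 0;
  for (int i = threadIdx.x; i < n; i += blockDim.x) local += in[i];
  block_sum(out, local);
}

// Hypothetical host helper; CUDA error checking is omitted for brevity.
template <typename T>
T sum_on_device(const T *host_in, int n) {
  const int threads = 128;  // power of two, as block_sum assumes
  T *d_in = nullptr;
  T *d_out = nullptr;
  T result = 0;
  cudaMalloc(&d_in, n * sizeof(T));
  cudaMalloc(&d_out, sizeof(T));
  cudaMemcpy(d_in, host_in, n * sizeof(T), cudaMemcpyHostToDevice);
  // Third launch parameter: dynamic shared memory size in bytes.
  sum_kernel<T><<<1, threads, threads * sizeof(T)>>>(d_in, d_out, n);
  cudaMemcpy(&result, d_out, sizeof(T), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return result;
}
```

Calling sum_on_device on a double array and on an int array from the same .cu file instantiates block_sum for both types, which mirrors the two-kernel situation described above.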

aowenson commented 5 years ago

Thanks Istvan!

gihanmudalige commented 5 years ago

Now fixed in #156