Support halfN precision for GPU and CPU

ysh329 / OpenCL-101

Learn OpenCL step by step.


Support halfN precision for GPU and CPU #2

Closed ysh329 closed 7 years ago

ysh329 commented 7 years ago

GPU (cl_khr_fp16): the correct rate is wrong for halfN when N is greater than 1;
CPU (fp16): segmentation fault for sizes larger than 128*128, such as 256*256.

Besides, about the data_size variable: should I define separate data_size variables for the CPU and the GPU? If the same data_size variable is used with different CPU and GPU element types (such as float on the CPU and half on the GPU), does that cause an error?
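To make the data_size question concrete, here is a minimal sketch. It assumes data_size is meant as a byte count, which the issue does not state. The helper name `buffer_bytes` is hypothetical, and `uint16_t` stands in for the host-side `cl_half` so the snippet compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for `count` elements of `elem_size` bytes each. */
static size_t buffer_bytes(size_t count, size_t elem_size) {
    /* Guard against overflow in the multiplication. */
    assert(elem_size != 0 && count <= SIZE_MAX / elem_size);
    return count * elem_size;
}

/* Byte size of a 256*256 buffer of float elements (CPU side in the example above). */
static size_t float_buffer_bytes(void) {
    return buffer_bytes((size_t)256 * 256, sizeof(float));
}

/* Byte size of the same buffer with 16-bit half elements (uint16_t stands in for cl_half). */
static size_t half_buffer_bytes(void) {
    return buffer_bytes((size_t)256 * 256, sizeof(uint16_t));
}
```

With the same element count, the two byte sizes differ by a factor of two on platforms where float is 4 bytes, so a single byte-count variable cannot describe both a float buffer and a half buffer. Sharing an element count is fine; sharing a byte size is not.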

ysh329 commented 7 years ago

Note:

Half type in OpenCL on the host: cl_half; cl_halfN is not supported.
Half type in OpenCL on the device: half or halfN; cl_half and cl_halfN are not supported.

Besides, when using the half type on the device, make sure the host also uses a half type (such as __fp16), so that the data_size of the result variables from the CPU and the GPU match!
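On hosts where `__fp16` is not available, one possible workaround is to keep the results as raw 16-bit values and decode them to float only when checking against the CPU result. The sketch below is not from this repo: the function name `half_bits_to_float` is hypothetical, `uint16_t` stands in for `cl_half`, and it assumes the device writes IEEE 754 binary16 bit patterns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode an IEEE 754 binary16 bit pattern into a float. */
static float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;  /* move sign to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;              /* 5-bit exponent */
    uint32_t man  = h & 0x3FFu;                     /* 10-bit mantissa */
    uint32_t bits;
    float f;

    /* The bit copy below assumes a 32-bit float. */
    assert(sizeof(float) == sizeof(uint32_t));

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                            /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit leading bit appears. */
            int e = -1;
            do { man <<= 1; e++; } while ((man & 0x400u) == 0);
            man &= 0x3FFu;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);    /* infinity or NaN */
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 127 - 15) << 23) | (man << 13);
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Decoding this way keeps the host buffer at 2 bytes per element, so its byte size matches the device buffer, while the comparison against the CPU reference can still be done in float.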