STEllAR-GROUP / blaze_cuda

WIP · CUDA compatibility for Blaze · https://bitbucket.org/blaze-lib/blaze

blaze::CUDAReduce - Inaccurate results for large CUDADynamicVector #8

Closed JPenuchot closed 5 years ago

JPenuchot commented 5 years ago

blaze::CUDAReduce returns wrong results for large vectors. I've been unable to find the source of the bug for days now and I'm running out of ideas.

Above a certain size threshold, the CUDA reduce kernel (the __global__ function) starts producing inaccurate values. I've tried to pinpoint the issue and to add synchronization directives, but nothing seems to help.
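One way to narrow down this kind of size-dependent failure is to compare a block-wise tree reduction against a plain sequential sum on integer data, where the result is exact and any mismatch points at indexing or synchronization rather than floating-point rounding. The sketch below is a host-side C++ model of that check, not the blaze_cuda kernel: `block_tree_sum`, `sequential_sum`, and `first_mismatch` are hypothetical names, and the halving-stride loop only mimics the shared-memory pattern commonly used in CUDA reductions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical host-side model of a two-stage block reduction (NOT blaze_cuda code).
// Stage 1: each "block" of block_size elements is reduced with halving strides,
//          mirroring the shared-memory tree loop of a typical CUDA reduce kernel.
// Stage 2: the per-block partial sums are accumulated into one total.
std::uint64_t block_tree_sum(const std::vector<std::uint32_t>& data,
                             std::size_t block_size)
{
    // The halving-stride loop below assumes a power-of-two block size.
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    std::uint64_t total = 0;
    std::vector<std::uint64_t> shared(block_size);

    for (std::size_t base = 0; base < data.size(); base += block_size) {
        // Load one block; pad a partial last block with the identity element 0.
        for (std::size_t t = 0; t < block_size; ++t)
            shared[t] = (base + t < data.size()) ? data[base + t] : 0;

        // Tree reduction: after each pass, shared[0..stride) holds partial sums.
        for (std::size_t stride = block_size / 2; stride > 0; stride /= 2)
            for (std::size_t t = 0; t < stride; ++t)
                shared[t] += shared[t + stride];

        total += shared[0];
    }
    return total;
}

// Reference result: a plain left-to-right sum in 64-bit arithmetic.
std::uint64_t sequential_sum(const std::vector<std::uint32_t>& data)
{
    return std::accumulate(data.begin(), data.end(), std::uint64_t{0});
}

// Returns the first size in `sizes` for which the two sums differ, or 0 if none does.
std::size_t first_mismatch(const std::vector<std::size_t>& sizes,
                           std::size_t block_size)
{
    for (std::size_t n : sizes) {
        std::vector<std::uint32_t> data(n);
        std::iota(data.begin(), data.end(), 1u);  // fill with 1, 2, ..., n
        if (block_tree_sum(data, block_size) != sequential_sum(data))
            return n;
    }
    return 0;
}
```

Swapping `block_tree_sum` for a call into the real device reduction and sweeping sizes around block and grid boundaries (for example 255, 256, 257 for a block size of 256) would show where the results first diverge, assuming the failure is deterministic.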

JPenuchot commented 5 years ago

Examples are on the way; I'll tidy them up into unit tests.

JPenuchot commented 5 years ago

The test is located in blazetest/utiltest/algorithms/cuda_reduce.h

To run it:

cd blazetest
make src/utiltest/algorithms/cuda_reduce.run

CUDATransform seems to be broken too; I'll look into it.

JPenuchot commented 5 years ago

Fixed