CUDA cannot be fully dropped

I'm opening this issue because I had been thinking that, since OpenCL is the only option that works on all devices, we should drop CUDA altogether.

However, according to this paper (https://arxiv.org/pdf/1005.2581.pdf), CUDA performs notably better than OpenCL even when running essentially the same code. So I would suggest focusing on OpenCL until everything is working, and then porting the final version to CUDA for comparison.

(Opening this as an issue mainly to have a reference and not forget about it.)
