But I failed...
First, I rewrote the code with CuPy instead of NumPy. However, some functions used in RAISR, such as gradient, are not implemented in CuPy (maybe we can replace them with other APIs that CuPy does provide?). I also tried mixing NumPy and CuPy. It works, but it performs poorly, probably because transferring data between the GPU and the CPU costs too much time.
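One possible replacement for the missing gradient call is to build it from slicing and subtraction only, since those are operations CuPy shares with NumPy. Below is a minimal sketch, assuming a 2-D image and unit pixel spacing. The name `gradient_2d` and the `xp` parameter are made up for this example. Passing `cupy` as `xp` is an assumption I have not verified on a GPU; with the default it runs on plain NumPy.

```python
import numpy as np


def gradient_2d(img, xp=np):
    """Central-difference gradient of a 2-D array with unit spacing.

    Mirrors the default behaviour of np.gradient: central differences
    in the interior, one-sided differences on the borders.
    `xp` is the array module (numpy by default; cupy is an untested assumption).
    """
    img = xp.asarray(img, dtype=xp.float64)
    gy = xp.empty_like(img)  # derivative along rows (axis 0)
    gx = xp.empty_like(img)  # derivative along columns (axis 1)

    # Axis 0: interior points, then the two border rows.
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0
    gy[0, :] = img[1, :] - img[0, :]
    gy[-1, :] = img[-1, :] - img[-2, :]

    # Axis 1: interior points, then the two border columns.
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    gx[:, 0] = img[:, 1] - img[:, 0]
    gx[:, -1] = img[:, -1] - img[:, -2]

    return gy, gx
```

Keeping every step on the same array module means the patch never has to be copied back to the host just to compute the gradient, which may be where the mixed NumPy/CuPy version loses its time.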
Then I tried Numba. Sadly, its CUDA kernels do not support NumPy array methods. Besides, @jit does produce efficient machine code, but the speedup is limited.
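Since array methods are off the table inside a kernel, the inner work has to be written as scalar loops. Here is a rough sketch of what that looks like for the Q and V accumulation, assuming Q collects A^T A and V collects A^T b, with each row of `patches` being one flattened patch and `targets` holding the matching high-resolution pixel. The function name, the argument layout and the fallback decorator are all made up for this example, not taken from the repository. It uses `@njit` on the CPU; a real CUDA kernel would need the same loop style plus thread indexing, which is not shown here.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without Numba installed.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def accumulate(Q, V, patches, targets):
    """Add A^T A to Q and A^T b to V in place, using scalar loops only."""
    n, d = patches.shape
    for k in range(n):            # one patch per iteration
        for i in range(d):
            V[i] += patches[k, i] * targets[k]
            for j in range(d):
                Q[i, j] += patches[k, i] * patches[k, j]
```

Written this way there are no calls like `dot` or `transpose` left in the hot path, so the body is at least in a form that could be ported to a kernel later.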
To parallelize those loops, we would probably have to rewrite the code. If anyone wants to make RAISR GPU-optimized in Python, contact me and we can work together.
In addition, I suggest that the author add a retraining feature by saving Q, V, and other parameters. It would be useful and efficient, and easier to implement than GPU optimization.
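The retraining idea could be as small as writing the accumulators to disk and loading them back before the next run. A minimal sketch, assuming Q and V are NumPy arrays; the function names and the file name are made up for this example, and any other parameters would need to be added as extra keys.

```python
import numpy as np


def save_state(path, Q, V):
    """Store the accumulators in a single .npz file."""
    np.savez(path, Q=Q, V=V)


def load_state(path):
    """Load the accumulators saved by save_state."""
    with np.load(path) as data:
        return data["Q"], data["V"]
```

Usage would be something like `Q, V = load_state("state.npz")` at the start of training, then continuing to add to Q and V with the new images and calling `save_state("state.npz", Q, V)` at the end.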