Open wqinc opened 6 years ago
Matrix multiplications should work well in Halide. Matrix inversions might be trickier, depending on what you're doing. For example, you can write a good panel update routine for the inner loop of Cholesky, but it's much harder to write an entire Cholesky decomposition in Halide.
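To make the Cholesky point a bit more concrete, here is a rough plain-Python sketch (not Halide code) of an unblocked right-looking Cholesky. The function name `cholesky_right_looking` is made up for illustration, and treating the trailing update as the "inner loop" update mentioned above is my assumption. The trailing update inside each step is a dense loop nest whose iterations are independent, which is the kind of thing that is easy to schedule. The outer loop over `k` is sequential, because each step reads what the previous step wrote.

```python
import math

def cholesky_right_looking(A):
    """Return lower-triangular L with L * L^T == A (A symmetric positive definite)."""
    n = len(A)
    L = [row[:] for row in A]  # work on a copy
    for k in range(n):  # sequential: step k depends on the result of step k-1
        L[k][k] = math.sqrt(L[k][k])
        for i in range(k + 1, n):
            L[i][k] /= L[k][k]
        # Trailing update: every (i, j) here is independent of the others,
        # so this nest could be tiled, vectorized, or parallelized.
        for j in range(k + 1, n):
            for i in range(j, n):
                L[i][j] -= L[i][k] * L[j][k]
    # Clear the strictly upper triangle, which was never updated.
    for i in range(n):
        for j in range(i + 1, n):
            L[i][j] = 0.0
    return L
```

For example, `cholesky_right_looking([[4.0, 2.0], [2.0, 5.0]])` gives the factor with rows `[2.0, 0.0]` and `[1.0, 2.0]`.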
There's a GPU real-valued matrix multiply with a good schedule here: https://github.com/halide/Halide/tree/master/apps/cuda_mat_mul
and here's an example of some hand-rolled syntactic sugar for dealing with complex values (it wraps Halide's Tuple support): https://github.com/halide/Halide/blob/master/apps/fft/complex.h
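As a rough illustration of the idea only (this is not the API of that header), a complex value can be carried as a pair of reals, and a complex matrix multiply then reduces to real multiplies and adds. The sketch below is plain Python rather than Halide, and the names `cmul`, `cadd` and `cmatmul` are hypothetical.

```python
def cmul(a, b):
    """Multiply two complex values stored as (real, imag) pairs."""
    ar, ai = a
    br, bi = b
    return (ar * br - ai * bi, ar * bi + ai * br)

def cadd(a, b):
    """Add two complex values stored as (real, imag) pairs."""
    return (a[0] + b[0], a[1] + b[1])

def cmatmul(A, B):
    """Multiply matrices whose entries are (real, imag) pairs."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[(0.0, 0.0) for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = (0.0, 0.0)
            for p in range(k):  # reduction over the shared dimension
                acc = cadd(acc, cmul(A[i][p], B[p][j]))
            C[i][j] = acc
    return C
```

For instance, `cmul((1, 2), (3, 4))` is `(-5, 10)`, matching (1 + 2i)(3 + 4i) = -5 + 10i.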
@wqinc What about using QR decomposition with Givens rotations for matrix "inversion"? Rotations that act on disjoint pairs of rows are independent of each other, so they can be applied in parallel.
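A minimal sketch of what I mean, in plain Python rather than Halide, and for real-valued square systems only. The function name `givens_qr_solve` is made up for illustration. It reduces A to upper-triangular form with Givens rotations, applies the same rotations to b, and back-substitutes, so no explicit inverse is formed. The loop here is sequential; a parallel version would have to group rotations on disjoint row pairs into stages, which this sketch does not attempt.

```python
import math

def givens_qr_solve(A, b):
    """Solve A x = b for a square nonsingular real A using Givens rotations."""
    n = len(A)
    R = [row[:] for row in A]  # becomes upper triangular
    y = b[:]                   # becomes Q^T b
    for j in range(n):
        # Zero column j below the diagonal, working from the bottom up.
        for i in range(n - 1, j, -1):
            p, q = R[i - 1][j], R[i][j]
            r = math.hypot(p, q)
            if r == 0.0:
                continue  # both entries already zero, nothing to rotate
            c, s = p / r, q / r
            # Rotate rows i-1 and i; columns left of j are already zero.
            for k in range(j, n):
                top, bot = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
            y[i - 1], y[i] = c * y[i - 1] + s * y[i], -s * y[i - 1] + c * y[i]
    # Back substitution on R x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = acc / R[i][i]
    return x
```

Solving against each column of the identity would give the columns of the inverse, if the explicit inverse is really needed.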
We are new to this framework and are investigating whether Halide could be extended to solve matrix-intensive computational physics problems. Thank you for this great piece of work.
A silly question, if you don't mind: could we use Halide to experiment with scheduling complex matrix inversions and multiplications (in a heterogeneous CPU-GPU setting)? I would really appreciate it if anyone could point me to some references, as I can't seem to find any in the code or tutorials.