Closed by pelesh 7 months ago
@kswirydo, consider implementing the FWHT kernels the same way we implemented VectorKernels (i.e. the vector kernel; we have only one here :)). The downside of this approach is that we can build only one backend per build (i.e. we cannot build HIP and CUDA at the same time).
Alternatively, we could use an approach similar to the one we used for handlers. That would be a little more work, but it would be a more flexible solution.
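For illustration, here is a minimal sketch of what the handler-style alternative could look like. All names (`FwhtKernelImpl`, `FwhtKernelCpu`, `FwhtHandler`, `Backend`) are hypothetical and are not taken from the ReSolve code base; the CPU class is only a stand-in for a device backend, and the sketch assumes that "the approach we used for handlers" means dispatching to a backend-specific implementation object at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical names throughout; this is not ReSolve's actual API.

// Backend-agnostic interface: no CUDA or HIP headers are needed here.
class FwhtKernelImpl
{
public:
  virtual ~FwhtKernelImpl() = default;
  // In-place, unnormalized FWHT of x; x.size() must be a power of two.
  virtual void apply(std::vector<double>& x) const = 0;
};

// CPU reference backend. A CUDA or HIP backend would derive from the same
// interface and live in its own .cu / .hip translation unit.
class FwhtKernelCpu : public FwhtKernelImpl
{
public:
  void apply(std::vector<double>& x) const override
  {
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0); // power-of-two length
    for (std::size_t h = 1; h < n; h *= 2) {
      for (std::size_t i = 0; i < n; i += 2 * h) {
        for (std::size_t j = i; j < i + h; ++j) {
          const double a = x[j];
          const double b = x[j + h];
          x[j]     = a + b; // butterfly: sum
          x[j + h] = a - b; // butterfly: difference
        }
      }
    }
  }
};

enum class Backend { Cpu /*, Cuda, Hip */ };

// The handler picks the implementation at run time, so several backends
// could coexist in a single build.
class FwhtHandler
{
public:
  explicit FwhtHandler(Backend backend)
  {
    if (backend == Backend::Cpu) {
      impl_ = std::make_unique<FwhtKernelCpu>();
    }
  }

  void apply(std::vector<double>& x) const
  {
    assert(impl_ != nullptr); // backend must have been selected
    impl_->apply(x);
  }

private:
  std::unique_ptr<FwhtKernelImpl> impl_;
};
```

Calling `FwhtHandler(Backend::Cpu).apply(x)` on a vector of power-of-two length transforms it in place. The extra work compared to the VectorKernels-style approach is the interface plus one implementation class per backend; in exchange, the backend choice moves from build time to run time.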
Addressed in #147
Prototypes for GPU kernel wrappers should not depend on a vendor-specific compiler. See the comment below:
I don't think we should manage vendor-specific includes at this level. The random sketching function should be hardware agnostic.
_Originally posted by @pelesh in https://github.com/ORNL/ReSolve/pull/83#discussion_r1408081274_
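To make the point concrete, below is a minimal sketch of a wrapper prototype that any C++ compiler can parse. The namespace `sketch`, the function `fwht_apply`, and the file names mentioned in the comments are hypothetical, not names from ReSolve. The definition shown is a CPU stand-in; in a real build it would sit in a `.cu` or `.hip` source file and launch the device kernel there, with the build system choosing which source file to compile.

```cpp
#include <cassert>
#include <cstddef>

// --- Would live in a header, e.g. a hypothetical fwht_kernels.hpp ---------
// Plain C++ prototype: no vendor runtime includes and no __global__
// qualifiers, so hardware-agnostic callers can include it freely.
namespace sketch
{
  // In-place, unnormalized FWHT of x[0..n-1]; n must be a power of two.
  void fwht_apply(double* x, std::size_t n);
}

// --- Would live in a backend source file (.cpp, .cu or .hip) --------------
// CPU stand-in for illustration. A GPU backend would define the same symbol
// and launch its kernel here; exactly one definition is linked per build.
namespace sketch
{
  void fwht_apply(double* x, std::size_t n)
  {
    assert(x != nullptr);
    assert(n != 0 && (n & (n - 1)) == 0); // power-of-two length
    for (std::size_t h = 1; h < n; h *= 2) {
      for (std::size_t i = 0; i < n; i += 2 * h) {
        for (std::size_t j = i; j < i + h; ++j) {
          const double a = x[j];
          const double b = x[j + h];
          x[j]     = a + b;
          x[j + h] = a - b;
        }
      }
    }
  }
}
```

With this layout the random sketching code only ever sees the plain prototype, and all vendor-specific includes stay inside the backend source file.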