This PR updates `ResponseGLFlux` and `ResponseSurfaceVelocityMismatch` so that they can be used in uvm-off GPU builds. This includes porting a number of evaluators for scattering and communicating responses to Kokkos for device-friendly access.
This PR also fixes some existing Kokkos-ported evaluators to avoid unintended host accesses that showed up in a uvm-off build. I've also added evaluator names to as many of the `parallel_for` regions as I could find to aid in performance profiling and debugging.
After this PR is merged, the following tests will pass in a uvm-off build:
The next target for the uvm-free porting process is to port `L2ProjectedBoundaryLaplacian` to Kokkos, which should get a large number of additional tests passing. (I think this will get almost everything remaining in FO_GIS to pass.)
Finally, I've re-enabled all of the CUDA tests for the uvm-free build, and I will instead open an issue to track which tests are expected to pass/fail with uvm disabled.