Exercises/kokkoskernels/CGSolve: Support half_t

e10harvey commented 3 years ago

Add support for cgsolve using half precision
Clean up fences and code from original cgsolve

Depends on https://github.com/kokkos/kokkos/pull/3746. Related to https://github.com/kokkos/kokkos-kernels/issues/852.

lucbv commented 3 years ago

@e10harvey A couple of questions. First, it seems you only want this to run on Cuda platforms; is this a fair assumption, or could users try this with OpenMP? In any case, I think we should at least guard the code with #if defined(KOKKOS_ENABLE_CUDA).

Also, I am curious why we go from double to half rather than something like double to float and then float to half.

e10harvey commented 3 years ago

@lucbv

> A couple of questions. First, it seems you only want this to run on Cuda platforms; is this a fair assumption, or could users try this with OpenMP?

Users can try with OpenMP, and half_t will fall back to float in that case. We run into compiler errors if both Cuda and OpenMP are enabled; I need to fix this use case in the KokkosCore half precision support.
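For readers following along, the fallback behavior described above can be sketched roughly as follows. This is only an illustration, not the KokkosCore implementation: `sketch_half`, `sketch_half_t`, and the macro `SKETCH_HAVE_DEVICE_HALF` are hypothetical names made up for this example, and no Kokkos header is used.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical 16-bit storage type standing in for a real device half type.
struct sketch_half {
  unsigned short bits;
};

// SKETCH_HAVE_DEVICE_HALF is a made-up macro for this sketch; it is NOT a
// Kokkos macro. Define it at compile time to select the 16-bit type.
#if defined(SKETCH_HAVE_DEVICE_HALF)
using sketch_half_t = sketch_half;
#else
using sketch_half_t = float;  // no native half available: fall back to float
#endif

// True when the alias resolved to the float fallback.
constexpr bool half_falls_back_to_float =
    std::is_same<sketch_half_t, float>::value;

// Number of bytes one scalar of the selected type occupies.
constexpr std::size_t scalar_bytes() { return sizeof(sketch_half_t); }
```

Compiled without the hypothetical macro, `sketch_half_t` is plain `float`, so code written against the alias still builds on a host-only backend; it simply does not gain the reduced storage size.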

> Also, I am curious why we go from double to half rather than something like double to float and then float to half.

Are you referring to this code: https://github.com/kokkos/kokkos-tutorials/blob/d7fda3361b34f697191e4a57f39e86dc259418e0/Exercises/kokkoskernels/CGSolve/Begin/cgsolve.cpp#L197-L203

Do you want to default to float instead of double as the ScalarType when half precision is not available?

e10harvey commented 3 years ago

@lucbv: Do you have any more feedback on this one?

e10harvey commented 3 years ago

Thanks, @mhoemmen and @lucbv. I added constexpr and an #ifdef KOKKOS_ENABLE_CUDA for setting DeviceType.
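A minimal sketch of what such a guard might look like, assuming the intent is to pick DeviceType at compile time. This is not the code from the PR: `SketchCudaDevice`, `SketchHostDevice`, and `use_cuda_device` are hypothetical stand-ins for the real Kokkos device types. KOKKOS_ENABLE_CUDA is normally defined by the Kokkos configuration headers; since this sketch includes none, it is only defined here if passed on the compiler command line.

```cpp
#include <type_traits>

// Hypothetical tag types standing in for real Kokkos device types.
struct SketchCudaDevice {};
struct SketchHostDevice {};

// Select the device type at compile time. In a real build the macro would
// come from the Kokkos configuration; here it must be supplied via -D.
#if defined(KOKKOS_ENABLE_CUDA)
using DeviceType = SketchCudaDevice;
#else
using DeviceType = SketchHostDevice;
#endif

// constexpr flag so later code can branch on the choice at compile time.
constexpr bool use_cuda_device =
    std::is_same<DeviceType, SketchCudaDevice>::value;
```

With the guard in place, a build without Cuda still compiles and selects the host type instead of failing on a missing device type.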

kokkos / kokkos-tutorials

Exercises/kokkoskernels/CGSolve: Support half_t #51