Open brian-oneill opened 1 week ago
Curious why you test Perlmutter GPU with nvidia instead of gnu (which is what E3SM and SCREAM test with).
@rljacob, I think we want Omega to support both the NVIDIA and GNU compilers on Perlmutter GPU nodes. Since Perlmutter uses NVIDIA GPUs, we typically test Omega with the NVIDIA compiler first and the GNU compiler next. However, if E3SM and SCREAM treat GNU as the primary compiler on Perlmutter, we can adopt the same compiler preference.
We have not actually done a performance comparison between nvidia, gnu and intel on perlmutter gpus. But we have seen nvidia have trouble with some of the Fortran code. gnu is preferred unless there is evidence another one is better.
Confirmed that this PR passes unit tests on Frontier with both CPU and GPU. The unit tests show that the exchange is working correctly: they create arrays with unique values per cell, do a halo exchange, and then compute the error.
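For readers unfamiliar with this style of test, the idea can be sketched in plain C++ without MPI or Kokkos. This is only an illustration, not the actual Omega unit test: `LocalField`, `uniqueValue`, `makeFields`, `exchangeHalo`, and `haloError` are hypothetical names, the mesh is a periodic 1D strip, and the "tasks" are just entries in a vector within one process.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for one task's piece of a periodic 1D mesh.
// Local layout: [left halo | owned cells | right halo].
struct LocalField {
    int firstOwned;           // global index of the first owned cell
    int numOwned;             // number of owned cells
    int haloWidth;            // number of halo cells on each side
    std::vector<double> data; // size numOwned + 2 * haloWidth
};

// A value that is unique for every global cell.
double uniqueValue(int globalCell) { return 1000.0 + globalCell; }

// Fill owned cells with their unique values; halo cells start at -1.
std::vector<LocalField> makeFields(int numTasks, int cellsPerTask,
                                   int haloWidth) {
    assert(numTasks > 0 && haloWidth > 0 && haloWidth <= cellsPerTask);
    std::vector<LocalField> fields;
    for (int t = 0; t < numTasks; ++t) {
        LocalField f{t * cellsPerTask, cellsPerTask, haloWidth,
                     std::vector<double>(cellsPerTask + 2 * haloWidth, -1.0)};
        for (int i = 0; i < cellsPerTask; ++i)
            f.data[haloWidth + i] = uniqueValue(f.firstOwned + i);
        fields.push_back(f);
    }
    return fields;
}

// Simulated exchange: each halo cell is filled from the neighbor that owns it.
// Only owned cells are read and only halo cells are written.
void exchangeHalo(std::vector<LocalField> &fields) {
    const int n = static_cast<int>(fields.size());
    for (int t = 0; t < n; ++t) {
        LocalField &me = fields[t];
        const LocalField &left = fields[(t - 1 + n) % n];
        const LocalField &right = fields[(t + 1) % n];
        const int h = me.haloWidth;
        for (int k = 0; k < h; ++k) {
            me.data[k] = left.data[left.numOwned + k];        // left halo
            me.data[h + me.numOwned + k] = right.data[h + k]; // right halo
        }
    }
}

// Sum of absolute differences between every local cell (owned and halo)
// and the unique value expected for its global index.
double haloError(const std::vector<LocalField> &fields) {
    double error = 0.0;
    for (const LocalField &f : fields) {
        const int total = static_cast<int>(fields.size()) * f.numOwned;
        const int localSize = f.numOwned + 2 * f.haloWidth;
        for (int i = 0; i < localSize; ++i) {
            const int global = (f.firstOwned + i - f.haloWidth + total) % total;
            error += std::fabs(f.data[i] - uniqueValue(global));
        }
    }
    return error;
}
```

Because every cell carries a distinct value, a halo cell that was skipped or filled from the wrong neighbor shows up as a nonzero error. In this sketch the error is positive before `exchangeHalo` is called and exactly zero afterwards.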
Awaiting timing tests from Kieran Ringel for performance comparison between this new halo exchange on device and the previous halo exchange on host (gpu versus cpu).
Timing results for this PR with OMEGA_MPI_ON_DEVICE turned on and off (indicated in the second half of each name in the legend)
@kieran-ringel Note that depending on what exactly you measured this bug https://github.com/E3SM-Project/Omega/pull/163#discussion_r1855850029 might have affected these results, since it causes state and tracer halo exchanges to exchange host arrays only.
Updated timing results, now that the State and Tracer exchangeHalo functions exchange device arrays
This PR updates the Halo to allow for halo exchanges of arrays allocated in device memory space as well as host memory space. With this update, Omega can take advantage of GPU aware MPI implementations.
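The difference between the two paths can be sketched roughly as follows. This is a toy model under stated assumptions, not the PR's implementation: `MemSpace`, `Buffer`, `ExchangeStats`, `copyTo`, `fakeSendRecv`, and `exchange` are all hypothetical names, the memory spaces are simulated with a tag, and a runtime `bool` stands in for the OMEGA_MPI_ON_DEVICE build option.

```cpp
#include <vector>

// Hypothetical tag for where a buffer lives; real code would use
// device and host allocations rather than a label.
enum class MemSpace { Host, Device };

struct Buffer {
    MemSpace space;
    std::vector<double> data;
};

// Counts staging copies so the two paths can be compared.
struct ExchangeStats {
    int stagingCopies = 0;
};

// Stand-in for a deep copy between host and device memory.
Buffer copyTo(MemSpace target, const Buffer &src, ExchangeStats &stats) {
    ++stats.stagingCopies;
    return Buffer{target, src.data};
}

// Stand-in for an MPI send/receive pair: the received buffer lives in
// the same memory space as the buffer that was handed to MPI.
Buffer fakeSendRecv(const Buffer &send) {
    return Buffer{send.space, send.data};
}

// Exchange a buffer that starts and must end in device memory.
Buffer exchange(const Buffer &deviceSend, bool mpiOnDevice,
                ExchangeStats &stats) {
    if (mpiOnDevice) {
        // GPU-aware MPI: hand the device buffer to MPI directly.
        return fakeSendRecv(deviceSend);
    }
    // Otherwise stage through host memory on both sides of the exchange.
    Buffer hostSend = copyTo(MemSpace::Host, deviceSend, stats);
    Buffer hostRecv = fakeSendRecv(hostSend);
    return copyTo(MemSpace::Device, hostRecv, stats);
}
```

In this model the host path costs two extra copies per exchange, which is the overhead a GPU-aware MPI implementation lets the code avoid. Whether that translates into a speedup on a given machine is what the timing comparison in this thread is meant to show.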
Changes include:
Successfully built and passed unit tests with OMEGA_MPI_ON_DEVICE both on and off on Chrysalis (intel), Perlmutter CPU (intel) & GPU (nvidiagpu), and Frontier CPU (crayclang) & GPU (crayclanggpu)
Checklist