Open gdicker1 opened 1 month ago
NOTE: the changes in this PR slightly change answers. The changes seem to come from a couple of loops. In my regional test case, if the calculations of rho_p
, rho_zz
, rtheta_p
, exner
, and pressure_p
are done on the CPU instead, no differences occur.
First this loop
do k = 1, nVertLevels
rho_p(k,iCell) = rho_p_save(k,iCell) + rho_pp(k,iCell)
rho_zz(k,iCell) = rho_p(k,iCell) + rho_base(k,iCell)
end do
And in this loop
do k = 1, nVertLevels
rtheta_p(k,iCell) = rtheta_p_save(k,iCell) + rtheta_pp(k,iCell) &
- dt * rho_zz(k,iCell) * rt_diabatic_tend(k,iCell)
theta_m(k,iCell) = (rtheta_p(k,iCell) + rtheta_base(k,iCell))/rho_zz(k,iCell)
exner(k,iCell) = (zz(k,iCell)*(rgas/p0)*(rtheta_p(k,iCell)+rtheta_base(k,iCell)))**rcv
! pressure_p is perturbation pressure
pressure_p(k,iCell) = zz(k,iCell) * rgas * (exner(k,iCell)*rtheta_p(k,iCell)+rtheta_base(k,iCell) &
* (exner(k,iCell)-exner_base(k,iCell)))
end do
Although there isn't anything in particular that looks suspect to me, it does seem like the differences in results after the first timestep of a 12-km regional simulation comparing with the current develop
branch are a bit larger than I would have expected (based on the differences we see before and after the merge of #1216).
I also get a core dump at the end of the simulation on Derecho when using the following modules:
module --force purge
ml ncarenv/23.09
ml craype/2.7.23
ml nvhpc/24.3
ml ncarcompilers/1.0.0
ml cray-mpich/8.1.27
ml parallel-netcdf/1.12.3
ml cuda/12.2.1
It may be worth taking a second look at the changes in this PR to see if we can find anything that's subtly off.
@mgduda, this is ready for another round of review!
There are slight answer differences, but isolated just to exner
and pressure_p
from what I can tell. No answer differences occur if the code below is added after the loop to calculate rtheta_p
, theta_m
, exner
, and pressure_p
(near line 2892 in this PR).
! Calculate exner and pressure_p on CPU
!$acc update host(rtheta_p)
do iCell=cellStart,cellEnd
do k = 1, nVertLevels
exner(k,iCell) = (zz(k,iCell)*(rgas/p0)*(rtheta_p(k,iCell)+rtheta_base(k,iCell)))**rcv
! pressure_p is perturbation pressure
pressure_p(k,iCell) = zz(k,iCell) * rgas * (exner(k,iCell)*rtheta_p(k,iCell)+rtheta_base(k,iCell) &
* (exner(k,iCell)-exner_base(k,iCell)))
end do
end do
!$acc update device(exner,pressure_p)
This PR enables GPU calculation of post-acoustic step variable reconstitution by adding OpenACC directives to the
atm_recover_large_step_variables_work
routine.Timing information for the OpenACC data transfers in this routine is captured in the log file by a new timer:
atm_recover_large_step_variables [ACC_data_xfer]
Invariant fields used in the work routine are copied in during
mpas_atm_dynamics_init
and deleted inmpas_atm_dynamics_finalize
by building off the fields already handled in previous PRs.