rheacangeo opened this issue 4 years ago
We discussed this in a team meeting and decided that while this would be quite useful for certain vertical solve patterns, there are other open issues and feature requests that will have much higher impact for the work required (for example, directional offsets).
So, I suggest we do not include this in the effort for the v1.0 release, but also not close the issue.
Something along these lines would be extremely helpful for the microphysics, which uses a double-k-loop to handle sedimentation melting, removing ice/snow/graupel from one k-level and adding it as rain to the appropriate lower levels while updating the heat capacities and temperatures along the way:
```python
for k in range(ke, ks - 1, -1):
    if v_terminal[k] < 1.0e-10:  # skip layers with negligible fall speed
        continue
    if qice[k] > constants.QCMIN:
        for m in range(k + 1, ke + 1):
            if zt[k + 1] >= ze[m]:
                break
            if (zt[k] < ze[m + 1]) and (temperature[m] > constants.TICE):
                cvm[k] = moist_heat_capacity(
                    qvapor[k], qliquid[k], qrain[k], qice[k], qsnow[k], qgraupel[k]
                )
                cvm[m] = moist_heat_capacity(
                    qvapor[m], qliquid[m], qrain[m], qice[m], qsnow[m], qgraupel[m]
                )
                dtime = min(timestep, (ze[m] - ze[m + 1]) / v_terminal[k])
                dtime = min(1.0, dtime / tau_mlt)
                sink = min(
                    qice[k] * delp[k] / delp[m],
                    dtime * (temperature[m] - constants.TICE) / icpk[m],
                )
                qice[k] -= sink * delp[m] / delp[k]
                if zt[k] < zs:
                    r1 += sink * delp[m]
                else:
                    qrain[m] += sink
                temperature[k] = (
                    temperature[k] * cvm[k] - li00 * sink * delp[m] / delp[k]
                ) / moist_heat_capacity(
                    qvapor[k], qliquid[k], qrain[k], qice[k], qsnow[k], qgraupel[k]
                )
                temperature[m] = (temperature[m] * cvm[m]) / moist_heat_capacity(
                    qvapor[m], qliquid[m], qrain[m], qice[m], qsnow[m], qgraupel[m]
                )
            if qice[k] < constants.QCMIN:
                break
```
Should the last update be to `temperature[m]`? Like this it looks like the second-to-last update will be overwritten. Or is another index wrong?
No, that's correct, it should be `temperature[m]`. Fixed now.
I haven't found a more proper way to do this, but this workaround from the darkest corners of my brain might just work, if this is an isolated instance that needs to be in GT4Py at all costs:
```python
import numpy as np
import gt4py
from gt4py.cartesian.gtscript import IJ, PARALLEL, Field, computation, interval

Nk = 80
fake_2d_field = Field[IJ, (np.float64, (Nk,))]
actual_2d_int_field = Field[IJ, np.int32]

k_index_field = gt4py.storage.full(
    shape=(Ni, Nj), fill_value=Nk, dtype=np.int32, dimensions=["I", "J"], backend=...
)
m_index_field = gt4py.storage.zeros(
    shape=(Ni, Nj), dtype=np.int32, dimensions=["I", "J"], backend=...
)
qice = gt4py.storage.from_array(
    data, shape=(Ni, Nj, Nk), dtype=np.float64, dimensions=["I", "J", "0"], backend=...
)

# stencil definition (compiled via gtscript.stencil)
def nested_loop_computation(
    v_terminal: fake_2d_field,
    qice: fake_2d_field,
    zt: fake_2d_field,
    ze: fake_2d_field,
    k_index_field: actual_2d_int_field,
    m_index_field: actual_2d_int_field,
    # etc.
):
    with computation(PARALLEL), interval(0, 1):  # order doesn't matter since it's only one layer
        while k_index_field > 0:
            m_index_field = k_index_field
            while m_index_field < Nk:
                # do the computation, where field accesses look like this:
                # qice[0, 0][k_index_field] -= sink * delp[0, 0][m_index_field] / delp[0, 0][k_index_field]
                m_index_field += 1
            k_index_field -= 1
```
So the idea is to not use the vertical capability of stencils at all, but rather to emulate the loops with `while`s over data dimensions.
Could something like this work?
If, in the end, DaCe orchestration is the goal, another solution would be to write NumPy code and use `@dace.program` directly, which will integrate nicely into the DaCe-orchestrated context.
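As a sketch of that route, a much-simplified stand-in for the sedimentation-melt loop (hypothetical names and a made-up single melting rule, not the real microphysics) can be written as plain NumPy; with DaCe installed, the same function could be decorated with `@dace.program` and called from the orchestrated context:

```python
import numpy as np

# Hypothetical, much-simplified stand-in for the sedimentation-melt loop:
# move a fraction of qice from level k down to the first warm level below.
# With DaCe installed this function could be decorated with @dace.program.
def sedi_melt_column(qice, temperature, delp, tice=273.16, frac=0.5):
    nk = qice.shape[0]
    qice = qice.copy()
    qrain = np.zeros(nk)
    for k in range(nk - 1):  # the bottom level has nowhere to send mass
        if qice[k] <= 0.0:
            continue
        for m in range(k + 1, nk):
            if temperature[m] > tice:  # first warm layer below melts the ice
                sink = frac * qice[k]
                qice[k] -= sink
                # convert to rain at level m, mass-weighted by layer thickness
                qrain[m] += sink * delp[k] / delp[m]
                break
    return qice, qrain
```

The double-k-loop structure (and the early `break`s) carries over unchanged, which is exactly what makes this shape awkward for stencils but easy for `@dace.program`.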
@oelbert how is this implemented currently, and what is `r1`?

Currently this is just a Python function to be dace-optimized, though I do have a standalone test for it. I've been serializing data from Fortran runs and checking the coverage in Python. It works, but it does feel awkward for the model to be in GT4Py except for this one piece of code that isn't. I think Linus' workaround could solve putting this into a stencil, though.
As for splitting the heat capacity and temperature updates out of the sedimentation melting, that would be ideal except for the line `sink = min(qice[k] * delp[k] / delp[m], dtime * (temperature[m] - constants.TICE) / icpk[m])`, where the amount of ice taken from level k depends in part on the temperature at level m, which is updated through the loop.
`r1` is a 2D field that tracks the rain that has precipitated to the ground through the sedimentation scheme.
Some column-based atmospheric science algorithms are intuitive for scientists to write in a way that modifies adjacent vertical layers in conjunction. For example, if you take mass from the layer you are at and add it to the one below, it is intuitive to write:
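For instance, a minimal NumPy sketch of that intuitive form (hypothetical names: `f` is the transported quantity, `c[k]` the fraction handed to the layer below):

```python
import numpy as np

def downward_transfer_offcenter(f, c):
    """Intuitive form: each level writes directly into the level below it."""
    f = f.copy()
    for k in range(f.shape[0] - 1):  # the bottom level has nowhere to send mass
        flux = f[k] * c[k]
        f[k] -= flux
        f[k + 1] += flux  # off-center write: level k modifies level k + 1
    return f
```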
Without off-center writes, we need to write it something like this:
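For example (again a hypothetical NumPy sketch with `f` the transported quantity and `c[k]` the fraction handed downward): a forward sweep first accumulates the flux leaving each level, and each level then updates only itself, with explicit special cases at the top and bottom:

```python
import numpy as np

def downward_transfer_centered(f, c):
    """Same downward transfer, but every level writes only to itself."""
    nk = f.shape[0]
    flux = np.zeros(nk)  # mass leaving level k through its bottom face
    # Forward sweep: the flux out of level k includes mass just received
    # from level k - 1, matching the sequential loop semantics.
    for k in range(nk - 1):
        incoming = flux[k - 1] if k > 0 else 0.0
        flux[k] = (f[k] + incoming) * c[k]
    out = np.empty(nk)
    out[0] = f[0] - flux[0]                 # top: outgoing flux only
    for k in range(1, nk - 1):
        out[k] = f[k] + flux[k - 1] - flux[k]
    out[-1] = f[-1] + flux[-2]              # bottom: incoming flux only
    return out
```

The boundary cases at the top and bottom have to be spelled out by hand, which is exactly the special handling that is easy to forget.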
This is not terrible, and makes sense in this small example, but it can quickly get confusing with more complex patterns, and the special handling at the top and bottom is easy to forget. Here is a sample we ported:
Python loop version
Stencil version avoiding off-center writes (introducing fields `upper_fix` and `lower_fix`):
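As a hypothetical, much-simplified NumPy sketch of this pattern (not the actual ported code): instead of an off-center write into a neighbor, a first pass records in `upper_fix` what the level above should give up, and a second pass lets each level apply only the fixes addressed to it, read with a vertical offset:

```python
import numpy as np

def fill_negative_twopass(q):
    """Hypothetical sketch of the upper_fix pattern: pass 1 records, at each
    negative level, the deficit the level *above* must cover; pass 2 lets
    every level apply only fixes addressed to itself (an offset read,
    replacing the off-center write q[k] -= deficit_below)."""
    nk = q.shape[0]
    q = q.copy()
    upper_fix = np.zeros(nk)  # mass to pull from the level above
    # Pass 1: each negative level records the deficit it wants from above.
    for k in range(1, nk):
        if q[k] < 0.0:
            upper_fix[k] = -q[k]
            q[k] = 0.0
    # Pass 2: each level subtracts the deficit recorded by the level below it.
    for k in range(nk - 1):
        q[k] -= upper_fix[k + 1]
    return q
```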
In the stencil version, it's hard to tell what's going on, and the examples can get more complex. I think that unrestricted off-center writes might create a lot of problems, but this adjacent-vertical-adjustment pattern comes up a few times in the FV3 dycore and physics. I could see restrictions being imposed, such as allowing off-center writes only in the vertical direction and requiring that a computation not be PARALLEL if it performs an off-center write.