bartgol commented 4 years ago

This issue is to track stuff in the AD related to grid remapping that needs to be fixed or completely re-designed.

As of now (10/29/20), I can think of a few things that need to be addressed/improved/fixed:

[ ] In AtmosphereProcessGroup (APG), do not build the input remapper for the first Atmosphere Process (AP) nor the output remapper for the last AP. Whomever is encompassing the APG (be that the AD or another APG) will take care of creating those remappers. Right now, we remap in/out fields before/after calling each individual AP. This is ok (meaning not incorrect), but might be inefficient. Say we have the group (A,B) subcycled twice, meaning the exec flow is (A->B)->(A->B) (i.e., a group G=(G1,G2), with G1=G2=(A,B)). If A,B are on the same grid, but not on the ref grid, we don't want to remap the output of B inside G1 (it'd be a waste of a remap).
[ ] related to the one above, in an atm group, we should not remap back to the ref grid if all procs in the group are on the same grid. If we have G=(A,B), with both A and B not on ref grid, we can wait to remap A's output to the ref grid until after B. That would allow us to have a single remapper, which might allow to amortize some MPI calls. NOTE: right now, the only non-ref grid is Dynamics, making this point a bit moot. But as soon as we implement parallel splitting, we might have many/all processes not running on the ref grid (which is the grid the coupler uses), because of dofs redistribution. Say we do Dynamics in parallel with G=(P3,SHOC); then none of these procs is on the ref grid (unless we bless the phys grid on half of the atm procs as "ref grid, but that might cause issues with the component coupler).
[x] we currently have a problem if we can build a FieldInitializer for some field not on the ref grid. Say Dynamics inits field F on the dyn grid, but the ref grid is the phys grid. the real input to the atm dag is F on the phys grid, and nobody initializes it, causing a dag error. Worse: even if we bypass the dag, after Dyn inits F on dyn grid, the input remapper for Dyn will overwrite the value of F on dyn grid (copying from F on phys grid). Possible solutions:
- if a FieldInitializer inits field F on a grid that is not the ref grid, force it to init also the copy of F on the ref grid. This works so long as the remapping is invertible (such as pd remapping). For some interpolation remapping, this won't work ( R(Rinv(x))!=x ).
- if we implemented the first item above, we would have the AD building an input remapper for the full APG (the whole atm). We could tell the AD to not run this remapper on the 1st timestep, assuming some flag is set (FieldInitializer is meant to help with standalone tests, were we don't get initialization from I/O or the coupler; CIME runs would not need any of this).
[ ] The remapping is currently handled inside APG. Given the need to create required/computed fields ids before we get the actual field (which are allocated after), we had to split the registration and binding phase of fields in the remapper (i.e., you can register just the FieldIdentifier, and bind the actual field later). If we implement some of the ideas above, I think there is a chance we can simplify how we create/setup the remappers. In particular, I think we might be able to delay output remapping.

AaronDonahue commented 4 years ago

@jeff-cohere ^^

jeff-cohere commented 4 years ago

I think I understand the first couple of issues. How "automatic" is the remapping process? If it's something that is currently being done for us "behind the scenes," maybe there's a case to be made for just providing remapping tools and having an atm model define its remapping process explicitly in terms of a given (parameterized, perhaps) set of subcycling strategies. Given the set of rules/constraints we're accumulating, I think maybe "constructing" a remapping process with these tools might be more clear than beginning with a "default" remapping process and removing things from it.

Does that make any sense? I haven't looked at code, so maybe I'm just blowing hot air.

jeff-cohere commented 3 years ago

Regarding "packing/unpacking" (to support, e.g. tracers to be advected by dynamics): it seems like we might need a PackedField type of rank 2 to store fields that we want to combine. Such a packed field consists of any set of fields (identified by their "group", specified in the appropriateFieldTracking objects) that have been registered to be packed into a PackedField with a specific field identifier. One might use a method like this one to register all fields in a specific group to be packed thus:

   // Specify that all source fields belonging to the group identified by the
   // given case-insensitive string be packed into a "packed" field (of rank 2)
   // with the given target identifier. In the inverse mapping, these fields will
   // be unpacked from the source field into their target fields in the same
   // order in which they were packed.
   void register_packing(const ci_string& group_name,
                         const identifier_type& packed_tgt);

Then, of course we'd need to add a PackedField type, and add interface methods for retrieving such packed fields to the AbstractRemapper class. Does this sound correct? I don't see how we can avoid defining a rank 2 field type for this kind of packing.

One might worry about using the same identifier_type to refer to Field objects and PackedField objects, which raises the question: how far do we want to go to support a higher-rank field type? I can see this satisfying a few special cases, so personally I don't think we should get too carried away with recreating all of the field constructions for packed fields.

bartgol commented 3 years ago

You would still need a call to register_field (and possibly bind_field) after that, though, right? I think what you suggest seems good. We need to find a way to enforce that all packing info is registered before any field is (I guess a simple assertion inside register_packing to check that no fields have been register yet, would do).

As for generality, I think we only have one use case in mind, so let's make this work for that case only, for now. However, thinking about it, the tracers might have a rank-3 packing (if I understand correctly your language). I say "might", cause I don't know what physics need. Homme stores qdp (tracers mass) with time slices, while q (tracers concentration) is "diagnostic". So their layouts are

qdp: (num_elems, num_q_time_levels, num_tracers, num_gp, num_gp, num_lev)
q  : (num_elems,                    num_tracers, num_gp, num_gp, num_lev)

Depending on what physics want, we might need to implement different things. If physics need qdp, we might have to do something that does packing in addition to time slice selection (which is what dynamics states already have to do).

I'm assuming physics only uses q, and can compute qdp if need be, but I'm not sure. @AaronDonahue ?

AaronDonahue commented 3 years ago

@bartgol , yes, you are correct. Currently physics only uses q. qdp is fully a dynamics variable. I think we can take advantage of this and keep things as simple up front. If, for some reason, someone decides they want physics to use qdp for something then that would likely be a mini-project of its own and they can cross that bridge then.

bartgol commented 3 years ago

@AaronDonahue great! I think I registered qdp and not q in the FM in HommeDynamics. q is twice as small, so that would save some device memory going forward.

@jeff-cohere Given what Aaron said, I think your plan sounds good.

PeterCaldwell commented 3 years ago

@bartgol - I think this issue may be made obsolete by the decision that the AD only handles physics data and individual procs (like dynamics) are responsible for remapping if they want to? Is this true? If so, can we close this?

bartgol commented 3 years ago

@PeterCaldwell you catch me off guard, I don't think we decided in that direction. It was my understanding that we still want the remapping to be handled by the AD infrastructure. Yes, as of now the only "real" remapping is needed by dynamics. However, when we will add the parallel splitting of processes, things will get a bit more complicated. I believe that the remap infrastructure will come handy then. With remapping handled by the AD, each atm process does not need to handle both sequential/parallel splitting cases.

That said, we can of course always decide to go the route of "everything is on the physics grid, and if you need something else, you handle the remapping". That would have important consequences, which may simplify the code the AD code, but might make the writing of parametrization interfaces a bit more complicated..

PeterCaldwell commented 3 years ago

@bartgol - huh, your being off guard by me suggesting procs handle their remapping catches me off guard. Let's talk about this during tomorrow's AD telecon. I'm fine with keeping remapping within the AD, I just thought we decided to do the opposite.

PeterCaldwell commented 3 years ago

This issue is related to #159 #780... maybe we don't need all of these issues?

bartgol commented 3 years ago

I guess this was more like a list of things to do along with some big picture discussion, while each issue you linked focused on a particular one. I'm ok closing this, in favor of targeted issues. If we need some big picture discussion, we can create a page on confluence maybe (or use the (soon to come) discussions feature of github).

bartgol commented 3 years ago

942 made this issue completely irrelevant now.

E3SM-Project / scream

Grid remapping outstanding issues #709

942 made this issue completely irrelevant now.