ammarhakim / gkylzero

Lowest, compiled layer of Gkeyll infrastructure.
MIT License

[DR] Hierarchical Communicator Design #416

Closed ammarhakim closed 1 week ago

ammarhakim commented 4 months ago

Prototype/Production Branches

Please see the branch multib-app for work on communicators. This is the same branch in which the multi-block work is happening, as the two issues are connected. I consider this to be a production branch.

Introduction

Communicators need to be arranged hierarchically to reflect the hierarchy in the App system:

Hence, a refactoring of the existing comm system is needed to support all these various scenarios.

Principles

Proposed Design

I propose that we add a new set of methods that work on inter-block communications. The design needs some thought: should this be a new set of objects built on top of the existing comm objects, or should the methods just be added to the present comm object? The latter has the advantage that communicators are already used extensively everywhere in the code, so nothing new will be needed. The disadvantage is that the interfaces (IFCs) become pretty fat.

We should aim for hierarchical methods. For example, an all-reduce on a topology-aware communicator should call the block-level all-reduce and then do whatever is needed to complete the all-reduce across the topology. This frees the user from worrying about how to do these operations themselves.
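To make the two-stage idea concrete, here is a minimal serial sketch. It is not the gkylzero implementation: the names block_ranks, block_allreduce_sum and hier_allreduce_sum are hypothetical, and the per-rank data is simulated with plain arrays rather than real MPI/NCCL communicators.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for one block: the local contributions of the
// ranks that own a piece of it (simulated serially, no MPI involved).
struct block_ranks {
  size_t nranks;      // number of ranks in this block
  const double *vals; // one local value per rank
};

// Stage 1: sum over the ranks inside a single block.
double
block_allreduce_sum(const struct block_ranks *blk)
{
  assert(blk != NULL);
  double sum = 0.0;
  for (size_t r = 0; r < blk->nranks; ++r)
    sum += blk->vals[r];
  return sum;
}

// Stage 2: combine the per-block results across all blocks. In a real
// code every rank in every block would end up holding this value.
double
hier_allreduce_sum(size_t nblocks, const struct block_ranks *blocks)
{
  assert(nblocks > 0 && blocks != NULL);
  double total = 0.0;
  for (size_t b = 0; b < nblocks; ++b)
    total += block_allreduce_sum(&blocks[b]); // block-level reduce first
  return total;
}
```

The caller only ever invokes hier_allreduce_sum; the block-level step is an internal detail, which is the point of making the method hierarchical.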

One implementation option is to introduce more structure into the gkyl_comm structure: perhaps we put the function pointers in nested structs for cleaner code, though this is only a cosmetic thing (perhaps not).
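A rough sketch of what such nesting could look like is below. All names here (comm_sketch, intra_block_ops, inter_block_ops, comm_sketch_serial_new) are hypothetical and chosen only for illustration; the serial implementations simply copy input to output, as a single-rank, single-block communicator would.

```c
#include <assert.h>
#include <stddef.h>

struct comm_sketch; // forward declaration

// Hypothetical group: operations among the ranks of one block
struct intra_block_ops {
  int (*allreduce_sum)(struct comm_sketch *comm, int nelem,
    const double *in, double *out);
};

// Hypothetical group: operations across blocks
struct inter_block_ops {
  int (*allreduce_sum)(struct comm_sketch *comm, int nelem,
    const double *in, double *out);
};

struct comm_sketch {
  struct intra_block_ops block;  // called as comm->block.allreduce_sum(...)
  struct inter_block_ops multib; // called as comm->multib.allreduce_sum(...)
};

// Serial block-level all-reduce: with one rank the sum is just a copy.
int
serial_block_allreduce_sum(struct comm_sketch *comm, int nelem,
  const double *in, double *out)
{
  (void) comm;
  for (int i = 0; i < nelem; ++i) out[i] = in[i];
  return 0;
}

// Serial inter-block all-reduce: do the block-level reduce first; with a
// single block there is nothing further to combine.
int
serial_multib_allreduce_sum(struct comm_sketch *comm, int nelem,
  const double *in, double *out)
{
  assert(comm != NULL);
  return comm->block.allreduce_sum(comm, nelem, in, out);
}

struct comm_sketch
comm_sketch_serial_new(void)
{
  struct comm_sketch comm = {
    .block = { .allreduce_sum = serial_block_allreduce_sum },
    .multib = { .allreduce_sum = serial_multib_allreduce_sum },
  };
  return comm;
}
```

The nesting changes only how call sites read (comm->block.* versus comm->multib.*); the dispatch mechanism is the same as with a flat list of function pointers.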

ammarhakim commented 3 months ago

I have performed some major surgery to the comm interfaces. This cleans up the code significantly, removing all internal details to private headers not accessible to a user. This is achieved by an additional level of indirection. The gkyl_comm object now simply looks like:

struct gkyl_comm {
  char id[128]; // string ID for communicator
  bool has_decomp; // flag to indicate if comm has an associated decomp
  const struct gkyl_rect_decomp *decomp; // decomp, if it exists, or NULL

  struct gkyl_ref_count ref_count; // reference count
};

No function pointers are in this struct. Instead, a new gkyl_comm_priv struct holds them. As the name indicates, this struct is private and not visible to the user. This greatly cleans up the main gkyl_comm.h header, as none of the nasty function typedefs are leaked to the user now. (We need to do this across the code, I believe.)
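For readers unfamiliar with this kind of indirection, one common way to get it in C is to embed the public struct inside the private one and recover the private struct with a container-of macro. The sketch below illustrates that general pattern only; it is an assumption about the approach, not a copy of the actual gkyl_comm_priv layout, and every name in it (sketch_comm, sketch_comm_priv, CONTAINER_OF, sketch_null_comm_new, ...) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

// Public view: what a user-facing header would expose. No function pointers.
struct sketch_comm {
  char id[128]; // string ID for communicator
};

// Private view: would live in a private header. It embeds the public
// struct and holds the function pointers.
struct sketch_comm_priv {
  struct sketch_comm pub; // public part handed out to users
  int (*get_rank)(struct sketch_comm *comm, int *rank);
};

// Recover the enclosing struct from a pointer to one of its members.
#define CONTAINER_OF(ptr, type, member) \
  ((type *)((char *)(ptr) - offsetof(type, member)))

// Implementation for a serial ("null") communicator: always rank 0.
static int
null_get_rank(struct sketch_comm *comm, int *rank)
{
  (void) comm;
  *rank = 0;
  return 0;
}

// Constructor returns only the public part; returns NULL on failure.
struct sketch_comm *
sketch_null_comm_new(void)
{
  struct sketch_comm_priv *priv = calloc(1, sizeof *priv);
  if (priv == NULL) return NULL;
  snprintf(priv->pub.id, sizeof priv->pub.id, "null_comm");
  priv->get_rank = null_get_rank;
  return &priv->pub;
}

// Public API function: dispatches through the hidden function pointer.
int
sketch_comm_get_rank(struct sketch_comm *comm, int *rank)
{
  assert(comm != NULL && rank != NULL);
  struct sketch_comm_priv *priv =
    CONTAINER_OF(comm, struct sketch_comm_priv, pub);
  return priv->get_rank(comm, rank);
}

// Free the whole private object given the public pointer.
void
sketch_comm_release(struct sketch_comm *comm)
{
  if (comm != NULL)
    free(CONTAINER_OF(comm, struct sketch_comm_priv, pub));
}
```

User code only ever sees struct sketch_comm and the sketch_comm_* functions, so the function-pointer typedefs never need to appear in the public header.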

ammarhakim commented 1 week ago

This DR is completed and can be closed. Some more work remains in inter-block comm but the work that was proposed here is completed.