stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License

Global reductions need to work with the correct MPI communicator #2127

Open mike-hobson opened 1 year ago

mike-hobson commented 1 year ago

In the future, LFRic will need to run multiple simultaneous model instances (e.g. ensembles) within a single LFRic executable. To do this, each ensemble member will have to run on its own set of processors, and so will have its own MPI communicator. That means that when PSyclone adds code for global reductions, it will have to know which MPI communicator to perform the reduction over.

At the moment, PSyclone generates something like the following when a global sum built-in is invoked:

TYPE(scalar_type) global_sum
...
sum_val = 0.0_r_def
DO df=loop1_start,loop1_stop
  sum_val = sum_val + field_proxy%data(df)
END DO
global_sum%value = sum_val
sum_val = global_sum%get_sum()

In the future, however, the generated code will have to query the field for the communicator it is built on, and then call the global sum routine with that communicator, so the code will become:

TYPE(scalar_type) global_sum
TYPE(mpi_type) mpi
...
mpi = field_proxy%get_mpi()
sum_val = 0.0_r_def
DO df=loop1_start,loop1_stop
  sum_val = sum_val + field_proxy%data(df)
END DO
call global_sum%initialise(sum_val, mpi)
sum_val = global_sum%get_sum()

The LFRic infrastructure for the new global sum code, including querying the field for its communicator, is already on trunk. The infrastructure currently supports both examples above, but it would be good to future-proof the generated code for running with multiple communicators by moving to the form of the second example.
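To illustrate why the communicator matters, here is a minimal sketch in plain MPI. It does not use the LFRic infrastructure or PSyclone-generated code: the program name, the two-member split and the variable names are all assumptions made for illustration. The world communicator is split into one communicator per ensemble member, and the sum is then reduced over the member's own communicator rather than over every rank in the executable.

```fortran
program ensemble_sum_sketch
  ! Illustrative only: plain MPI, not the LFRic mpi_type or scalar_type.
  use mpi
  implicit none

  integer :: ierr, world_rank, member, member_comm
  double precision :: local_val, member_sum

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, ierr)

  ! Hypothetical layout: two ensemble members, ranks assigned alternately.
  member = mod(world_rank, 2)

  ! Ranks that pass the same colour (member) end up in the same communicator.
  call MPI_Comm_split(MPI_COMM_WORLD, member, world_rank, member_comm, ierr)

  ! Stand-in for the locally accumulated sum_val.
  local_val = dble(world_rank + 1)

  ! Reduce over the member's communicator only. Using MPI_COMM_WORLD here
  ! would mix contributions from different ensemble members.
  call MPI_Allreduce(local_val, member_sum, 1, MPI_DOUBLE_PRECISION, &
                     MPI_SUM, member_comm, ierr)

  print '(a,i0,a,i0,a,f0.1)', 'world rank ', world_rank, &
        ' member ', member, ' sum ', member_sum

  call MPI_Comm_free(member_comm, ierr)
  call MPI_Finalize(ierr)
end program ensemble_sum_sketch
```

If this is built with an MPI compiler wrapper and run on four ranks, ranks 0 and 2 should each report a sum of 4.0 and ranks 1 and 3 a sum of 6.0, whereas a reduction over MPI_COMM_WORLD would give 10.0 on every rank.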

rupertford commented 1 year ago

Thanks @mike-hobson, could you advise how mpi_type is included? I'm assuming it is an LFRic type?

mike-hobson commented 1 year ago

Sorry about that. Not sure how I managed to leave off an important piece of information like that. Yes, it is an LFRic type and needs to be included with:

use mpi_mod, only: mpi_type