CASA-POP interface should only pass information that is needed

ccarouge commented 2 months ago

For the moment, we pass full TYPE structures around and that includes more than what is needed.

Need to review to pass around only the information necessary.

Required for the tile mapping.

ACCESS-NRI

har917 commented 2 months ago

The following set of comments provides a bit more information on this technical task.

The POP and POPLUC sub-models are written assuming that CABLE and CASA will operate on the same land points as POP - i.e. there are only ever (up to) 3 tiles per grid cell. Technically, this assumption manifests in the passing of entire CABLE/CASA TYPES into SUBROUTINEs and the expectation that the land points align in all the variables. It is also inefficient, in that large numbers of variables are passed around for no purpose.

For example, SUBROUTINE LUCdriver( casabiome, casapool, casaflux, POP, LUC_EXPT, POPLUC, veg) passes the entirety of 7 TYPES around, whereas only 7 fields in the CASA and veg TYPES are actually used.
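A minimal sketch of the difference between the two calling styles, in case it helps the discussion. Everything here is hypothetical: pool_type, its fields and the two scale_wood_* routines are illustrative stand-ins, not CABLE/CASA code, and the "science" is a placeholder multiplication.

```fortran
! Sketch only: all names below are hypothetical and not taken from CABLE.
module narrow_interface_sketch
  implicit none
  private
  public :: pool_type, scale_wood_wide, scale_wood_narrow

  ! Stand-in for a large CASA-style TYPE with many fields
  type :: pool_type
    real, allocatable :: cwood(:)   ! the one field the routine needs
    real, allocatable :: cleaf(:)   ! carried along but never used
    real, allocatable :: csoil(:)   ! carried along but never used
  end type pool_type

contains

  ! Wide interface: the whole TYPE is passed, so the routine implicitly
  ! assumes that pool%cwood and frac are defined on the same land points.
  subroutine scale_wood_wide(pool, frac)
    type(pool_type), intent(inout) :: pool
    real,            intent(in)    :: frac(:)
    pool%cwood = pool%cwood * frac
  end subroutine scale_wood_wide

  ! Narrow interface: only the field that is actually used is passed,
  ! so the caller decides which array (of which size) to hand over.
  subroutine scale_wood_narrow(cwood, frac)
    real, intent(inout) :: cwood(:)
    real, intent(in)    :: frac(:)
    cwood = cwood * frac
  end subroutine scale_wood_narrow

end module narrow_interface_sketch
```

With the narrow form the routine no longer depends on how the TYPE is allocated, which is the property the tile mapping needs.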

For ACCESS3 we are likely to want to break this assumption - with CABLE/CASA operating with (up to) 17 (or perhaps 27) tiles per grid cell while POP retains its 3 tiles per grid cell structure. There are two high-level approaches to deal with this, both of which require technical work within those routines that share CABLE/CASA and POP TYPES.

1. Create an additional set of CASA/CABLE TYPES but allocate them with only (up to) 3 tiles per grid cell. Use the relevant (3 or 17/27) instance of the TYPES in the call to each routine, and create new routines to synchronise the two instances of the TYPES.
2. Create (temporary) fields of the correct size for the linking variables only. Create new routines, dealing with the different use cases, to pass information from CABLE/CASA to the temporary fields, and use the temporary variables in the call sequence.

Considering the LUCdriver case again, a broad overview of the current CABLE-POP_TRENDY run sequence is:

USE cable_types, ONLY: veg
USE casa_types, ONLY: casapool, casabiome, casaflux
USE POP_types, ONLY: POP
USE POPLUC_types, ONLY: POPLUC, LUC_EXPT
initialise_cable(mp(ntile=3),veg,...) then initialise_casa(mp(ntile=3),casabiome,casapool,casaflux,...)
initialise_POP(mp(ntile=3),POP) , initialise_POPLUC(mp(ntile=3),POPLUC), initialise_LUC_EXPT(mp(ntile=3),LUC_EXPT)
CALL LUCdriver( casabiome, casapool, casaflux, POP, LUC_EXPT, POPLUC, veg)

Option 1 would look something like

USE cable_types, ONLY: veg3, veg17
USE casa_types, ONLY: casapool3, casabiome3, casaflux3, casapool17, casabiome17, casaflux17
USE POP_types, ONLY: POP
USE POPLUC_types, ONLY: POPLUC, LUC_EXPT
initialise_cable(mp(ntile=3),veg3,...) then initialise_casa(mp(ntile=3),casabiome3,casapool3,casaflux3,...)
initialise_cable(mp(ntile=17),veg17,...) then initialise_casa(mp(ntile=17),casabiome17,casapool17,casaflux17,...)
initialise_POP(mp(ntile=3),POP) , initialise_POPLUC(mp(ntile=3),POPLUC), initialise_LUC_EXPT(mp(ntile=3),LUC_EXPT)

then either for TRENDY

CALL LUCdriver( casabiome3, casapool3, casaflux3, POP, LUC_EXPT, POPLUC, veg3)

or for ACCESS

CALL map17to3(veg17,veg3,casabiome17,casabiome3,casapool17,casapool3,casaflux17,casaflux3)
CALL LUCdriver( casabiome3, casapool3, casaflux3, POP, LUC_EXPT, POPLUC, veg3)
CALL map3to17(veg17,veg3,casabiome17,casabiome3,casapool17,casapool3,casaflux17,casaflux3)

The advantages of this approach are that the code within LUCdriver should remain entirely unchanged and that each TYPE keeps a single definition point.

The disadvantages are i) efficiency/size of memory, ii) different CALLs depending on use case, iii) likelihood that widespread tweaks will be needed because the same TYPE definition is used with different sizes, and iv) technical challenges of passing an additional set of fields through the MPI capability (if this needs to be maintained). There is also a risk that, since we would only ever map variables that are truly needed between the 3 and 17 tile versions, unassigned variables get inadvertently used in future developments (the resulting code will be quite obscure).

Option 2 would look something like

USE cable_types, ONLY: veg
USE casa_types, ONLY: casapool, casabiome, casaflux
USE POP_types, ONLY: POP
USE POPLUC_types, ONLY: POPLUC, LUC_EXPT
REAL, DIMENSION(:), ALLOCATABLE :: temporary_vars
initialise_cable(mp(ntile=x),veg,...) then initialise_casa(mp(ntile=x),casabiome,casapool,casaflux,...)
initialise_POP(mp(ntile=3),POP) , initialise_POPLUC(mp(ntile=3),POPLUC), initialise_LUC_EXPT(mp(ntile=3),LUC_EXPT)
allocate_tempvars(temporary_vars,x)
CALL mapCASAtoPOP(mp(ntile=x),mp(ntile=3),veg,casabiome,casapool,casaflux,temporary_vars)
CALL LUCdriver( POP, LUC_EXPT, POPLUC, 7 temporary vars)
CALL mapPOPtoCASA(mp(ntile=x),mp(ntile=3),veg,casabiome,casapool,casaflux,temporary_vars)

The disadvantages of this approach are that i) the code within the LUCdriver would be changed (backwards compatibility?), ii) the number of mapCASAtoPOP() routines could end up being quite noticeable, and iii) that we could end up having to pass additional arguments through lots of layers of the code (if it turns out that it's important when in the cycle the temporary vars are set).

The advantages of this approach are i) efficiency, ii) identical CALLs across use cases, iii) this can sit on the worker side of the MPI code and iv) clarity of code.

Note that for a TRENDY run, where x=3 by default, the mapCASAtoPOP routines would simply assign the appropriate field from the TYPEs into the temporary_vars without any additional manipulation/model code.
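To make option 2 a little more concrete, here is a rough sketch of what one pair of mapping routines could look like for a single field. This is an assumption-laden illustration, not a proposal for the actual science: the routine names, the pop_index lookup (which POP tile each CASA tile feeds, 0 for none) and the weighted-mean aggregation are all hypothetical, and the right aggregation will depend on the field.

```fortran
! Sketch only: names, arguments and the weighted mean are assumptions.
module casa_pop_map_sketch
  implicit none
  private
  public :: map_casa_to_pop, map_pop_to_casa

contains

  ! Aggregate one CASA-sized field onto the POP tiles.
  ! pop_index(i) = POP tile fed by CASA tile i (values outside 1..npop are skipped)
  ! weight(i)    = hypothetical weight of CASA tile i (e.g. an area fraction)
  subroutine map_casa_to_pop(casa_field, pop_index, weight, pop_field)
    real,    intent(in)  :: casa_field(:)
    integer, intent(in)  :: pop_index(:)
    real,    intent(in)  :: weight(:)
    real,    intent(out) :: pop_field(:)
    real    :: wsum(size(pop_field))
    integer :: i, k

    pop_field = 0.0
    wsum = 0.0
    do i = 1, size(casa_field)
      k = pop_index(i)
      if (k < 1 .or. k > size(pop_field)) cycle
      pop_field(k) = pop_field(k) + weight(i) * casa_field(i)
      wsum(k) = wsum(k) + weight(i)
    end do
    ! Weighted mean; POP tiles with no contributing CASA tile stay at zero
    where (wsum > 0.0) pop_field = pop_field / wsum
  end subroutine map_casa_to_pop

  ! Copy the POP value back to every CASA tile that maps to it;
  ! CASA tiles without a POP tile are left untouched.
  subroutine map_pop_to_casa(pop_field, pop_index, casa_field)
    real,    intent(in)    :: pop_field(:)
    integer, intent(in)    :: pop_index(:)
    real,    intent(inout) :: casa_field(:)
    integer :: i, k

    do i = 1, size(casa_field)
      k = pop_index(i)
      if (k >= 1 .and. k <= size(pop_field)) casa_field(i) = pop_field(k)
    end do
  end subroutine map_pop_to_casa

end module casa_pop_map_sketch
```

In the TRENDY case (x=3) pop_index would simply be (/1, 2, 3/) with unit weights, so both routines reduce to a straight copy of the field.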

har917 commented 2 months ago

Assuming that we aim for option 2 above, we would look to ALLOCATE the TYPES (in ACCESS at least) according to the primary model dimensions, i.e.

CABLE with (up to) 17/27 tiles per grid cell - met%, air%, rough%, canopy%, ssnow% and climate%
CASA with (up to) 17/27 tiles per grid cell - casapool%, casaflux%, sum_flux%, casabiome%, phen%, casabal% etc.
POP with up to 3 tiles per grid cell - POP%, POPLUC%, LUC_EXPT%
BLAZE on the grid cell - BLAZE%, SIMFIRE%

The offline initialisation code could be awkward since the TYPEs are interwoven (see load_params)

BLAZE science code will be awkward since the TYPEs are interwoven - especially adjust_pop_for_fire()

CASAONLY_LUC and spincasacnp will need CASAtoPOP interface routines included within them.

har917 commented 2 months ago

There are 6 (7) subroutines [outside of BLAZE] for which option 2 would require the creation/assignment of temporary_vars:

biogeochem() - look to remove POP from the argument list - may require additional science code to evaluate the mortality terms in cable_driver/cable_mpidrv
bgcdriver() - follows biogeochem()
POPio() - look to remove casamet
POPdriver() - look to remove casaflux and casabal
LUCdriver() - look to remove casabiome, casapool, casaflux and veg
POP_LUC_CASA_transfer() - look to remove casapool, casabal, casaflux - this involves at least 24 CASA fields
POP_LUC_init() - look to remove casapool, casaflux, casabiome and veg

CABLE-LSM / CABLE

CASA-POP interface should only pass information that is needed #382