Add flexibility on predictor and target scenarios

chrisdwells commented 2 years ago

It would be useful to separate out the input scenarios used to train MESMER from those targeted for emulation in the default version. This would allow users to test the effect of varying the training dataset to target a given scenario, including targeting a scenario not contained in the training dataset. This could be done by using separate lists in the configuration file.

mathause commented 2 years ago

I am slightly confused - is it not possible to already configure the scenarios used to train mesmer?

chrisdwells commented 2 years ago

Sorry, just to be clear: in the config file the list "scenarios" is defined. The example script then calculates the parameters using all the scenarios in that list together, and the emulations are created separately for each scenario, using those parameters produced from all the scenarios together.

So there isn't flexibility on which scenarios are used to generate and use the pattern - the parameters are generated using all the scenarios in the list, and then used to project each scenario individually. This might be a good default option - using all the input data to emulate each scenario in the input - but if a user wants to just project one scenario, or to test going out-of-sample - i.e. building the parameters using one scenario or set of scenarios and using these to project others outside the predictor set - the script has to be adapted. Unless I'm mistaken and there are other example scripts to use?
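To make the in-sample versus out-of-sample distinction concrete, here is a minimal sketch. The names `train_scenarios`, `emulation_scenarios` and `split_in_and_out_of_sample`, as well as the scenario labels, are hypothetical and are not part of MESMER's config or API; the sketch only shows how two separate lists would classify each emulated scenario.

```python
# Hypothetical config entries (illustrative names, not MESMER's actual config).
train_scenarios = ["ssp126", "ssp585"]                # used to calibrate the parameters
emulation_scenarios = ["ssp126", "ssp245", "ssp585"]  # scenarios to emulate


def split_in_and_out_of_sample(train, emulate):
    """Classify each scenario to emulate as in-sample or out-of-sample."""
    train_set = set(train)
    in_sample = [s for s in emulate if s in train_set]
    out_of_sample = [s for s in emulate if s not in train_set]
    return in_sample, out_of_sample


in_sample, out_of_sample = split_in_and_out_of_sample(
    train_scenarios, emulation_scenarios
)
print("in-sample:", in_sample)
print("out-of-sample:", out_of_sample)
```

With a single "scenarios" list, every emulated scenario is in-sample by construction; the second list is what makes the out-of-sample case possible.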

mathause commented 2 years ago

Thanks for the clarification - it shows that I am not actually a user of mesmer :upside_down_face: This change does seem to make sense, it may be part of the rework of the config setup (#34)

leabeusch commented 2 years ago

Hi both, I'm quickly jumping in because I've just realized I actually do have something to contribute here: some very me-specific code again (sorry about that ^^'), but maybe it's helpful to you @chrisdwells nevertheless (& to you @mathause once you start facing the config setup). In this jupyter notebook, which is part of the code I used for the GMD paper: https://github.com/MESMER-group/Beusch_et_al_GMD_2021_MAGICC-MESMER_coupling/blob/main/scripts/6_MESMER_verification_fig3_fig4_supplement.ipynb I do exactly that, once in the cell with execution number 12 (for the default configuration of MESMER) and later in cell 15 (for the additional predictors configuration of MESMER), both under the heading "Figure S6 + S7".

I.e., I create different subsets of predictors containing different numbers of scenarios in the training data to derive predictor-subset-specific MESMER local forced response parameters (still called local trends (lt) parameters in the code, a remnant from their original pre-GMD paper names). I later combine these with ESM-specific forced global temperature change time series of all scenarios, to see how the different subsets of predictors perform in “in-sample” and “out-of-sample” scenarios (see supporting Figs. S6 & S7 & the corresponding text in the main manuscript in the GMD paper https://doi.org/10.5194/gmd-15-2085-2022). Of course, emulations can only be created if suitable associated global forced temperature change time series exist, & evaluated only if suitable spatially-resolved climate model realizations also exist.

Not sure how much of this needs to be directly in the config file? It probably needs at least a list somewhere about which scenarios would theoretically be possible? Should there then be a separate config file for each one of the scenario combinations I looked at in that notebook? That could quickly explode though, right? ^^'
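As a rough illustration of how quickly the number of combinations grows, the sketch below enumerates every non-empty subset of a scenario list. The helper `predictor_subsets` and the scenario labels are hypothetical and not taken from the notebook or from MESMER; the point is only that n available scenarios give 2**n - 1 possible training subsets, which argues for generating them in a script rather than writing one config file each.

```python
from itertools import combinations


def predictor_subsets(scenarios):
    """Yield every non-empty subset of `scenarios`, smallest subsets first."""
    for size in range(1, len(scenarios) + 1):
        yield from combinations(scenarios, size)


# Illustrative scenario labels (assumption, not read from any config file).
available = ["ssp119", "ssp126", "ssp245", "ssp370", "ssp585"]

subsets = list(predictor_subsets(available))
n_subsets = len(subsets)  # equals 2**len(available) - 1
print(n_subsets, "possible training subsets")
```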

PS: Nice to see you in a MESMER-group repository too @chrisdwells! :)

chrisdwells commented 2 years ago

Hi both,

Thanks Lea! Yes, I thought it must have been done for the MAGICC coupling to generate the analysis in that paper, so good shout to start with that. The way I approached it was to have separate "targ_scens" and "pred_scens" lists in the config file, with the pattern trained on all the scenarios in targ_scens and projecting each of the pred_scens separately. So you can just define each separately for the setup you want. For loading in gttas I just combined both lists (and removed duplicates) within the script itself since both are needed. But I don't know what "best practice" might look like!
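The "combine both lists and remove duplicates" step could look something like the sketch below. The list contents are made up for illustration, and `all_scens` is a hypothetical name; only `targ_scens` and `pred_scens` come from the description above. Using `dict.fromkeys` rather than `set` keeps the scenarios in a stable, first-seen order, which makes the loading step reproducible.

```python
# Illustrative list contents (assumption); only the two list names are from the thread.
pred_scens = ["ssp126", "ssp585"]
targ_scens = ["ssp126", "ssp245", "ssp585"]

# Merge both lists and drop duplicates while preserving first-seen order.
all_scens = list(dict.fromkeys(pred_scens + targ_scens))
print(all_scens)
```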

PS: Nice to be here @leabeusch ! Still finding my way around github though...

MESMER-group / mesmer

Add flexibility on predictor and target scenarios #174