E3SM-Project / zppy

E3SM post-processing toolchain

Add BGC global mean analysis #365

Open forsyth2 opened 1 year ago

forsyth2 commented 1 year ago

Add BGC global mean analysis.

See BGC requests listed on https://acme-climate.atlassian.net/wiki/spaces/EIDMG/pages/3476979729/zppy+New+Feature+Requests.

This will support #151.

dmricciuto commented 1 year ago

Instructions for the utility I have been using for BGC land simulations are posted in a comment at the bottom of this BGC meeting notes page, in case they are helpful for your development: https://acme-climate.atlassian.net/wiki/spaces/EBGC/pages/3574006104/2022-11-21+11-28+12-5+Land+BGC+Spinup+Meeting+notes

forsyth2 commented 1 year ago

@BunnyVon @chengzhuzhang This issue was created in response to https://acme-climate.atlassian.net/browse/EB-180, which mentions putting together "a list of land variables to be included in zppy global mean time series". I'm interpreting that to mean adding the variables listed on https://acme-climate.atlassian.net/wiki/spaces/EIDMG/pages/3476979729/zppy+New+Feature+Requests. That is, the following:

Is that correct? That would entail adding all of the above as possible options in global_time_series (https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/default.ini#L279), each following the instructions at https://e3sm-project.github.io/zppy/_build/html/main/dev_guide/new_glb_plot.html.

I know we've had some discussions about area_mean_time_series (part of E3SM Diags) versus global_time_series (directly part of zppy) -- e.g., #312. I want to say it would be more straightforward adding all those options to the area_mean_time_series, but I'm unsure exactly what is needed.

chengzhuzhang commented 1 year ago

@forsyth2 My take is that this task is to generate global time series figures as overview plots. And there was a requirement to archive the intermediate files that create these figures. Therefore, building this into the zppy global_time_series seems to be a better fit.

We also talked about a BGC-specific .cfg, so as to specify the variables in the configuration file rather than in default.ini. Based on the instructions for adding new plots, doing so requires changes to the base code, which means that the same coupled_global.py script could break for other runs? I think a more flexible way is to have a new parameter that can be defined by users for variables to be extracted for global time series plot? Looking at the instructions, PLOT_DICT could be built dynamically from the variables, so that the same coupled_global.py script can be used for different campaigns? Alternatively, it might be more straightforward to have a bgc_global.py script to begin with...
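A minimal sketch of the dynamic approach suggested above, assuming every requested variable can share one generic plotting routine. The names `make_plot_function` and `build_plot_dict` are hypothetical and are not part of coupled_global.py; the stand-in routine only returns a summary string instead of drawing a figure.

```python
from typing import Callable, Dict, List

# Hypothetical type: maps a variable name to its annual global means.
AnnualData = Dict[str, List[float]]


def make_plot_function(var_name: str) -> Callable[[AnnualData], str]:
    """Return a stand-in plotting routine bound to one variable name."""

    def plot(annual_data: AnnualData) -> str:
        values = annual_data[var_name]
        # A real routine would draw a figure here; this one only summarizes.
        return f"{var_name}: {len(values)} annual values"

    return plot


def build_plot_dict(variables: List[str]) -> Dict[str, Callable[[AnnualData], str]]:
    """Build the name-to-function mapping from a user-supplied variable list."""
    return {name: make_plot_function(name) for name in variables}


# Variables as a user might list them in a campaign-specific .cfg (assumed format).
requested = [v.strip() for v in "TOTSOMC, TOTECOSYSC, NEE, TOTVEGC".split(",")]
plot_dict = build_plot_dict(requested)

annual = {"NEE": [0.1, 0.2, 0.3]}  # made-up values, for illustration only
summary = plot_dict["NEE"](annual)
```

With this shape, adding a variable would only require changing the configuration, not the script, provided the generic routine is adequate for that variable.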

forsyth2 commented 1 year ago

And there was a requirement to archive the intermediate files that create these figures.

Ah, yes, that's correct. We wanted to include that.

We also talked about a BGC-specific .cfg, so as to specify the variables in the configuration file rather than in default.ini.

That will be #151, which will create a campaign file (as water_cycle and cryosphere already have). That is, #151 will create a list of defaults for Land/BGC to use. This issue (#365) is specifically to have the capability to generate the necessary plots.

I think a more flexible way is to have a new parameter that can be defined by users for variables to be extracted for global time series plot?

Yes, that is what is currently possible, with our limited number of available plots:

plot_names = string(default="net_toa_flux_restom,global_surface_air_temperature,toa_radiation,net_atm_energy_imbalance,change_ohc,max_moc,change_sea_level,net_atm_water_imbalance")

(https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/default.ini#L279)

The problem is that each plot has its own function in coupled_global.py (the functions are what PLOT_DICT references). I can make an issue to try to parse out common code between the functions, but ultimately a number of parameters would need to be included for every single plot (e.g., plot name, units) somehow. Options:

  1. Add a number of parameters to default.ini. Maybe something like parallel lists:
    plot_names = "plot_1_name", "plot_2_name", ...
    plot_units = "plot_1_units", "plot_2_units", ...
    ...

    Keep in mind, if we include all the "Production" variables above, that would be ~57 plots.

  2. Have all the necessary features of each plot hard-coded somewhere. This is currently being done with a separate function for each plot. As mentioned above, it may be possible to have one function taking a number of variables from a hard-coded dictionary.
  3. Somehow generate all the necessary info to put on the plot from variable name alone. I suppose E3SM Diags does that, or no? In global_time_series, all the plot units for instance are hard-coded in https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/coupled_global.py
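Option 1 could be sketched roughly as follows: the parallel lists are split, checked for equal length, and zipped into one record per plot. `PlotSpec`, `split_option`, and `zip_parallel_options` are hypothetical names, and the comma-separated string format is an assumption modeled on the existing plot_names option.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlotSpec:
    """One plot's settings, gathered from the parallel option lists."""

    name: str
    units: str


def split_option(raw: str) -> List[str]:
    """Split a comma-separated option string, dropping surrounding whitespace."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def zip_parallel_options(plot_names: str, plot_units: str) -> List[PlotSpec]:
    """Pair the i-th name with the i-th units entry."""
    names = split_option(plot_names)
    units = split_option(plot_units)
    if len(names) != len(units):
        raise ValueError(
            f"plot_names has {len(names)} entries but plot_units has {len(units)}"
        )
    return [PlotSpec(name=n, units=u) for n, u in zip(names, units)]


specs = zip_parallel_options(
    "plot_1_name, plot_2_name",
    "plot_1_units, plot_2_units",
)
```

The length check only catches a missing or extra entry. It cannot detect two entries swapped within a list, which becomes a real risk once each list holds ~57 items.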

more straightforward to have a bgc_global.py script to begin with

I still don't think that's the right approach, since it would mean maintaining two scripts that are ultimately very similar in their goals.

BunnyVon commented 1 year ago

@forsyth2, it is my original list. I'm adding @susburrows @beharrop @dmricciuto @acme-y9s @jenniferholm, who play important roles in either the BGC v1 or v2 simulation campaigns, to see if they have any input re: global mean time series.

  • Spin-up

    • Atmosphere (TS, FSNT, FLNT, RESTOM, CO2, CO2_FFF, CO2_OCN, CO2_LND, TMCO2, TMCO2_FFF, TMCO2_LND, TMCO2_OCN)
    • Land (TOTSOMC, TOTECOSYSC, NEE, TOTVEGC)
  • Production

    • Atmosphere (AODVIS, FLNS, FSNTOA, PRECSL, TGCLDLWP, TS, CCN3, FLNT, LINOZ_O3COL, QFLX, TMCO2, U10, CLDHGH, FLUT, LWCF, RESTOM, TMCO2, TMCO2_FFF, U850, CLDLOW, FSNS, PRECC, RHREFHT, TMCO2_LND, CLDMED, FSNT, PRECL, SHFLX, TMCO2_OCN, CO2, CO2_FFF, CO2_OCN, CO2_LND, FSNTC, PRECSC, SWCF, TREFHT)
    • Land (NBP, PCO2, GPP, NPP, NEE, TOTECOSYSC, TOTVEGC, TOTSOMC, TOTSOMN, TOTSOMP, TOTLITC, CPOOL, FAREA_BURNED, NFIRES, FPI, FP_UPTAKE, PLANT_NDEMAND_COL, PLANT_PDEMAND_COL, SMINN_TO_PLANT, SMINP_TO_PLANT)

forsyth2 commented 1 year ago

it may be possible to have one function taking a number of variables from a hard-coded dictionary.

I started working on this in #389. I think this refactoring is a good idea since it will make it easier to add new plots.

That said, the existing 8 plots differ substantially, which forces a lot of customization in the common function. As mentioned earlier, each plot needs a number of parameters associated with it. For example, the (incomplete) parameter dictionary for the first plot is:

    # `np` is numpy; `exp` is the experiment data already loaded by the script.
    param_dict = {
        "check_exp_ocean": False,
        "check_exp_year": True,
        "default_ylim": [-1.5, 1.5],
        "do_add_line": True,
        "do_add_trend": True,
        "lw": 1.0,
        "set_axhline": True,
        "set_legend": True,
        "shorten_year": False,
        "title": "Net TOA flux (restom)",
        "var": np.array(exp["annual"]["RESTOM"]),
        "ylabel": "W m-2",
    }

That's 11 additional pieces of information beyond just the variable. That is, all this information would need to be specified for the new 50+ variables for land; however, if they all have exactly the same plot structure, we may be able to streamline that a bit.
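If the new plots do share one structure, the streamlining could look something like the sketch below: shared defaults are defined once and each variable supplies only what differs. `DEFAULT_PARAMS` and `build_params` are hypothetical names. The keys mirror the example dictionary above (minus `var`, which is the data itself), while the default values and the units string are placeholders, not settings taken from zppy.

```python
import copy
from typing import Any, Dict, Optional

# Shared settings, assuming all new plots use the same layout.
# Values here are placeholders chosen for illustration.
DEFAULT_PARAMS: Dict[str, Any] = {
    "check_exp_ocean": False,
    "check_exp_year": True,
    "default_ylim": None,
    "do_add_line": True,
    "do_add_trend": True,
    "lw": 1.0,
    "set_axhline": False,
    "set_legend": True,
    "shorten_year": False,
    "title": None,
    "ylabel": "",
}


def build_params(
    var_name: str, overrides: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Merge per-variable overrides into a copy of the shared defaults."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        # Reject misspelled keys instead of silently ignoring them.
        raise KeyError(f"Unknown plot parameter(s): {sorted(unknown)}")
    params = copy.deepcopy(DEFAULT_PARAMS)
    params["title"] = var_name  # fall back to the variable name as the title
    params.update(overrides)
    return params


# Only the units differ from the defaults for this (illustrative) variable.
nee_params = build_params("NEE", {"ylabel": "placeholder units"})
```

Under this assumption, each additional variable would need one short override entry rather than all 11 settings.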