NCAR / CUPiD

CUPiD is a “one stop shop” that enables and integrates timeseries file generation, data standardization, diagnostics, and metrics from all CESM components.
https://ncar.github.io/CUPiD/
Apache License 2.0
25 stars 24 forks source link

Key metrics #118

Closed mnlevy1981 closed 4 months ago

mnlevy1981 commented 4 months ago

Put together a "key metrics" example that can be run quickly for a single CESM case to look at a small number of important diagnostics. The first pass at this adds a land-ice notebook, but more notebooks are coming.

All Submissions:

New Feature Submissions:

  1. [x] Does your submission pass tests?
  2. [x] Have you lint your code locally prior to submission?

Changes to Core Features:

TeaganKing commented 4 months ago

@mnlevy1981 would it be helpful for me to review and then bring in the land-ice notebook as is, or should we wait until additional notebooks are ready to bring in?

mnlevy1981 commented 4 months ago

I have a few more changes to make for this one (pulling several functions out into a utils.py module and some other clean-up), then we should get feedback from @Katetc and @gunterl . I'm also tempted to move from netCDF4 calls to xarray to be more in line with other CUPiD notebooks.

One other thing to note: this notebook currently reads output from coupler history files, but I think Gunter said that work is in progress to get those fields into the CISM history files -- depending on the timeline for that, we should either pull in the notebook after it's cleaned up or wait until we can update the location of the history files to read.

mnlevy1981 commented 4 months ago

This is ready for review, though I think utils.py probably needs a bit of work. Do we want to move plotting routines to a new module and keep utils.py for computational routines? Do we want to reorganize utils.py so the routines called from the notebook are at the top? As far as output goes, the map looks like

SMB vs obs

and the time series looks like

SMB time series

Note that I added the years included in the climotology to the map title and the legend; to test the "compare against a base case" (which is turned off by default), I compared years 62-101 to years 43-82 of the same run and realized it was useful to have different labels for those different plots.

mnlevy1981 commented 4 months ago

(sorry for the assigning / unassigning, I thought I was requesting reviews)

TeaganKing commented 4 months ago

Thanks for all your work on this @mnlevy1981 ! I'm aiming to look at it tomorrow!

TeaganKing commented 4 months ago

A few comments:

mnlevy1981 commented 4 months ago
  • Should these utils all necessarily be in the glc directory or is another more general location appropriate?

I think we should keep them in glc until we have an example from another component that needs the same functions (or a generalized form that can be used both for land ice and the second component with different arguments). I'm not sure where the common utilities would go, though... my first thought was cupid/ but that directory is for cupid-dev rather than cupid-analysis.

  • As we discussed today, converting netcdf4 --> xarray calls could also be useful.

I'll work on this, and will probably end up with some of the functions we talked about (like a cupid_open_dataset that wraps xr.open_dataset or xr.open_mfdatasets). At which point the first bullet is worth revisiting, because that certainly wouldn't belong in the glc/ directory :)

  • I can do some more thorough testing later but figured it may be worthwhile to make these larger comments first

Thanks! Verifying you can run the notebook on my branch would be great, but given the changes coming to address your first two points I don't think you need to dive too deep at this point

mnlevy1981 commented 4 months ago

As of 5b6e3d5 I think this is really ready for review. I've gotten rid of the netcdf4 imports - we use xarray to open the datasets and then da.sum() and da.mean() instead of their numpy equivalents.

mnlevy1981 commented 4 months ago

One other somewhat odd feature is that the output is located in "CUPiD/examples/key_metrics/computed_notebooks/key_metrics"-- is there a reason for this second 'key_metrics' directory?

I thought we had an issue ticket about this, but I can't find it. I'd really like to clean up the names of the directories cupid-run creates -- maybe CUPiD_out/ with computed_notebooks/, computed_scripts/ (once we finish #93), and html/ (created by cupid-build rather than cupid-run).