xarray-contrib / xskillscore

Metrics for verifying forecasts
https://xskillscore.readthedocs.io/en/stable/
Apache License 2.0

Example for the docs #217

Open bradyrx opened 3 years ago

bradyrx commented 3 years ago

I was helping a labmate recently and thought something like this would make a good example for the docs. I was showing her how to use xskillscore to sift through a large set of ensemble members and automatically pull out a few that do a very good or very poor job of replicating ENSO relative to observations. This beats the standard practice of doing it manually, of course.

We could adapt the following demo to use real data: either perturb it, or pull down just the tropical Pacific from CESM-LE vs. observations.

The idea is to show how you could compute pattern correlation and RMSE, then use xarray to select the single best/worst performer, or to sort and select a range so you can plot a few and eyeball them from there.

import xskillscore as xs
import xarray as xr
import numpy as np

np.random.seed(333)
reference_data = np.random.rand(5, 5)
ds1 = xr.DataArray(reference_data, dims=['x', 'y'])

ensemble_data = np.random.rand(5, 5, 10)
# way off the mark.
ensemble_data[:, :, 9] = ensemble_data[:, :, 9] + 5
# perfect fit
ensemble_data[:, :, 0] = reference_data

ds2 = xr.DataArray(ensemble_data, dims=['x', 'y', 'member'])
ds2['member'] = np.arange(ds2.member.size) + 1

# assess performance
patterncorr = xs.pearson_r(ds1, ds2, dim=['x', 'y'], skipna=True)
rmse = xs.rmse(ds1, ds2, dim=['x', 'y'], skipna=True)

# SELECTING MEMBERS
# 1. find the single best/worst member.
# argmin/argmax return 0-based integer positions, so use .isel()
ds2.isel(member=patterncorr.argmin())  # worst: lowest pattern correlation
ds2.isel(member=rmse.argmax())         # worst: highest RMSE
ds2.isel(member=patterncorr.argmax())  # best: highest pattern correlation
ds2.isel(member=rmse.argmin())         # best: lowest RMSE

# 2. select a few members from the top and bottom.
# sorting by the metric values also reorders the member
# coordinate labels, so we can select by label afterwards.
best_to_worst = rmse.sortby(rmse)
best_members = best_to_worst.isel(member=slice(0, 3)).member
worst_members = best_to_worst.isel(member=slice(7, None)).member
# note we use `.sel()` here since we are referencing the actual
# member labels
ds2.sel(member=best_members)
ds2.sel(member=worst_members)

raybellwaves commented 3 years ago

Cool.

Any thoughts on how to structure the notebooks? I'm thinking

GETTING STARTED

- Quick Start: quick-start.ipynb
- Geophysical data: your example above (could replace x and y with lon and lat)
- Tabular data: pandas-dataframes.ipynb (https://github.com/raybellwaves/xskillscore-tutorial/issues/3)

We can then add to the Geophysical data and Tabular data sections over time.

bradyrx commented 3 years ago

With regard to the quick start, I think we should totally reformat it. As it stands, it serves as a more detailed API reference for us. There's nothing wrong with the present framework, but it's sort of an API dump. As we add more notebooks, the quick start should be true to its name and just show how to load some xarray data and apply a few functions to it.

See our quick start here: https://climpred.readthedocs.io/en/stable/quick-start.html, or any other quick start from a Python package. Maybe just a couple of deterministic and probabilistic metrics, etc., with quick viz from xarray.

Geophysical data sounds good with the idea I outlined above. Same with your tabular data.

Then maybe something like Forecast Data. We could move material from the current quick start on @dougiesquire's module there. Basically, we could go through the current quick start and find spots that deserve their own notebooks.

Then perhaps a page on statistical testing.

aaronspring commented 3 years ago

I don't see the current quick start as an API dump. Sure, we use synthetic data, and every user wants a real-world example (from their field), but we show how the API of our functions works, not more, not less.

On statistical testing: we don't have many functions here, and the ones we have behave like the metrics. If we provide more examples, it looks like this package is about statistical testing. IMO it's not. It's about forecasting metrics; the tests use the metrics, but are not about the general framework of hypothesis testing.