AusClimateService / unseen

Code for UNprecedented Simulated Extremes using ENsembles (UNSEEN) analysis.
GNU General Public License v3.0

Add regridding option to bias correction for comparing data on different grids #72

Open stellema opened 1 week ago

stellema commented 1 week ago

The bias correction methods won't work if the forecast and observational datasets have different coordinates. Perhaps we can add an option to regrid da_obs or da_clim to the same grid as da_fcst (using xesmf.Regridder) before calculating the bias?

DamienIrving commented 1 week ago

Good idea.

Here's a snippet of code I use to do this in my quantile scaling work. For that we use bilinear regridding when going from coarse to fine resolution and conservative regridding when going from fine to coarse; we'll be doing the latter in this case.

import xesmf as xe

def regrid(ds, ds_grid, variable=None, method='conservative'):
    """Regrid data

    Parameters
    ----------
    ds : xarray Dataset
        Dataset to be regridded
    ds_grid : xarray Dataset
        Dataset containing target horizontal grid
    variable : str, optional
        Variable to restore attributes for
    method : str, default conservative
        Method for regridding

    Returns
    -------
    ds : xarray Dataset

    """

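    # Save attributes, since they may not survive regridding
    global_attrs = ds.attrs
    if variable:
        var_attrs = ds[variable].attrs
    # Build the regridder from the source and target grids, then apply it
    regridder = xe.Regridder(ds, ds_grid, method)
    ds = regridder(ds)
    # Restore the saved attributes
    ds.attrs = global_attrs
    if variable:
        ds[variable].attrs = var_attrs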

    return ds
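
The rule of thumb above (conservative when going from fine to coarse, bilinear when going from coarse to fine) could be written as a small helper. This is only a sketch: `choose_regrid_method` and its grid-spacing arguments are hypothetical and not part of the snippet above, and treating equal spacings as bilinear is an arbitrary choice.

```python
def choose_regrid_method(source_res, target_res):
    """Pick a regridding method from grid spacings (e.g. in degrees).

    Hypothetical helper: returns 'conservative' when the target grid is
    coarser than the source grid, otherwise 'bilinear'.
    """
    if source_res <= 0 or target_res <= 0:
        raise ValueError("grid spacings must be positive")
    # A larger spacing means a coarser grid
    return "conservative" if target_res > source_res else "bilinear"


# Fine observations (0.05 degree) regridded to a coarse model grid (1.0 degree)
print(choose_regrid_method(0.05, 1.0))
```

The returned string could then be passed as the `method` argument of `regrid`.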

stellema commented 1 week ago

Thanks! I can add a regrid function to general_utils and call it in bias_correction.get_bias. Do you think we should add arguments to the parser that let the user decide if either the obs or model data should be regridded and if the data should be regridded before or after calculating the climatology? For example:

    parser.add_argument(
        "--regrid",
        choices=(False, "obs", "fcst"),
        default=False,
        help="Regrid observational or forecast data [default=False]",
    )
    parser.add_argument(
        "--regrid_method",
        choices=("conservative", "bilinear", "nearest_s2d", "nearest_d2s"),
        default="conservative",
        help="Regridding method for observational/forecast data [default=conservative]",
    )
    parser.add_argument(
        "--regrid_timing",
        choices=("start", "end"),
        default="end",
        help="Timing of data regridding (before or after calculating climatology) [default=end]",
    )

DamienIrving commented 1 week ago

I'd possibly put obs as default for --regrid (i.e. leave the bias corrected data on the model grid by default). Instead of having False as an option, maybe the code should just figure out if there's a lat and/or lon coordinate and if those coordinates differ between the obs and fcst data. If that's the case then regridding is needed (according to --regrid), whereas if it's not the case no regridding is performed.

    parser.add_argument(
        "--regrid",
        choices=("obs", "fcst"),
        default="obs",
        help="Regrid observational or forecast data if they are on different grids [default=obs]",
    )
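
One possible way to do the "figure out if the grids differ" check is sketched below. The function name `needs_regrid`, the tolerance and the plain-mapping inputs are all assumptions made for illustration. With xarray objects the mappings could be built with something like `{name: da[name].values for name in ("lat", "lon") if name in da.coords}`.

```python
import math


def needs_regrid(obs_coords, fcst_coords, names=("lat", "lon"), tol=1e-6):
    """Return True if a horizontal coordinate present in both differs.

    obs_coords and fcst_coords map coordinate names to sequences of values.
    Hypothetical helper; the coordinate names and tolerance are assumptions.
    """
    for name in names:
        if name not in obs_coords or name not in fcst_coords:
            # Coordinate missing from one dataset, so there is nothing to compare
            continue
        obs_vals = list(obs_coords[name])
        fcst_vals = list(fcst_coords[name])
        # Different lengths mean different grids
        if len(obs_vals) != len(fcst_vals):
            return True
        # Same length: compare values pairwise within the tolerance
        if not all(
            math.isclose(a, b, abs_tol=tol) for a, b in zip(obs_vals, fcst_vals)
        ):
            return True
    return False


obs = {"lat": [-10.0, -10.5, -11.0], "lon": [140.0, 140.5]}
fcst = {"lat": [-10.0, -11.0], "lon": [140.0, 141.0]}
print(needs_regrid(obs, fcst))
```

If this returns False, the regridding step would simply be skipped regardless of what --regrid is set to.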

I think having obs and conservative as default is good because the obs will very likely be higher resolution than the model and the default that people tend to use is conservative when regridding to a lower resolution grid and bilinear when going to a higher resolution grid.

I'm not sure that we need the --regrid_timing option? I think we just make a decision (I like your default of end) and stick with that, unless we think at some point we'll want to do it both ways and compare?
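
Putting the suggestions in this thread together, the parser options might end up looking something like the sketch below (no --regrid_timing, obs and conservative as defaults). The `build_parser` wrapper and the description string are made up for the example; the option names, choices and defaults are the ones proposed above.

```python
import argparse


def build_parser():
    """Build a parser with the two regridding options discussed above."""
    parser = argparse.ArgumentParser(description="Sketch of the regridding options")
    parser.add_argument(
        "--regrid",
        choices=("obs", "fcst"),
        default="obs",
        help="Regrid observational or forecast data if they are on different grids [default=obs]",
    )
    parser.add_argument(
        "--regrid_method",
        choices=("conservative", "bilinear", "nearest_s2d", "nearest_d2s"),
        default="conservative",
        help="Regridding method for observational/forecast data [default=conservative]",
    )
    return parser


# No regridding flags given: fall back to the defaults
args = build_parser().parse_args([])
print(args.regrid, args.regrid_method)

# Explicitly regrid the forecast data with bilinear interpolation
args = build_parser().parse_args(["--regrid", "fcst", "--regrid_method", "bilinear"])
print(args.regrid, args.regrid_method)
```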