pangeo-data / climpred

:earth_americas: Verification of weather and climate forecasts :earth_africa:
https://climpred.readthedocs.io
MIT License
227 stars 49 forks source link

Spatial and temporal smoothing #76

Closed aaronspring closed 4 years ago

aaronspring commented 5 years ago

To be added to predictionensemble class. Spatial smoothing, eg. 5x5degree grid Temporal smoothing, eg. 4yr means applied to reference and forecast as in Goddard 2013

bradyrx commented 5 years ago

I like this. One hope for the predictionEnsemble class eventually is that you can treat it like a comprehensive xarray object. So that if you did predictionEnsemble.rename({'tos': 'SST'}) we would loop through every single object attached to our class and rename tos for all of them. Or in the case of this issue, predictionEnsemble.smooth({'time': 4}) or some syntax, would loop through and smooth all associated objects the same way.

aaronspring commented 5 years ago

quick example for temporal

def temporal_smoothing(ds, smooth=4):
    ds_smoothed = ds.rolling(time=smooth, center=False).mean().dropna('time')
    new_time = [str(t-smooth+1)+'-'+str(t) for t in ds_smoothed.time.values]
    ds_smoothed['time']=new_time
    return ds_smoothed

spatial will be more difficult. I try to aviod regridding but I guess we will end up using a regridding method.

aaronspring commented 5 years ago

the temporal function create 4-yr averages as in Goddard and Li. now spatial. have to install xesmf

conda install -c conda-forge esmpy
conda install -c conda-forge krb5
pip install xesmf

def spatial_smoothing(ds, boxsize=(5,5)):
    if isinstance(ds, xr.Dataset):
        ds = ds.to_array().squeeze()
    ds_out = xe.util.grid_global(*boxsize)
    regridder = xe.Regridder(ds, ds_out, 'bilinear', reuse_weights=True)
    return regridder(ds)

it works even on multi-dim data (more than 4). `cdo` couldnt. 
aaronspring commented 5 years ago

do we start a utils.py file with the content of loadutils and this:

def temporal_smoothing(ds, smooth=4):
    """Apply temporal smoothing by creating rolling smooth-timestep means.

    Reference:
      * Goddard, L., A. Kumar, A. Solomon, D. Smith, G. Boer, P.
            Gonzalez, V. Kharin, et al. “A Verification Framework for
            Interannual-to-Decadal Predictions Experiments.” Climate
            Dynamics 40, no. 1–2 (January 1, 2013): 245–72.
            https://doi.org/10/f4jjvf.

    Args:
        ds (xr.object): input.
        smooth (int): length of smoothing of timesteps.
                      Defaults to 4 (see Goddard et al. 2013).

    Returns:
        ds_smoothed (xr.object): input with `smooth` timesteps less and
                                 labeling '1-(smooth-1)','...', ... .

    """
    ds_smoothed = ds.rolling(time=smooth, center=False).mean().dropna('time')
    new_time = [
        str(t - smooth + 1) + '-' + str(t) for t in ds_smoothed.time.values
    ]
    ds_smoothed['time'] = new_time
    return ds_smoothed

def spatial_smoothing(ds, boxsize=(5, 5)):
    """Apply spatial smoothing by regridding to `boxsize` grid.

    Reference:
      * Goddard, L., A. Kumar, A. Solomon, D. Smith, G. Boer, P.
            Gonzalez, V. Kharin, et al. “A Verification Framework for
            Interannual-to-Decadal Predictions Experiments.” Climate
            Dynamics 40, no. 1–2 (January 1, 2013): 245–72.
            https://doi.org/10/f4jjvf.

    Args:
        ds (xr.object): input. xr.DataArray prefered.
        boxsize (tuple): target grid.
                         Defaults to `(5, 5)` (see Goddard et al. 2013).

    Returns:
        ds_smoothed (xr.object): boxsize-regridded input

    """
    if isinstance(ds, xr.Dataset):
        ds = ds.to_array().squeeze()
    ds_out = xe.util.grid_global(*boxsize)
    regridder = xe.Regridder(ds, ds_out, 'bilinear', reuse_weights=True)
    return regridder(ds)
aaronspring commented 5 years ago

temporal smoothing would probably fail now.

Those functions could go into loadutils, couldn’t they?

ahuang11 commented 5 years ago

Spatial smoothing can also be done with the coarsen method if xesmf is not installed (because it's a pretty big package to install without conda)

bradyrx commented 5 years ago

I'm putting this under the v2 tag since we decided for v1 users could pre-process on their own. When we introduce the accessor format for our objects, we can use .coarsen().

loadutils is just for loading sample data. It should really be changed to tutorial. I don't like the current name.

aaronspring commented 5 years ago

with coarsen is will be easier to implement. the spatial smoothing with xesmf would allow regridding to a 5x5 grid, what is often done. see Goddard et al. 2013 and MurCCS.

bradyrx commented 5 years ago

That sounds good. I think it would be nice to eventually have these methods. For now we can just expect the user to handle regridding/resampling.

aaronspring commented 5 years ago

I think I will soon implement these functions. However, I am unsure how to add this into classes.

my initial idea would be to add to class PredictionEnsemble a function called smooth, so it applied directly to both subclasses and applies the function to both objects. the args of smooth would then determine whether spatial or temporal smoothing is done. and smooth('Goddard2013') could be a wrapper.

does this sound good as a proposed design?

bradyrx commented 5 years ago

my initial idea would be to add to class PredictionEnsemble a function called smooth, so it applied directly to both subclasses and applies the function to both objects.

Yes, since it should be used in the same way by both hindcast and perfect model ensembles, it should be added to PredictionEnsemble to then be inherited by the other two.

bradyrx commented 5 years ago

You'll have to do some looping through the dictionary to apply it to all object (the initialized dataset + all products associated with it).

There might be a way to add a generalized function called like apply_func() that takes a function as an argument that loops through the full dataset and applies it everywhere. Then maybe we could also pass native xarray functions into that. Or maybe this is all made easier with accessors. I'm not sure, but it's an important infrastructure thing to work out (cc @ahuang11).