Ouranosinc / xclim

Library of derived climate variables, ie climate indicators, based on xarray.
https://xclim.readthedocs.io/en/stable/
Apache License 2.0
322 stars 56 forks source link

QuantileDeltaMapping adjustment fails with seasonal grouping #1704

Closed saschahofmann closed 4 months ago

saschahofmann commented 5 months ago

Setup Information

Description

There seems to be a bug when trying to adjust a QuantileDeltaMapping with seasonal grouping. The training seems to work just fine.

See below for modified version of the example on the xclim docs

Steps To Reproduce

Create the data as in the official example

import xarray as xr
import numpy as np

t = xr.cftime_range("2000-01-01", "2030-12-31", freq="D", calendar="noleap")
ref = xr.DataArray(
    (
        -20 * np.cos(2 * np.pi * t.dayofyear / 365)
        + 2 * np.random.random_sample((t.size,))
        + 273.15
        + 0.1 * (t - t[0]).days / 365
    ),  # "warming" of 1K per decade,
    dims=("time",),
    coords={"time": t},
    attrs={"units": "K"},
)
sim = xr.DataArray(
    (
        -18 * np.cos(2 * np.pi * t.dayofyear / 365)
        + 2 * np.random.random_sample((t.size,))
        + 273.15
        + 0.11 * (t - t[0]).days / 365
    ),  # "warming" of 1.1K per decade
    dims=("time",),
    coords={"time": t},
    attrs={"units": "K"},
)

ref = ref.sel(time=slice(None, "2015-01-01"))
hist = sim.sel(time=slice(None, "2015-01-01"))

Trigger failing adjustment

from xclim import sdba
qdm = sdba.QuantileDeltaMapping.train(
                ref,
                hist,
                group='time.season'
            )
qdm.adjust(hist).compute()

Fails with this in xclim/sdba/base.py:703:

ValueError: qdm_adjust failed on block with coords : Coordinates:
  * time       (time) object 2000-01-01 00:00:00 ... 2015-01-01 00:00:00
  * quantiles  (quantiles) float64 0.025 0.075 0.125 0.175 ... 0.875 0.925 0.975
  * season     (season) object 'DJF' 'MAM' 'JJA' 'SON'.

As mentioned above something like this passes just fine (monthly grouping)


qdm = sdba.QuantileDeltaMapping.train(
                ref,
                hist,
                group='time.month'
            )
qdm.adjust(hist).compute()

Additional context

No response

Contribution

Code of Conduct

saschahofmann commented 5 months ago

Digging deeper the problem seems to happen in sdba/utils.py add_cyclic_bounds L304. When its trying to compute the difference between coordinates da.coords[att].diff(att) it doesn't work because seasons arent numeric.

saschahofmann commented 5 months ago

I had a look at what happens for .month and it looks like it adds two months 0 and 13 that are equal to 12 and 1 respectively?

What should be the expected behaviour for seasons happy to implement that, I guess we could either not perform making it cyclic or invent some string? e.g. ABC or XYZ?

aulemahal commented 5 months ago

Hi @saschahofmann ! Sorry for the delay.

Indeed, we might have completely overlooked the seasonal grouping when implementing the other parts.

I'm not sure inventing strings will completely solve this. But I would still try that first, in add_cyclic_bounds.

But if it still doesn't work, may be we must abandon the idea of having strings as coordinates... If numbers are needed, I think the Grouper object should be the one ensuring a time.season grouping always uses a numerical coordinate, even if that means overriding the xarray behaviour...

saschahofmann commented 5 months ago

Would it make sense to replace the grouped coordinates in the beginning with numbers and finally return them to the array strings of seasons after the operations have been performed?

Making it specific for the seasonal grouping wouldn't be very elegant but right now I can not come up with another case where this would happen? I think there is no grouping that e.g. returns weekday names? Alternatively, it could be something that's applied to all string groups?

aulemahal commented 5 months ago

Yes it would make sense, but I'm not sure where that would take place ? In utils.interp_in_quantiles maybe?

I don't think it makes much sense to support weekdays... Even then, these have a standard (pythonic) translation into integers, which we could use. As such, season might be the only problematic case and it is acceptable to have a specific workaround I believe.