Quantiles from pooled ensemble members

dmitri1357 commented 3 months ago

Addressing a Problem?

I am using xclim-sdba to bias correct 50 ensemble members of the CESM2 climate model. I am using the MBCn technique from Cannon 2018 and following the tutorial here: https://xclim.readthedocs.io/en/stable/notebooks/sdba.html#Fourth-example-:-Multivariate-bias-adjustment-with-multiple-steps-(Cannon,-2018). I would like to pool all 50 ensemble members into a single distribution from which the quantiles will be computed. However, I don't see a way of doing this. It seems like the observational and model datasets need to have the same time dimension for the code to work. So, I can bias correct each ensemble member individually without problems, but this is not always the preferred approach. I have tried to place the ensemble members into a separate dimension ('realizations') but I am still unable to perform the adjusting procedure. Could this option be added to xclim?

Potential Solution

No response

Additional context

No response

Contribution

[ ] I would be willing/able to open a Pull Request to contribute this feature.

Code of Conduct

[X] I agree to follow this project's Code of Conduct

aulemahal commented 3 months ago

Hi @dmitri1357 !

There's a way to do it, but it's not clean and I don't know if it works with MBCn, which is a way more complicated than the other base adjustment (EQM, DQM, etc).

xclim's sdba.Grouper object accepts a add_dim argument that gives extra dimensions to consider in the grouping. This is where you want to pass your "realization" dimension name so that the quantiles are computed across both the time and the realizations. However, the issue is that this was not coded as cleanly and it will fail if the added dim does not appear on the reference. So the hacky solution is to replicate your realization dimension on your reference data. In theory, the duplication of the data won't change the distribution. This is exactly issue #1739.

Code example. Assuming you have a ref DataArray with only a "time" dimension, and scenh / scens that have an extra "realization" dimension.

ref2 = ref.expand_dims(realization=scenh.realization)
# ... 
group = Grouper("time", add_dims=['realization'])

with xc.set_options(sdba_extra_output=True):
    out = sdba.adjustment.NpdfTransform.adjust(
        ref,
        scenh_std,
        scens_std,
        base=sdba.QuantileDeltaMapping,
        base_kws={"nquantiles": 20, "group": group},
        n_iter=20,  # perform 20 iteration
        n_escore=1000,  # only send 1000 points to the escore metric (it is really slow)
    )

Again, I've never tested the add_dims thing with MBCn. I'll try to tomorrow.

dmitri1357 commented 3 months ago

Hi @aulemahal, thank you for taking a look. That's good to know about the Grouper object! Yes, please let me know if this could possibly work with MBCn. No rush since I have elected to bias-correct using one ensemble member, so I am no longer trying to pool the ensemble members for this particular project. But it is something that may be necessary for later work.

Ouranosinc / xclim