Ouranosinc / xclim

Library of derived climate variables, ie climate indicators, based on xarray.
https://xclim.readthedocs.io/en/stable/
Apache License 2.0
333 stars 59 forks source link

Using sdba.processing.to_additive_space for precipitation with units of mm/day #1478

Closed isaachcamp closed 1 year ago

isaachcamp commented 1 year ago

Setup Information

Description

I'm trying to use the sdba.processing.to_additive_space with precipitation data (units 'mm/day') and a lower bound of 0., but I'm getting this error:

DimensionalityError: Cannot convert from 'degree_Celsius' ([temperature]) to 'millimeter / day' ([length] / [time])

I believe it is being raised when attempting to convert the lower bound units. I have tried the same with temperature (units 'K') and it works fine. If I change the units of the precipitation data to 'K' it will run, but returns NaNs (I assume because of some unit conversion behind the scenes).

I'm using a DataArray with time series data and only one grid cell, the time coordinate is np.datetime64.

Steps To Reproduce

import xarray as xr from xclim import sdba from pathlib import Path

fpath = Path(f"/nesi/project/niwa00018/campbelli/statistical-downscaling/data/vcsn/VCSN_daily_pr5k_1972-2020.nc") dset = xr.open_dataset(fpath).squeeze()["pr"].sel(latitude=-41.322, longitude=174.804, method="nearest") dset.attrs.update({"units": "mm/day"}) sdba.processing.to_additive_space(dset, lower_bound=0)

Traceback

/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/xclim/core/formatting.py:397: FutureWarning: Future versions of xclim will require explicit unit specifications. outs = func(*args, kwargs) Traceback (most recent call last): File "", line 1, in File "", line 2, in to_additive_space File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/xclim/core/formatting.py", line 397, in _call_and_add_history outs = func(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/xclim/sdba/processing.py", line 678, in to_additive_space lower_bound = convert_units_to(lower_bound, data) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/xclim/core/units.py", line 424, in convert_units_to return units.Quantity(source, units=source_unit).to(target_unit).m ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/pint/facets/plain/quantity.py", line 517, in to magnitude = self._convert_magnitude_not_inplace(other, contexts, ctx_kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/pint/facets/plain/quantity.py", line 462, in _convert_magnitude_not_inplace return self._REGISTRY.convert(self._magnitude, self._units, other) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/pint/facets/plain/registry.py", line 961, in convert return self._convert(value, src, dst, inplace) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/pint/facets/context/registry.py", line 403, in _convert return super()._convert(value, src, dst, inplace) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/campbelli/.conda/envs/bcsd/lib/python3.11/site-packages/pint/facets/nonmultiplicative/registry.py", line 262, in _convert raise DimensionalityError(src, dst, src_dim, dst_dim) pint.errors.DimensionalityError: Cannot convert from 'degree_Celsius' ([temperature]) to 'millimeter / day' ([length] / [time])

Additional context

No response

Contribution

Code of Conduct

coxipi commented 1 year ago

Hi @isaachcamp,

Thanks for the details. Indeed, the problem stems from unit conversion failing. Specifically, to_additive_space expects lower_bound as a str which contains the units :

lower_bound : str The smallest physical value of the variable, excluded, as a Quantity The data should only have values strictly larger than this bound.

so, you would need to call your function like this:

sdba.processing.to_additive_space(dset, lower_bound="0 mm/day")

and things will work out as expected. With this kind of argument, you just need to have the same dimensionality as the argument you're comparing with (here dset), so "1 mm/day" or "0.001 m/day" would be both be acceptable choices for instance. By the way, if you have other questions surrounding this topic, there's a section about this in this tutorial notebook.

Changes to make in xclim

  1. Here, convert_units_to(lower_bound, data) fails. There are many checks on the type of lower_bound (or source). In your case, your call with lower_bound=0 should have been caught by :

    if isinstance(source, (float, int)):
    raise TypeError("Please specify units explicitly.")

    but it seems to have evaded it. Defining the function in a notebook, I do get the expected TypeError:

    xclim.core.units.convert_units_to(0, "mm/d") # I get the "Cannot convert from 'degree_Celsius' 
    convert_units_to_defined_in_a_jupyter(0, "mm/d")  # I get the expected TypeError message

    I will investigate. Also, maybe the TypeError should be more explicit: Saying explicitly that we want a Quantified or str would be good.

  2. Your observation about temperatures is also interesting. I can reproduce this directly with convert_units_to

    from xclim.core.units import convert_units_to
    convert_units_to(0, "mm/d") # this fails
    convert_units_to(0, "K") # this succeeds

    This will also probably be solved the first point above, when we fix the failing/succeeding conditions leading to the TypeError above, but I will be careful with this point. Perhaps this can have other implications elsewhere.

aulemahal commented 1 year ago

@isaachcamp Sorry for this issue, the confusion comes from a legacy code block that we had forgotten in the unit conversion function. Up to version 0.44, it would guess temperature units whenever the function received a number. At the very beginning of xclim this idea was kinda useful in some cases, but in the latter versions we began writing functions like to_additive_space where it simply makes no sense. But we forgot the code block there. It was removed very recently and a proper error should be raised with xclim 0.45. (That might be the source of your confusion @coxipi)

Indeed, the better way to pass any kind of quantity with a magnitude and units, is to pass a string. You should be able to also pass a pint.Quantity object, but that has been less tested.

coxipi commented 1 year ago

@aulemahal Ah got it. I simply "mamba installed" xclim in a new environment, thought it would 0.45, didn't realize it was 0.44.X version. The successful code I tested in a notebook was from 0.45.X.

I had checked CHANGES.rst and didn't see changes to the function, I should learn how to compare code versions in GitHub, it would be more efficient haha!

isaachcamp commented 1 year ago

Hi @coxipi and @aulemahal, thanks for being so quick to respond!

That works using lower_bound="0 mm/d", and just to check I updated to v0.45 and I get the TypeError. Making the error message more explicit could be good. You could perhaps add that it's the lower_bound argument that requires explicit units and provide an example, like "e.g., '0 mm/d'", but as it is in v0.45 I would have rechecked the docs and found the problem, so it's not essential. I wonder though if the units might be retrieved from the DataArray attributes, so they don't need to be specified for the lower bound?

Anyway, thanks for the help!