xarray-contrib / xskillscore

Metrics for verifying forecasts
https://xskillscore.readthedocs.io/en/stable/
Apache License 2.0
225 stars 40 forks source link

Generic checking of function inputs #147

Open dougiesquire opened 4 years ago

dougiesquire commented 4 years ago

Some xskillscore methods require that inputs have specific properties. For example, xr_brier_score and discrimination (and other methods yet to be implemented) require that

Questions:

  1. do we want to check input properties, or is it sufficient just to be clear in the documentation?
  2. if yes, where should the functions that perform these checks live?
  3. I think the ability to operate lazily is more important than checking inputs, so checks that require data in memory (e.g. 0<=forecasts<=1) should not be run on dask-backended objects. Does this sound reasonable?
aaronspring commented 4 years ago

Can observations also be 0 or 1 instead of boolean? I think this also works.

aaronspring commented 4 years ago
  1. If 0 or 1 also works type checking doesn't bring us much. Does it? Clear documentation will do
  2. Utils or new file checks
  3. Yes
dougiesquire commented 4 years ago

Good point, 0 or 1 also works, so checking would only be run with np-backended arrays. Maybe not worth it then.

aaronspring commented 4 years ago

i tried this here: https://github.com/csiro-dcfp/doppyo/blob/98acad9fe3cf658d5b000401ba898ef343abb768/doppyo/skill.py#L256 but didnt account for if not dask.is_dask_collection()

bradyrx commented 4 years ago

I think this is a good thing to do, but I'm wondering how to do this cleanly. What comes to mind would be us creating a custom decorator that can ingest an arbitrary number of tuples like (a, ['str']) which send through the user-inputted variable and the list of appropriate variable types.

Then the decorator could loop through these and check these. Then we don't have to make a separate function for each type or have isinstance everywhere.

bradyrx commented 4 years ago

Potentially helpful:

https://stackoverflow.com/questions/15299878/how-to-use-python-decorators-to-check-function-arguments

https://www.pythonforthelab.com/blog/how-to-use-decorators-to-validate-input/

aaronspring commented 4 years ago

Good point, 0 or 1 also works, so checking would only be run with np-backended arrays. Maybe not worth it then.

@bradyrx How would the checks look like?

Maybe you write them first with all the isinstance calls. Then later we can refactor

bradyrx commented 4 years ago

I think you'd just loop through the tuples and do some tuple unpacking such that a tuple would be (x, ['float', 'int']) and the check would be:

for t in tuple_list:
    variable, types = t
    if not isinstance(variable, types):
        raise ValueError(f"{variable} not one of {types}.")

Or some variant on that.

The decorator would then be

@check_types((a, 'str'), (b, ['float', 'int']))
def func(a, b):
aaronspring commented 4 years ago

properscoring has checks https://github.com/TheClimateCorporation/properscoring/blob/1ca13dcbc1abf53d07474b74fbe3567fd4045668/properscoring/_brier.py#L41

and in xs we do a dask test https://github.com/raybellwaves/xskillscore/blob/de4b53843fbab3d841388d4fce866b81a6748476/xskillscore/tests/test_probabilistic.py#L250

So these checks trigger computation when via xr applyufunc at least @dougiesquire

raybellwaves commented 4 years ago

FWIW dask-ml has typing and uses ArrayLike = TypeVar("ArrayLike", Array, np.ndarray). Array is from dask.array import Array

https://github.com/dask/dask-ml/blob/master/dask_ml/_typing.py

But not sure if it's for type hinting or checks.

dougiesquire commented 4 years ago

So these checks trigger computation when via xr applyufunc at least @dougiesquire

That's not good - isn't this something we'd want to avoid?

aaronspring commented 4 years ago

Sorry. I meant to say these tests don't trigger computation in these tests.

aaronspring commented 2 years ago

But not sure if it's for type hinting or checks.

~~Exactly what typing ensures #317 ~~

EDIT: wrong at runtime no check