xarray-contrib / xskillscore

Metrics for verifying forecasts
https://xskillscore.readthedocs.io/en/stable/
Apache License 2.0

Using dask="allowed" is slightly faster #207

Open ahuang11 opened 4 years ago

ahuang11 commented 4 years ago

From: https://xarray.pydata.org/en/stable/dask.html

Tip: For the majority of NumPy functions that are already wrapped by Dask, it's usually a better idea to use the pre-existing dask.array function, either via pre-existing xarray methods or via apply_ufunc() with dask='allowed'. Dask can often have a more efficient implementation that makes use of the specialized structure of a problem, unlike the generic speedups offered by dask='parallelized'.

So, I simply set dask="allowed"

    return xr.apply_ufunc(
        _rmse,
        a,
        b,
        weights,
        input_core_dims=input_core_dims,
        kwargs={"axis": axis, "skipna": skipna},
        dask="allowed",
        output_dtypes=[float],
        keep_attrs=keep_attrs,
    )

And a decent speedup: almost 2x with bigger arrays! [timing screenshots]
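
For context, a minimal self-contained sketch of the pattern being discussed; the `_rmse` helper and signatures below are illustrative (weights handling omitted), not the exact xskillscore internals:

    import numpy as np
    import xarray as xr


    def _rmse(a, b, axis=None, skipna=False):
        # Pure NumPy arithmetic: dask arrays pass through lazily because
        # dask.array implements the same operations.
        mean = np.nanmean if skipna else np.mean
        return np.sqrt(mean((a - b) ** 2, axis=axis))


    def rmse(a, b, dim="time", skipna=False, keep_attrs=False):
        # With dask="allowed" the dask arrays reach _rmse directly; with
        # dask="parallelized" the function would instead be applied block
        # by block. (output_dtypes is only used by xarray for
        # dask="parallelized" or vectorize=True, so it is omitted here.)
        return xr.apply_ufunc(
            _rmse,
            a,
            b,
            input_core_dims=[[dim], [dim]],
            kwargs={"axis": -1, "skipna": skipna},
            dask="allowed",
            keep_attrs=keep_attrs,
        )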

aaronspring commented 4 years ago

Please add these tests to asv, for small 1D and larger 3D arrays, both chunked and not chunked; otherwise the timings are less robust. Would it be possible to set map_blocks or ufunc as a keyword, or even via config?
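
For reference, an asv benchmark along those lines could look roughly like this; the class and parameter names are illustrative, not the existing xskillscore benchmark suite:

    import numpy as np
    import xarray as xr
    import xskillscore as xs


    class TimeRMSE:
        # Time rmse on a small 1D case and a larger 3D case,
        # both unchunked and chunked.
        params = (["1d", "3d"], [False, True])
        param_names = ["shape", "chunked"]

        def setup(self, shape, chunked):
            if shape == "1d":
                dims, size = ["time"], (1000,)
            else:
                dims, size = ["time", "lat", "lon"], (100, 100, 100)
            self.a = xr.DataArray(np.random.rand(*size), dims=dims)
            self.b = xr.DataArray(np.random.rand(*size), dims=dims)
            if chunked:
                self.a = self.a.chunk()
                self.b = self.b.chunk()

        def time_rmse(self, shape, chunked):
            # compute() forces evaluation so chunked runs measure real work
            xs.rmse(self.a, self.b, dim="time").compute()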

ahuang11 commented 4 years ago

apply_ufunc is definitely faster for NumPy ufuncs, not so much for map_blocks.

aaronspring commented 4 years ago

Are you sure that the second run isn't faster than the first because some parts of the data are already in memory? You can reverse the ordering. asv runs multiple times to reduce that influence.

ahuang11 commented 4 years ago

Any ideas why asv is failing on me?

[ 70.83%] ··· ================================================ ========
                                     m
              ------------------------------------------------ --------
                     <function rmse at 0x7fe9d0353730>          failed
                   <function pearson_r at 0x7fe9d032c400>       failed
                      <function mae at 0x7fe9d0353840>          failed
                      <function mse at 0x7fe9d03537b8>          failed
               <function pearson_r_p_value at 0x7fe9d0353378>   failed
              ================================================ ========

ahuang11 commented 4 years ago

I'm pretty confident that it's faster with allowed though:

With timeit (which runs it multiple times) and the ordering swapped: [timing screenshots]

ahuang11 commented 4 years ago

import numpy as np
import xarray as xr
import xskillscore as xs

obs3 = xr.DataArray(
    np.random.rand(1000, 1000),
    dims=["lat", "lon"],
    name="var",
).chunk()
fct3 = xr.DataArray(
    np.random.rand(100, 1000, 1000),
    dims=["member", "lat", "lon"],
    name="var",
).chunk()

# threshold_brier_score_allowed: local dask="allowed" variant used for this
# comparison, not a public xskillscore function
%timeit xs.threshold_brier_score(obs3, fct3, threshold=0.5, dim=[]).load()

%timeit xs.threshold_brier_score_allowed(obs3, fct3, threshold=0.5, dim=[]).load()

Probabilistic too: [timing screenshots]

bradyrx commented 4 years ago

FYI, I think I found with esmtools that dask='allowed' and vectorize=True do not work together. apply_ufunc doesn't seem to know how to vectorize when dask is allowed.

aaronspring commented 4 years ago

We don't use vectorize anyways.

bradyrx commented 4 years ago

Yep, I was just mentioning that here. We don't have vectorize anywhere in xskillscore, so we should be fine on that front. But something to keep in mind.

ahuang11 commented 4 years ago

From https://xarray-contrib.github.io/xarray-tutorial/scipy-tutorial/06_xarray_and_dask.html#

""" There are two options for the dask kwarg.

dask="allowed" Dask arrays are passed to the user function. This is a good choice if your function can handle dask arrays and won’t call compute explicitly.

dask="parallelized". This applies the user function over blocks of the dask array using dask.array.blockwise. This is useful when your function cannot handle dask arrays natively (e.g. scipy API).

Since squared_error can handle dask arrays without computing them, we specify dask="allowed". """
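
To make that contrast concrete, a small sketch along the tutorial's lines (array shapes and names here are just for illustration):

    import numpy as np
    import xarray as xr

    a = xr.DataArray(np.random.rand(100, 1000), dims=["time", "x"]).chunk({"x": 250})
    b = xr.DataArray(np.random.rand(100, 1000), dims=["time", "x"]).chunk({"x": 250})


    def squared_error(x, y):
        # Only elementwise ufunc arithmetic, so dask arrays pass through lazily.
        return (x - y) ** 2


    # dask="allowed": the chunked arrays are handed to squared_error as-is and
    # dask builds a single lazy graph from the elementwise operations.
    se_allowed = xr.apply_ufunc(squared_error, a, b, dask="allowed")

    # dask="parallelized": the function is applied block by block; this is the
    # route to take when the function only understands NumPy arrays.
    se_parallelized = xr.apply_ufunc(
        squared_error, a, b, dask="parallelized", output_dtypes=[float]
    )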

ahuang11 commented 4 years ago

https://github.com/pydata/xarray/discussions/4608

bradyrx commented 3 years ago

Please ping me if you're waiting for me on a comment or any thoughts regarding this. Clearing up my git notifications and finishing up dissertation writing this week. Don't want any progress impeded here!

aaronspring commented 2 years ago

I would implement this with xskillscore.set_options(xr_apply_ufunc_dask="parallelized") as the default, and then users could switch to "allowed" themselves. Or we implement "default", where the metrics use "allowed" if it doesn't fail, else "parallelized". Thoughts? @ahuang11 @raybellwaves @bradyrx @dougiesquire

defaults from https://github.com/xarray-contrib/xskillscore/issues/315#issue-881748009
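
A rough sketch of what such an option could look like, modeled on xarray.set_options; the option name follows the proposal above and none of this exists in xskillscore yet:

    # Hypothetical sketch only: OPTIONS and set_options are modeled on
    # xarray.set_options; "xr_apply_ufunc_dask" is the proposed option name,
    # not an existing xskillscore API.
    OPTIONS = {"xr_apply_ufunc_dask": "parallelized"}
    _VALID_DASK = {"allowed", "parallelized"}


    class set_options:
        """Set xskillscore options globally or within a `with` block."""

        def __init__(self, **kwargs):
            self.old = {}
            for key, value in kwargs.items():
                if key not in OPTIONS:
                    raise ValueError(f"unknown option {key!r}")
                if key == "xr_apply_ufunc_dask" and value not in _VALID_DASK:
                    raise ValueError(f"invalid value {value!r} for {key!r}")
                self.old[key] = OPTIONS[key]
                OPTIONS[key] = value

        def __enter__(self):
            return self

        def __exit__(self, *exc):
            OPTIONS.update(self.old)


    # Metrics would then read OPTIONS["xr_apply_ufunc_dask"] when calling
    # xr.apply_ufunc, so users could opt in:
    # with set_options(xr_apply_ufunc_dask="allowed"):
    #     xs.rmse(a, b, dim="time")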