ahuang11 opened 4 years ago
Please add these tests to asv, and for small 1D and larger 3D arrays, both chunked and not chunked; otherwise the timings are less robust. Would it be possible to set `map_blocks` or `ufunc` as a keyword, or even via config?
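Such an asv suite could look roughly like this (a hypothetical sketch; the class name and parameter choices are my assumptions, not the actual xskillscore benchmarks), parametrizing over metric, array shape, and chunking:

```python
# Hypothetical asv benchmark sketch; names are assumptions, not the
# actual xskillscore benchmark suite.
import numpy as np
import xarray as xr
import xskillscore as xs

class Metrics:
    # asv runs every combination of these parameters multiple times.
    params = (
        [xs.rmse, xs.mae, xs.mse, xs.pearson_r],
        [(1000,), (10, 100, 100)],  # small 1D and larger 3D arrays
        [True, False],              # chunked vs. not chunked
    )
    param_names = ["metric", "shape", "chunked"]

    def setup(self, metric, shape, chunked):
        dims = ["x", "y", "z"][: len(shape)]
        self.a = xr.DataArray(np.random.rand(*shape), dims=dims)
        self.b = xr.DataArray(np.random.rand(*shape), dims=dims)
        if chunked:
            self.a, self.b = self.a.chunk(), self.b.chunk()

    def time_metric(self, metric, shape, chunked):
        # .compute() forces evaluation of the (possibly lazy) result.
        metric(self.a, self.b, dim="x").compute()
```

Note that passing function objects directly as params would produce the `<function rmse at 0x...>` reprs seen in the asv output further down.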
`apply_ufunc` is definitely faster for NumPy ufuncs; not so much for `map_blocks`.
Are you sure that the second run isn't faster than the first because some parts of the data are already in memory? You can reverse the ordering. asv runs multiple times to reduce that influence.
Any ideas why asv is failing on me?
```
[ 70.83%] ··· ================================================ ========
                                     m
              ------------------------------------------------ --------
               <function rmse at 0x7fe9d0353730>                failed
               <function pearson_r at 0x7fe9d032c400>           failed
               <function mae at 0x7fe9d0353840>                 failed
               <function mse at 0x7fe9d03537b8>                 failed
               <function pearson_r_p_value at 0x7fe9d0353378>   failed
              ================================================ ========
```
I'm pretty confident that it's faster with `allowed`, though. With `timeit` (which runs it multiple times) and the ordering swapped:
```python
import numpy as np
import xarray as xr
import xskillscore as xs

# 2D observations and 3D (member) forecasts, both as single-chunk dask arrays
obs3 = xr.DataArray(
    np.random.rand(1000, 1000),
    dims=["lat", "lon"],
    name="var",
).chunk()
fct3 = xr.DataArray(
    np.random.rand(100, 1000, 1000),
    dims=["member", "lat", "lon"],
    name="var",
).chunk()

%timeit xs.threshold_brier_score(obs3, fct3, threshold=0.5, dim=[]).load()
%timeit xs.threshold_brier_score_allowed(obs3, fct3, threshold=0.5, dim=[]).load()
```
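For context, a minimal sketch (my own simplification, not xskillscore internals) of what the two variants differ in: the same NumPy-style kernel dispatched through `xr.apply_ufunc` with the two dask modes:

```python
import numpy as np
import xarray as xr

def _mse_kernel(a, b):
    # Plain array arithmetic: works on NumPy and dask arrays alike.
    # apply_ufunc moves the core dim to the last axis.
    return ((a - b) ** 2).mean(axis=-1)

def mse_parallelized(a, b, dim):
    # The kernel is applied blockwise to NumPy chunks.
    return xr.apply_ufunc(
        _mse_kernel, a, b,
        input_core_dims=[[dim], [dim]],
        dask="parallelized",
        output_dtypes=[float],
    )

def mse_allowed(a, b, dim):
    # The dask arrays are handed straight to the kernel, so dask's own
    # reduction graph does the work.
    return xr.apply_ufunc(
        _mse_kernel, a, b,
        input_core_dims=[[dim], [dim]],
        dask="allowed",
    )
```

With `"allowed"`, dask can build its specialized graph for the subtraction, square, and mean, rather than applying an opaque function block by block.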
Probabilistic metrics too:
FYI, I think I found with esmtools that `dask='allowed'` and `vectorize=True` do not work together. It doesn't seem to know how to vectorize when dask is allowed.
We don't use vectorize anyways.
Yep, was just mentioning that here. We don't have `vectorize` anywhere in xskillscore, so we should be fine on that front. But something to keep in mind.
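A minimal illustration of the incompatibility (an assumed example, not xskillscore code): `vectorize=True` wraps the kernel in `numpy.vectorize`, which only understands NumPy arrays, so it needs `dask="parallelized"` to materialize each block first.

```python
import numpy as np
import xarray as xr
from scipy import stats

da = xr.DataArray(np.random.rand(10, 100), dims=["x", "time"]).chunk({"x": 5})

def kernel(arr):
    # scipy only accepts NumPy arrays, hence vectorize + parallelized.
    return stats.skew(arr)

result = xr.apply_ufunc(
    kernel,
    da,
    input_core_dims=[["time"]],
    vectorize=True,
    dask="parallelized",  # each block is materialized as NumPy first
    output_dtypes=[float],
)

# Swapping in dask="allowed" here would hand a dask array straight to
# numpy.vectorize, which doesn't know what to do with it.
```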
From https://xarray-contrib.github.io/xarray-tutorial/scipy-tutorial/06_xarray_and_dask.html#
""" There are two options for the dask kwarg.
dask="allowed" Dask arrays are passed to the user function. This is a good choice if your function can handle dask arrays and won’t call compute explicitly.
dask="parallelized". This applies the user function over blocks of the dask array using dask.array.blockwise. This is useful when your function cannot handle dask arrays natively (e.g. scipy API).
Since squared_error can handle dask arrays without computing them, we specify dask="allowed". """
Please ping me if you're waiting on a comment or any thoughts from me regarding this. Clearing up my GitHub notifications and finishing up dissertation writing this week. Don't want any progress impeded here!
I would implement this with `xskillscore.set_options(xr_apply_ufunc_dask="parallelized")` as the default, and then users could switch to `"allowed"` themselves. Or we implement `"default"`, where the metrics take `"allowed"` if it doesn't fail, else `"parallelized"`. Thoughts? @ahuang11 @raybellwaves @bradyrx @dougiesquire
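A rough sketch of how that option could look (hypothetical names, not the actual xskillscore API), including the try/fallback behaviour for `"default"`:

```python
# Hypothetical sketch of the proposal, not the actual xskillscore API.
OPTIONS = {"xr_apply_ufunc_dask": "parallelized"}

def set_options(**kwargs):
    """Module-level option setter, e.g. set_options(xr_apply_ufunc_dask="allowed")."""
    OPTIONS.update(kwargs)

def _apply_metric(metric_func, *args, **kwargs):
    """Dispatch a metric with the configured dask mode.

    With "default", try dask="allowed" first and fall back to
    dask="parallelized" if the kernel can't handle dask arrays.
    """
    mode = OPTIONS["xr_apply_ufunc_dask"]
    if mode != "default":
        return metric_func(*args, dask=mode, **kwargs)
    try:
        return metric_func(*args, dask="allowed", **kwargs)
    except (NotImplementedError, TypeError):
        return metric_func(*args, dask="parallelized", **kwargs)
```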
defaults from https://github.com/xarray-contrib/xskillscore/issues/315#issue-881748009
From: https://xarray.pydata.org/en/stable/dask.html

"""Tip: For the majority of NumPy functions that are already wrapped by Dask, it's usually a better idea to use the pre-existing `dask.array` function, either via pre-existing xarray methods or `apply_ufunc()` with `dask='allowed'`. Dask can often have a more efficient implementation that makes use of the specialized structure of a problem, unlike the generic speedups offered by `dask='parallelized'`."""
So I simply set `dask="allowed"`, and got a decent speedup: almost 2x with bigger arrays!