Open florianboergel opened 1 year ago
For me it improved the computation.
I'm not sure what you mean: "calc_clim" is not delayed in my version. It's called in a loop, but the delayed steps are calc_thresh, calc_seas and run_avg, which are all called independently of each other. What dask doesn't recommend is calling a delayed function from another delayed function. That would only be the case here if calc_clim were itself delayed when defined. I'm not disputing that your order of operations might have sped up the computation, just that it might be for other reasons. For example, in your version window_roll is called inside a delayed function, so delaying window_roll in the original structure might work as well.
Frankly, I don't remember why I didn't delay window_roll at the time.
I will try that when I have time. It's hard for me to test performance reliably, as the system we use is unreliable in that sense.
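To illustrate the structure described above, here is a minimal sketch, not the actual code from this repository. The functions `thresh_step` and `seas_step` are hypothetical stand-ins for the delayed steps, and `clim_for_all_cells` is a hypothetical plain (non-delayed) driver that builds the tasks in a loop. Only `dask.delayed` and `dask.compute` are real dask calls.

```python
import dask
from dask import delayed


@delayed
def thresh_step(values, pctile):
    # Hypothetical stand-in for a threshold calculation:
    # pick a simple percentile-like value from the sorted data.
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(len(ordered) * pctile / 100))
    return ordered[idx]


@delayed
def seas_step(values):
    # Hypothetical stand-in for a seasonal-mean calculation.
    return sum(values) / len(values)


def clim_for_all_cells(cells, pctile=90):
    # This driver is NOT decorated with @delayed. It only builds
    # independent delayed tasks in a loop, so no delayed function
    # is ever called from inside another delayed function.
    tasks = []
    for cell in cells:
        tasks.append((thresh_step(cell, pctile), seas_step(cell)))
    # A single compute call evaluates all the independent tasks together.
    return dask.compute(*tasks)


if __name__ == "__main__":
    cells = [[1, 2, 3, 4], [10, 20, 30, 40]]
    print(clim_for_all_cells(cells))
```

If the driver itself were wrapped in `@delayed`, the calls to `thresh_step` and `seas_step` would happen inside a delayed function, which is the pattern the dask documentation advises against.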
Again, it must be verified, but the dask documentation says one should avoid calling multiple delayed functions. That is why the `@dask_delayed` is removed in `calc_clim()`. Instead, I now only apply it to `calc_thresh` and `calc_seas()`, which I now call independently.

The different structure of `results` needs to be accounted for below.

To make sure I only call one delayed function, I also removed the `dask_delayed` tag for `runavg`, so that `calculate_seas` looks like.

If you think this makes sense, I can also make a pull request to verify the changes.