I noticed you're using concatenate2 from dask. I've found this to be quite wasteful: the recursion is cool, but it repeatedly allocates new memory.
Cubed should be using concatenate3 which allocates the final output array up top, and then assigns regions of that output, similar to np.block.
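To make the difference concrete, here is a rough NumPy-only sketch of the two strategies for a 2-D nested list of blocks. The function names `concat_recursive` and `concat_preallocated` are made up for illustration; this is not dask's or Cubed's actual code, just the general idea of "concatenate level by level" versus "allocate once, then assign regions".

```python
import numpy as np


def concat_recursive(blocks, axes):
    """Concatenate a nested list of arrays one nesting level at a time.

    Every level calls np.concatenate, which allocates a fresh array,
    so the data gets copied once per level of nesting.
    """
    if not isinstance(blocks, list):
        return blocks
    if len(axes) > 1:
        # Collapse the inner levels first (each call allocates an intermediate).
        blocks = [concat_recursive(b, axes[1:]) for b in blocks]
    return np.concatenate(blocks, axis=axes[0])


def concat_preallocated(blocks):
    """2-D only: allocate the output once, then copy each block into place.

    Assumes `blocks` is a list of rows, each row a list of 2-D arrays,
    with consistent heights within a row and widths within a column.
    """
    row_heights = [row[0].shape[0] for row in blocks]
    col_widths = [b.shape[1] for b in blocks[0]]
    dtype = np.result_type(*[b for row in blocks for b in row])
    out = np.empty((sum(row_heights), sum(col_widths)), dtype=dtype)

    r = 0
    for row, h in zip(blocks, row_heights):
        c = 0
        for b, w in zip(row, col_widths):
            out[r:r + h, c:c + w] = b  # single copy per block, no intermediates
            c += w
        r += h
    return out
```

Both should produce the same result as `np.block(blocks)` for a well-formed 2-D block layout; the second just avoids the per-row intermediate arrays.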
With dask at least, I've noticed that _concatenate2 accounts for about half the time it takes to execute a reduction haha.
PS: it feels like the long-term impact of dask is to give a small number of us a common language for these internal utilities. I see your lol_product!