Open jhamman opened 3 years ago
I thought @djhoese was working hard at dask aware reprojection in pyresample?
"Working hard" has mostly been in my head, as I haven't had time for any of the "real" work. Luckily, I'm not the only one worried about this. @mraspaud did the work on that `resample_blocks` function, and it looks like it might be a game changer for some of our algorithms. The basic idea is:
I initially wasn't a fan of this strategy as it requires slicing and rechunking of the input data, but @mraspaud's experience shows that it performs much better than resampling all overlapping input chunks and then merging/reducing them later.
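To make the "slice the input per output chunk" strategy concrete, here is a minimal NumPy sketch. It is not pyresample's `resample_blocks`; the helper names (`source_window`, `resample_block`, `resample`) and the toy mapping (output index `i` reads source index `round(i * scale + offset)`, nearest neighbour) are assumptions made up for illustration. The point is only that each output block reads the smallest source window it needs, rather than every overlapping input chunk being resampled and merged afterwards.

```python
import numpy as np


def source_window(out_slice, scale, offset, src_len):
    """Hypothetical helper: source indices needed by one output slice (toy affine mapping)."""
    idx = np.rint(np.arange(out_slice.start, out_slice.stop) * scale + offset).astype(int)
    valid = (idx >= 0) & (idx < src_len)
    if not valid.any():
        return None, idx, valid  # this output slice falls entirely outside the source
    return slice(idx[valid].min(), idx[valid].max() + 1), idx, valid


def resample_block(src, out_rows, out_cols, scale, offset, fill=np.nan):
    """Fill one output block from the smallest source window that covers it."""
    out = np.full((out_rows.stop - out_rows.start, out_cols.stop - out_cols.start), fill)
    rwin, ridx, rok = source_window(out_rows, scale, offset, src.shape[0])
    cwin, cidx, cok = source_window(out_cols, scale, offset, src.shape[1])
    if rwin is None or cwin is None:
        return out  # no overlap: return fill values without touching the source
    piece = src[rwin, cwin]  # with a chunked array, only this slice would be loaded
    out[np.ix_(rok, cok)] = piece[np.ix_(ridx[rok] - rwin.start, cidx[cok] - cwin.start)]
    return out


def resample(src, out_shape, chunk, scale, offset):
    """Build the output block by block and stitch the blocks together."""
    rows = [slice(r, min(r + chunk, out_shape[0])) for r in range(0, out_shape[0], chunk)]
    cols = [slice(c, min(c + chunk, out_shape[1])) for c in range(0, out_shape[1], chunk)]
    return np.block([[resample_block(src, r, c, scale, offset) for c in cols] for r in rows])


src = np.arange(64).reshape(8, 8)
result = resample(src, (6, 6), chunk=2, scale=2.0, offset=0.0)
```

In this toy setup the top-left 4x4 corner of `result` is a 2x decimation of `src`, and the output blocks that map outside the source are never computed beyond filling them with NaN.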
I think @gjoseph92 had something clever in stackstac for doing something like this and avoiding shuffling a bunch of NaNs (or other useless data) around. I can't find it now, though, and I might have totally misinterpreted it.
I have another very hacky implementation in pyresample for the "EWA" resampling algorithm (very specific to VIIRS and MODIS instruments) where I do a dask reduction but pass tuples of values between functions. If the data is destined for the output chunk, the tuple contains arrays; if not, it contains Nones. It isn't how dask intends the function to be used (array functions should return arrays), but it works for me to prevent unnecessary processing of chunks that I know would be all NaNs/fills.
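A rough sketch of that tuple-or-None pattern, in plain NumPy rather than dask and not taken from the pyresample EWA code: the function names, the 1-D identity "resampling", and the choice to return a bare `None` instead of a tuple of Nones are all simplifying assumptions for illustration. The map step returns a tuple of arrays when an input chunk contributes to the output chunk and `None` otherwise, and the combine step simply skips the Nones.

```python
import numpy as np


def partial_for_output(in_chunk, in_start, out_start, out_len):
    """Map step (hypothetical): one input chunk's contribution to one output chunk.

    Returns a (sums, counts) tuple of arrays, or None when nothing overlaps.
    """
    lo = max(in_start, out_start)
    hi = min(in_start + len(in_chunk), out_start + out_len)
    if lo >= hi:
        return None  # known to be all fill: skip the work entirely
    sums = np.zeros(out_len)
    counts = np.zeros(out_len)
    sums[lo - out_start:hi - out_start] = in_chunk[lo - in_start:hi - in_start]
    counts[lo - out_start:hi - out_start] = 1
    return sums, counts


def combine(partials, out_len, fill=np.nan):
    """Reduce step: ignore None placeholders and merge the remaining tuples."""
    useful = [p for p in partials if p is not None]
    if not useful:
        return np.full(out_len, fill)
    sums = sum(p[0] for p in useful)
    counts = sum(p[1] for p in useful)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, fill)


data = np.arange(10.0)
chunks = [(0, data[0:4]), (4, data[4:8]), (8, data[8:10])]  # (start index, values)
partials = [partial_for_output(vals, start, out_start=3, out_len=4) for start, vals in chunks]
out = combine(partials, out_len=4)
```

Here the third input chunk does not overlap output positions 3 to 6, so its partial is `None` and it costs nothing in the combine step.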
@norlandrhagen and I have been experimenting with approaches for speeding up pyramid generation when using rioxarray's `reproject` functionality. We have this rough prototype to share: