Open markpayneatwork opened 6 months ago
This commit https://github.com/Klimaatlas/KAPy/commit/d6380d6c95b1fa17ddce5461f2c3f427d8590464 restructures the import functionality to avoid xarray automatically using dask. At this point, KAPy is therefore an in-memory processing tool. We may need to fix this in the future.
dask and lazy loading are also very closely coupled. For some datasets, it may be advantageous not to write out the intermediate file, but instead just return the xarray object. How this works best with cdo sellonlatbox subsetting is unclear, but the two things need to be considered together.
Saving pickles is the key first step and is now working as of a301785. This achieves 90% of the desired functionality. The remaining 10% will take 90% of the work, and can be handled at a later time :-)
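For reference, a minimal sketch of what pickling an intermediate object to disk and reading it back can look like. This is not the code from a301785; `save_pickle` and `load_pickle` are hypothetical names, and the sketch only relies on the standard-library `pickle` module (xarray objects are picklable, so they can be passed in as `obj`).

```python
import pickle
from pathlib import Path


def save_pickle(obj, path):
    """Write any picklable object (e.g. an xarray Dataset) to `path`."""
    path = Path(path)
    # Create the output directory if it does not exist yet
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Read back an object previously written by save_pickle."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Pickles are convenient for intermediate files inside a single workflow run, but they are tied to the Python and library versions that wrote them, so they are probably not a good choice for long-term storage.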
Is your feature request related to a problem? Please describe. KAPy is built upon xarray, and xarray supports dask for parallel reading, processing and writing of data. It's a great tool and can potentially give substantial speed improvements.
Describe the solution you'd like Provide support for dask in KAPy, either by default or via a switch.
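One way the switch could look, as a rough sketch rather than a proposal for the actual KAPy API. The helper name `xarray_open_kwargs` and the `use_dask` flag are hypothetical; the only thing taken from xarray itself is the `chunks` argument of `xarray.open_dataset`, where `chunks=None` opens the file without dask and `chunks={}` opens it with dask using the backend's preferred chunking.

```python
def xarray_open_kwargs(use_dask=False, chunks=None):
    """Build keyword arguments for xarray.open_dataset (hypothetical helper).

    use_dask : bool
        Hypothetical configuration switch; False keeps in-memory behaviour.
    chunks : dict or None
        Optional explicit chunk sizes, e.g. {"time": 100}; only used with dask.
    """
    if not use_dask:
        # chunks=None: variables are not wrapped in dask arrays
        return {"chunks": None}
    # chunks={}: use dask with the backend's preferred chunking,
    # unless the caller supplied explicit chunk sizes
    return {"chunks": {} if chunks is None else chunks}


# Intended use (requires xarray; the path is a placeholder):
#   import xarray as xr
#   ds = xr.open_dataset("input.nc", **xarray_open_kwargs(use_dask=True))
```

Keeping the switch in one small function means the default can later be flipped from in-memory to dask without touching the individual import routines.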
Describe alternatives you've considered There are some issues to consider