Closed by juseg 1 month ago
I can only reproduce the v0.3.0 slowness by passing an explicit `chunks=-1` to `open_mfdataset`. I tested this on a few different versions of xarray, starting with `v2022.06.0`, and simply omitting the `chunks` argument consistently performs well. The `dask` dependency is missing, and that will be fixed with #76. However, I can no longer reproduce this issue.
Dask can be used to read only the necessary chunks of global data, which leads to a huge performance boost. I think this should be at least an option, and probably the default behaviour if dask is installed.
Preparing a Cocuy-1km atmosphere file with CHELSA data, I get a performance boost from 2m16s to 4s. The optimal chunk on my machine is a `{'y': 120}` horizontal stripe. I wonder if performance could be improved even more by storing global data in a tiled format instead of the original striped one.
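To get a rough feel for the striped versus tiled question, here is a minimal sketch that counts how many grid cells have to be read to cover a small window under different chunk layouts. The grid shape, the window and the 120 x 120 tile size are made-up values for illustration, not the actual CHELSA or Cocuy dimensions, and `cells_read` is a hypothetical helper, not part of any library.

```python
def cells_read(shape, chunks, window):
    """Count grid cells read when every chunk touching the window is loaded.

    shape  -- (ny, nx) size of the full grid
    chunks -- (cy, cx) nominal chunk size along y and x
    window -- (y0, y1, x0, x1) half-open index range of the region of interest
    """
    ny, nx = shape
    cy, cx = chunks
    y0, y1, x0, x1 = window
    # Index of the first and last chunk touched along each axis.
    first_y, last_y = y0 // cy, (y1 - 1) // cy
    first_x, last_x = x0 // cx, (x1 - 1) // cx
    # Chunks at the grid edge may be smaller than the nominal size.
    rows = min((last_y + 1) * cy, ny) - first_y * cy
    cols = min((last_x + 1) * cx, nx) - first_x * cx
    return rows * cols


# Hypothetical numbers: a global grid and a 200 x 200 cell target region.
shape = (21600, 43200)
window = (10000, 10200, 20000, 20200)

layouts = {
    "no chunking (whole grid)": shape,
    "striped, 120 rows": (120, shape[1]),
    "tiled, 120 x 120": (120, 120),
}
for name, chunks in layouts.items():
    print(f"{name}: {cells_read(shape, chunks, window):,} cells")
```

This only counts cells. Real read times also depend on how the file is laid out and compressed on disk, so a tiled layout reading fewer cells is not guaranteed to be proportionally faster.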