pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License
182 stars, 32 forks

Memory errors with xESMF 0.8, but not with 0.7.1 #353

jensdebruijn closed 2 months ago

jensdebruijn commented 2 months ago

I am regridding some large chunked datasets. On 0.7.1 this runs quite efficiently, but on 0.8 and later processing becomes much slower and runs out of memory. Does anybody have any idea why this could be the case? I am not using new features such as parallel processing (the code I use is identical).

aulemahal commented 2 months ago

I think you might be hitting the same issue as in #348. In 0.8 we introduced support for arrays chunked along the spatial dimensions, which changed the default chunking behaviour. See the PR for details, but I suggest passing output_chunks=(-1, -1) to the regridder call to see if it solves the problem.

Example:

import xesmf as xe
reg = xe.Regridder(ds, new_grid, 'bilinear')
# -1 asks for a single chunk along each spatial dimension of the output
ds2 = reg(ds, output_chunks=(-1, -1))

I kind of lost track of that PR. I will merge it soon and release a new patched version of xESMF so that the chunking behaviour returns to what it was in 0.7.1.
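For readers unfamiliar with the -1 in the chunk spec above, here is a small sketch of what it means. It assumes the usual dask convention that -1 stands for "one chunk spanning the whole dimension". The helper count_chunks, the grid shape, and the 200 x 200 chunk size are all made up for illustration; this does not reproduce xESMF's internal chunking logic.

```python
from math import ceil, prod


def count_chunks(shape, chunks):
    """Return how many blocks an array of `shape` is split into.

    Hypothetical helper, not part of xESMF or dask. It follows the dask
    convention that a chunk size of -1 means the whole dimension.
    """
    # Replace -1 with the full dimension length.
    sizes = [dim if c == -1 else c for dim, c in zip(shape, chunks)]
    # Blocks per dimension, multiplied together.
    return prod(ceil(dim / size) for dim, size in zip(shape, sizes))


out_shape = (1800, 3600)  # assumed (lat, lon) shape of the output grid

print(count_chunks(out_shape, (-1, -1)))    # 1
print(count_chunks(out_shape, (200, 200)))  # 162
```

With (-1, -1) the spatial output of each block stays in one piece, which matches the behaviour described for 0.7.1. Splitting the spatial dimensions multiplies the number of blocks to schedule, which may or may not be what you want for a given dataset.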

jensdebruijn commented 2 months ago

Cool, this fixes the issue, thanks! +1 on setting this as the default behaviour, as suggested in the pull request you linked.