Closed · dfulu closed this 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Please upload report for BASE (main@b652aca). Learn more about missing BASE report.
Out of curiosity: what's the specific error here? Nice catch though, and I'm always in favour of exposing more parameters.
I was getting a memory error with the previous settings. It seems the combination of settings we used was causing the whole array to be loaded eagerly into memory.
Yes, I would concur: I've faced this exact issue (`chunks=None` means you don't use Dask, so you load everything eagerly, as you say). Just wanted to make sure I understood!
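For reference, a quick sketch of the distinction; the path and variable name here are placeholders, not anything from this repo:

```python
import xarray as xr

# Placeholder store/variable names, just to illustrate the behaviour.
path = "data.zarr"

# chunks=None (the default): no Dask, so accessing the data pulls the
# whole array into memory as a NumPy array.
eager = xr.open_dataset(path, engine="zarr")

# chunks="auto": variables are wrapped in Dask arrays and stay on disk
# until explicitly computed.
lazy = xr.open_dataset(path, engine="zarr", chunks="auto")

print(type(eager["var"].data))  # numpy.ndarray
print(type(lazy["var"].data))   # dask.array.core.Array
```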
I have started using the dataloader to train a model. I found that without setting the `chunks="auto"` param in `xr.open_dataset()`, I get a memory error when trying to load multiple zarr files. I think this may be an edge case of using `open_dataset` alongside `combine_nested`. An alternative would be to replace `xr.open_dataset(path, engine="zarr", chunks="auto")` with `xr.open_zarr(path)`, which might be cleaner. I've also exposed a couple of the dataloader settings.