Closed shoyer closed 1 year ago
As an alternative, we could instead perhaps use make_template() internally inside ChunksToZarr.
I implemented this in https://github.com/google/xarray-beam/pull/62
Currently we support passing an xarray.Dataset full of chunked dask.array objects as template into ChunksToZarr.

This is convenient in simple cases, but makes it easy to write pipelines that are super slow to set up if you pass in a chunked Dataset with many small chunks (e.g., the default output of xarray.open_zarr()).

The breaking change here would be to require that the template argument was created via make_template(), by checking that each dask.array in the supplied Dataset consists of only a single chunk. We would also make zarr_chunks required when supplying a template, because it makes no sense to copy chunks from a template if using make_template().
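
For illustration, here is a minimal sketch of what the single-chunk check could look like. It is not the code from the pull request. It assumes dask's convention that an array's .chunks attribute is a tuple holding one tuple of block sizes per dimension, and the helper names count_blocks and check_single_chunk_template are hypothetical. To keep it dependency-free, it works on a plain mapping from variable name to .chunks value (None for variables that are not dask arrays) rather than on a real Dataset.

```python
from math import prod
from typing import Mapping, Optional, Tuple

# dask convention: one tuple of block sizes per dimension, e.g.
# ((5, 5), (10,)) describes a 10x10 array split into two blocks along axis 0.
Chunks = Tuple[Tuple[int, ...], ...]


def count_blocks(chunks: Chunks) -> int:
    """Return the number of blocks implied by a dask-style chunks tuple."""
    return prod(len(sizes) for sizes in chunks)


def check_single_chunk_template(
    chunks_by_name: Mapping[str, Optional[Chunks]],
) -> None:
    """Hypothetical validator: every chunked variable must be a single block.

    Variables whose chunks value is None are not dask arrays and are skipped.
    """
    for name, chunks in chunks_by_name.items():
        if chunks is None:
            continue
        n_blocks = count_blocks(chunks)
        if n_blocks != 1:
            raise ValueError(
                f"variable {name!r} has {n_blocks} chunks, but a template "
                "is expected to have exactly one chunk per variable"
            )


# A single block per variable passes silently.
check_single_chunk_template({"temperature": ((100,), (200,)), "time": None})

# Ten blocks along the first axis, as a finely chunked Dataset might have.
many_small_chunks = {"temperature": ((10,) * 10, (200,))}
try:
    check_single_chunk_template(many_small_chunks)
except ValueError as error:
    print(error)
```

With a real Dataset, the mapping could presumably be built from each variable's .chunks attribute. The block count is also a rough proxy for why setup gets slow: it grows multiplicatively with the number of chunks along each dimension.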