pangeo-data / rechunker

Disk-to-disk chunk transformation for chunked arrays.
https://rechunker.readthedocs.io/
MIT License

TypeError when rechunking #84

Closed jsadler2 closed 3 years ago

jsadler2 commented 3 years ago

I am trying to rechunk my dataset from {"time": 959, "lat": 112, "lon": 464} to {"time": 61376, "lat": 28, "lon": 29}, but I am getting a TypeError.

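import xarray as xr
import fsspec
from rechunker import rechunk

# `fs` was not defined in the original post; it is presumably an fsspec
# filesystem for the bucket, e.g. fs = fsspec.filesystem("s3") (assumption).

# Source dataset (Zarr store)
nldas_path = 'ds-drb-data/nldas'
nldas_full = fs.get_mapper(nldas_path)
ds_full = xr.open_zarr(nldas_full)

# Stores for the intermediate and final rechunked data
intermediate = fs.get_mapper('ds-drb-data/nldas_intermediate')
target = fs.get_mapper('ds-drb-data/nldas_timeseries_chunks')

# Chunk sizes keyed by dimension name (this is what triggers the error below)
target_chunks = {'time': 61376, "lat": 28, "lon": 29}

rechunk_plan = rechunk(ds_full, target_chunks, max_mem='200MB', target_store=target, temp_store=intermediate)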

This is the error I get when I call rechunk:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-d480b5036d1c> in <module>
----> 1 rechunk_plan = rechunk(ds_full, target_chunks, max_mem='200MB', target_store=target, temp_store=intermediate)

/srv/conda/envs/pangeo/lib/python3.7/site-packages/rechunker/api.py in rechunk(source, target_chunks, max_mem, target_store, target_options, temp_store, temp_options, executor)
    294         target_options=target_options,
    295         temp_store=temp_store,
--> 296         temp_options=temp_options,
    297     )
    298     plan = executor.prepare_plan(copy_spec)

/srv/conda/envs/pangeo/lib/python3.7/site-packages/rechunker/api.py in _setup_rechunk(source, target_chunks, max_mem, target_store, target_options, temp_store, temp_options)
    376                 temp_store_or_group=temp_group,
    377                 temp_options=options,
--> 378                 name=name,
    379             )
    380             copy_spec.write.array.attrs.update(variable_attrs)  # type: ignore

/srv/conda/envs/pangeo/lib/python3.7/site-packages/rechunker/api.py in _setup_array_rechunk(source_array, target_chunks, max_mem, target_store_or_group, target_options, temp_store_or_group, temp_options, name)
    479         itemsize,
    480         max_mem,
--> 481         consolidate_reads=consolidate_reads,
    482     )
    483 

/srv/conda/envs/pangeo/lib/python3.7/site-packages/rechunker/algorithm.py in rechunking_plan(shape, source_chunks, target_chunks, itemsize, max_mem, consolidate_reads, consolidate_writes)
    111     if len(source_chunks) != ndim:
    112         raise ValueError(f"source_chunks {source_chunks} must have length {ndim}")
--> 113     if len(target_chunks) != ndim:
    114         raise ValueError(f"target_chunks {target_chunks} must have length {ndim}")
    115 

TypeError: object of type 'int' has no len()

Any ideas? Did I do something wrong/weird?

jsadler2 commented 3 years ago

Now that I'm reading #83, it seems like I need to define my chunks differently, i.e. a dict for each variable.
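
For readers landing here with the same traceback: the last frame shows `len()` being called on an int, which suggests that one of the per-dimension integers ended up where a per-array chunk specification was expected. A minimal sketch of the fix hinted at above is to expand the dimension-keyed dict into one dict per variable. The helper name `per_variable_chunks` and the variable name `precip` are hypothetical, and treating `None` as "leave this array as-is" is an assumption that should be checked against the documentation for the installed rechunker version.

```python
from typing import Dict, Mapping, Optional, Sequence


def per_variable_chunks(
    var_dims: Mapping[str, Sequence[str]],
    dim_chunks: Mapping[str, int],
) -> Dict[str, Optional[Dict[str, int]]]:
    """Expand a {dim: chunk_size} dict into {variable: {dim: chunk_size}}.

    `var_dims` maps each variable name to its dimension names. With xarray
    it could be built as:
        {name: var.dims for name, var in ds.variables.items()}
    Variables with no dimensions, or with a dimension that has no entry in
    `dim_chunks`, are mapped to None (assumed to mean "do not rechunk").
    """
    result: Dict[str, Optional[Dict[str, int]]] = {}
    for name, dims in var_dims.items():
        if dims and all(d in dim_chunks for d in dims):
            result[name] = {d: dim_chunks[d] for d in dims}
        else:
            result[name] = None
    return result


# Hypothetical layout: one 3-D data variable plus its 1-D coordinates.
example_dims = {
    "precip": ("time", "lat", "lon"),
    "time": ("time",),
    "lat": ("lat",),
    "lon": ("lon",),
}
target_chunks = per_variable_chunks(
    example_dims, {"time": 61376, "lat": 28, "lon": 29}
)
```

The resulting `target_chunks` would then be passed to `rechunk` in place of the dimension-keyed dict from the original post.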