gallantlab / cottoncandy

sugar for s3
http://gallantlab.github.io/cottoncandy/
BSD 2-Clause "Simplified" License

dask api #79

Closed: anwarnunez closed this issue 2 years ago

anwarnunez commented 4 years ago

dask api changed a long time ago... lol, they're at 2.x and we're using the 0.10.x API

we need to update code for:

any change must be backwards compatible. need to check whether their API has changed w.r.t. loading distributed arrays :crossed_fingers:

sasha-kap commented 2 years ago

FYI, I just tried running the Dask example from the README with Python 3.7 and Dask 2.12.0 (on Google Colab). The s3_response = cci.upload_dask_array('test_dim', arr, axis=-1) call worked, but the dask_object = cci.download_dask_array('test_dim') call raised:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-95b6be780348> in <module>()
----> 1 dask_object = cci.download_dask_array('test_dim')
      2 dask_object

4 frames
/usr/local/lib/python3.7/dist-packages/cottoncandy/utils.py in iremove_root(self, object_name, *args, **kwargs)
    272             object_name = object_name[1:]
    273 
--> 274         return input_function(self, object_name, *args, **kwargs)
    275     return iremove_root
    276 

/usr/local/lib/python3.7/dist-packages/cottoncandy/interfaces.py in download_dask_array(self, object_name, dask_name)
    899                 for shape, part_name in metadata['dask']}
    900 
--> 901         return da.Array(dask, dask_name, chunks, shape = shape, dtype = dtype)
    902 
    903     @clean_object_name

/usr/local/lib/python3.7/dist-packages/dask/array/core.py in __new__(cls, dask, name, chunks, dtype, meta, shape)
   1054         else:
   1055             dt = None
-> 1056         self._chunks = normalize_chunks(chunks, shape, dtype=dt)
   1057         if self._chunks is None:
   1058             raise ValueError(CHUNKS_NONE_ERROR_MESSAGE)

/usr/local/lib/python3.7/dist-packages/dask/array/core.py in normalize_chunks(chunks, shape, limit, dtype, previous_chunks)
   2475 
   2476     if shape is not None:
-> 2477         chunks = tuple(c if c not in {None, -1} else s for c, s in zip(chunks, shape))
   2478 
   2479     if chunks and shape is not None:

/usr/local/lib/python3.7/dist-packages/dask/array/core.py in <genexpr>(.0)
   2475 
   2476     if shape is not None:
-> 2477         chunks = tuple(c if c not in {None, -1} else s for c, s in zip(chunks, shape))
   2478 
   2479     if chunks and shape is not None:

TypeError: unhashable type: 'list'

anwarnunez commented 2 years ago

Thanks @sasha-kap! This was very helpful in diagnosing and fixing the issue.
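
For anyone landing here with the same traceback: the final frame shows dask testing each chunk entry with `c not in {None, -1}`, which hashes `c`, so an entry that is a list fails. The sketch below reproduces that in plain Python. It assumes (the thread does not confirm this) that the chunk layout came back as nested lists, for example after a JSON round trip of the stored metadata. `chunks_to_tuples` and `is_hashable_for_set_lookup` are hypothetical helpers, not part of cottoncandy or dask, and this is not necessarily how the issue was fixed upstream.

```python
import json


def chunks_to_tuples(chunks):
    """Convert a nested list of chunk sizes into a tuple of tuples of ints."""
    return tuple(tuple(int(size) for size in dim) for dim in chunks)


def is_hashable_for_set_lookup(value):
    """Return True if `value in {None, -1}` evaluates without a TypeError."""
    try:
        value in {None, -1}
    except TypeError:
        return False
    return True


# Assumed scenario: chunk sizes were written as JSON and read back.
original_chunks = ((2, 2), (3,))
restored_chunks = json.loads(json.dumps(original_chunks))  # tuples become lists

# A list entry cannot be hashed, which matches the error in the traceback.
print(is_hashable_for_set_lookup(restored_chunks[0]))

# After conversion each entry is a tuple, so the membership test is safe.
safe_chunks = chunks_to_tuples(restored_chunks)
print(is_hashable_for_set_lookup(safe_chunks[0]))
```

Passing a tuple of tuples like `safe_chunks` as the chunks argument should sidestep the hashing error, though that has not been checked here against a specific dask release.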