cellarium-ai / cellarium-ml

Distributed single-cell data analysis.
BSD 3-Clause "New" or "Revised" License

OSError: Can't synchronously read data (anndata file) #183

Open ordabayevy opened 4 months ago

ordabayevy commented 4 months ago

This error occurs when training multiple models on the same dataset, which is stored in a Google Cloud Storage bucket.

  File "/mnt/disks/dev/repos/cellarium-ml/cellarium/ml/data/fileio.py", line 38, in read_h5ad_gcs
    with blob.open("rb") as f:
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 261, in read_h5ad
    adata = read_dispatched(f, callback=callback)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/experimental/_dispatch_io.py", line 48, in read_dispatched
    return reader.read_elem(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/utils.py", line 207, in func_wrapper
    return func(*args, **kwargs)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 256, in read_elem
    return self.callback(read_func, elem.name, elem, iospec=iospec)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 242, in callback
    **{
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 245, in <dictcomp>
    k: read_dispatched(elem[k], callback)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/experimental/_dispatch_io.py", line 48, in read_dispatched
    return reader.read_elem(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/utils.py", line 207, in func_wrapper
    return func(*args, **kwargs)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 256, in read_elem
    return self.callback(read_func, elem.name, elem, iospec=iospec)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 259, in callback
    return func(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 604, in read_sparse
    return sparse_dataset(elem).to_memory()
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_core/sparse_dataset.py", line 532, in to_memory
    mtx.data = self.group["data"][...]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/h5py/_hl/dataset.py", line 758, in __getitem__
    return self._fast_reader.read(args)
  File "h5py/_selector.pyx", line 376, in h5py._selector.Reader.read
OSError: Can't synchronously read data (inflate() failed)
Error raised while reading key 'X' of <class 'h5py._hl.group.Group'> from /

ordabayevy commented 4 months ago

There was a similar error before:

  File "/mnt/disks/dev/repos/cellarium-ml/cellarium/ml/data/fileio.py", line 38, in read_h5ad_gcs
    return read_h5ad(f)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 261, in read_h5ad
    adata = read_dispatched(f, callback=callback)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/experimental/_dispatch_io.py", line 48, in read_dispatched
    return reader.read_elem(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/utils.py", line 207, in func_wrapper
    return func(*args, **kwargs)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 256, in read_elem
    return self.callback(read_func, elem.name, elem, iospec=iospec)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 242, in callback
    **{
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 245, in <dictcomp>
    k: read_dispatched(elem[k], callback)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/experimental/_dispatch_io.py", line 48, in read_dispatched
    return reader.read_elem(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/utils.py", line 207, in func_wrapper
    return func(*args, **kwargs)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 256, in read_elem
    return self.callback(read_func, elem.name, elem, iospec=iospec)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 259, in callback
    return func(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 304, in read_mapping
    return {k: _reader.read_elem(v) for k, v in elem.items()}
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 304, in <dictcomp>
    return {k: _reader.read_elem(v) for k, v in elem.items()}
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/utils.py", line 207, in func_wrapper
    return func(*args, **kwargs)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 256, in read_elem
    return self.callback(read_func, elem.name, elem, iospec=iospec)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 259, in callback
    return func(elem)
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 378, in read_array
    return elem[()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/yordabay/anaconda3/envs/cellarium/lib/python3.10/site-packages/h5py/_hl/dataset.py", line 841, in __getitem__
    self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 242, in h5py.h5d.DatasetID.read
  File "h5py/_proxy.pyx", line 113, in h5py._proxy.dset_rw
OSError: Can't synchronously read data (wrong B-tree signature)
Error raised while reading key 'measured_genes_mask' of <class 'h5py._hl.dataset.Dataset'> from /layers
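
Both tracebacks fail inside h5py while the HDF5 file is being read through the stream returned by `blob.open("rb")`. One possible mitigation, untested against this issue and assuming the failures are transient read problems rather than a corrupted file in the bucket, is to copy the whole object into memory first and retry the parse on `OSError`. The sketch below is a hypothetical helper (`read_buffered_with_retries`, `open_fn`, `parse_fn`, `max_attempts`, and `delay_seconds` are illustrative names, not part of cellarium-ml, anndata, or h5py), and it only suits files that fit in RAM.

```python
import io
import time
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")


def read_buffered_with_retries(
    open_fn: Callable[[], BinaryIO],
    parse_fn: Callable[[BinaryIO], T],
    max_attempts: int = 3,
    delay_seconds: float = 1.0,
) -> T:
    """Copy a remote object into memory, then parse it; retry on OSError.

    Hypothetical helper for illustration only.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            with open_fn() as f:
                # Read every byte up front so the parser performs its
                # random-access reads against local memory, not the stream.
                buffer = io.BytesIO(f.read())
            return parse_fn(buffer)
        except OSError:
            # Re-raise the original error once the attempts are used up.
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


# Assumed usage inside read_h5ad_gcs (requires a real GCS blob, so it is
# left commented out here):
# adata = read_buffered_with_retries(lambda: blob.open("rb"), read_h5ad)
```

The traceback shows `read_h5ad` being called with a file object, so an in-memory `io.BytesIO` would presumably be accepted the same way. If the error still appears after buffering, that would point toward the stored file itself rather than the streaming read.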