scverse / anndata

Annotated data.
http://anndata.readthedocs.io
BSD 3-Clause "New" or "Revised" License

GPU writing #1549

Open ilan-gold opened 4 days ago

ilan-gold commented 4 days ago

Please describe your wishes and possible alternatives to achieve the desired result.

We should probably make a small change to allow GPU writing (at least with Dask) so that people don't get unexpected errors, or else raise an error specific to GPU arrays. Right now, the error is somewhat unintuitive, like this one from Ale:

/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/distributed/deploy/spec.py:324: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 8059 instead
  self.scheduler = cls(**self.scheduler_spec.get("options", {}))
/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/anndata/_core/anndata.py:271: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  self._init_as_actual(
2024-06-28 19:03:31,229 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9999, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f3521d27920>, (<class 'tuple'>, ['block_id']), (61, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98818838, 98828838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

2024-06-28 19:03:31,481 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9998, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f3520e69b20>, (<class 'tuple'>, ['block_id']), (60, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98808838, 98818838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

2024-06-28 19:03:31,652 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9997, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f34ddeb1080>, (<class 'tuple'>, ['block_id']), (59, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98798838, 98808838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

TypeError: float() argument must be a string or a real number, not 'csr_matrix'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/icb/alejandro.tejada/spatial-transformer/src/spatra/dataloader/pca_110m_rapids.py", line 172, in <module>
    write_elem(f, "X", combined.X)
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 359, in write_elem
2024-06-28 19:03:32,058 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9996, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f34dd523380>, (<class 'tuple'>, ['block_id']), (58, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98788838, 98798838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

    Writer(_REGISTRY).write_elem(store, k, elem, dataset_kwargs=dataset_kwargs)
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/anndata/_io/utils.py", line 243, in func_wrapper
2024-06-28 19:03:32,281 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9995, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f34dcd8d760>, (<class 'tuple'>, ['block_id']), (57, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98778838, 98788838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 309, in write_elem
    return write_func(store, k, elem, dataset_kwargs=dataset_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 57, in wrapper
    result = func(g, k, *args, **kwargs)
 2024-06-28 19:03:32,446 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9994, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f352016f6a0>, (<class 'tuple'>, ['block_id']), (56, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98768838, 98778838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/anndata/_io/specs/methods.py", line 355, in write_basic_dask_zarr
    da.store(elem, g, lock=GLOBAL_LOCK)
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/dask/array/core.py", line 1229, in store
    compute_as_if_collection(Array, store_dsk, map_keys, **kwargs)
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/dask/base.py", line 402, in compute_as_if_collection
    return schedule(dsk2, keys, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/distributed/client.py", line 3279, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
         2024-06-28 19:03:32,673 - distributed.worker - WARNING - Compute Failed
Key:       ('block-id-concatenate-lambda-store-map-8a43753420da751888fbb06cb003f4d6', 9993, 0)
Function:  execute_task
args:      ((<function store_chunk at 0x7f384247bf60>, (subgraph_callable-abdaf7c04824f35537d8987c6358073e, (subgraph_callable-6b46e4c55d6d79f14826a6497c94d1b3, <function read_sparse_as_dask.<locals>.make_dask_chunk at 0x7f3520b13ce0>, (<class 'tuple'>, ['block_id']), (55, 0))), <zarr.core.Array '/X' (110888864, 20310) float32>, (slice(98758838, 98768838, None), slice(0, 20310, None)), <SerializableLock: b0f2db08-9222-4aa5-a690-7632ec60456c>, False))
kwargs:    {}
Exception: "ValueError('setting an array element with a sequence.')"

     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/distributed/client.py", line 2372, in gather
    return self.sync(
           ^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/zarr/core.py", line 1450, in __setitem__
    self.set_orthogonal_selection(pure_selection, value, fields=fields)
^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/zarr/core.py", line 1639, in set_orthogonal_selection
    self._set_selection(indexer, value, fields=fields)
  ^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/zarr/core.py", line 1991, in _set_selection
    self._chunk_setitem(chunk_coords, chunk_selection, chunk_value, fields=fields)
^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/zarr/core.py", line 2259, in _chunk_setitem
    self._chunk_setitem_nosync(chunk_coords, chunk_selection, value, fields=fields)
^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/zarr/core.py", line 2263, in _chunk_setitem_nosync
    cdata = self._process_for_setitem(ckey, chunk_selection, value, fields=fields)
  ^^^^^^^^^^^^^^^^^
  File "/home/icb/alejandro.tejada/miniforge3/envs/pca110/lib/python3.11/site-packages/zarr/core.py", line 2324, in _process_for_setitem
    chunk[chunk_selection] = value
^^^^^^^^^^^
ValueError: setting an array element with a sequence.
Error raised while writing key 'X' of <class 'zarr.hierarchy.Group'> to /
2024-06-28 19:03:38,016 - distributed.nanny - WARNING - Worker process still alive after 4.0 seconds, killing

ilan-gold commented 4 days ago

It looks like we already allow writing for non-Dask GPU arrays, so we will do the same for Dask.
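
For reference, a minimal sketch of the "error out specifically" alternative mentioned in the first comment. It is not anndata's implementation: `is_gpu_backed` and `check_dask_writable` are hypothetical helpers, and the check assumes that GPU chunks can be recognised by their type living under the `cupy` or `cupyx` packages. The only Dask detail relied on is that Dask arrays describe their chunk type through the `_meta` attribute.

```python
def is_gpu_backed(obj: object) -> bool:
    """Guess whether `obj` lives on the GPU from its type's top-level module.

    Assumption: GPU arrays come from CuPy, so their types are defined
    under the `cupy` or `cupyx` packages.
    """
    root = type(obj).__module__.split(".")[0]
    return root in {"cupy", "cupyx"}


def check_dask_writable(elem: object, key: str) -> None:
    """Fail fast, before any chunk is computed, if `elem` has GPU-backed chunks.

    Hypothetical helper: objects without a `_meta` attribute (anything that
    is not a Dask array) are passed through untouched.
    """
    meta = getattr(elem, "_meta", None)
    if meta is not None and is_gpu_backed(meta):
        raise NotImplementedError(
            f"Cannot write key {key!r}: Dask array has GPU-backed chunks of "
            f"type {type(meta).__name__!r}. Move the chunks to host memory "
            "before writing."
        )
```

Calling such a check at the top of the Dask write path would surface one readable error instead of a stream of per-chunk `Compute Failed` warnings. Allowing the write instead, as decided above, would presumably mean copying each chunk to host memory before it is stored, for example with CuPy's `.get()`.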