zarr-developers / zarr-python

An implementation of chunked, compressed, N-dimensional arrays for Python.
http://zarr.readthedocs.io/
MIT License
1.41k stars 267 forks source link

Initial GPU support #1967

Open akshaysubr opened 1 month ago

akshaysubr commented 1 month ago

Adding an initial implementation to support GPU arrays. Currently limited to only arrays that support the __cuda_array_interface__ (cupy, numba, pytorch). Can extend this to supporting dlpack as well later for JAX and Tensorflow support.

TODO:

d-v-b commented 1 month ago

thanks for working on this @akshaysubr. what would we need to change on the CI side to run gpu tests?

akshaysubr commented 4 weeks ago

@d-v-b For CI, a GPU runner would be needed and the pipeline should install the optional [gpu] dependencies (just cupy).

rabernat commented 4 weeks ago

I would have thought that this line

https://github.com/zarr-developers/zarr-python/blob/143faeaf9007657cd688ea8afa2cc62e1135ae2b/.github/workflows/test.yml#L26

Would have picked up this new optional dependency

https://github.com/zarr-developers/zarr-python/blob/fe8591a877a7acd636f65d1f1c208254ba2095a6/pyproject.toml#L63-L65

But that didn't happen in this CI run: https://github.com/zarr-developers/zarr-python/actions/runs/9555282123/job/26338638922?pr=1967#step:5:1

d-v-b commented 4 weeks ago

But I wonder if we should move all of this to its own file: gpu_buffer.py ?

I could imagine something like this:

src/zarr/buffer 
|- gpu.py
|   |- Buffer
|- cpu.py
|   |- Buffer

leading to usage like

from zarr.buffer import cpu, gpu
...
cpu.Buffer()
...
gpu.Buffer()
akshaysubr commented 1 week ago

@d-v-b Thanks for the suggestion to refactor into two separate files. I made those changes and added some more that I think benefit the overall usage of these buffer classes. Essentially now, zarr.buffer.Buffer is an abstract class and zarr.buffer.cpu.Buffer and zarr.buffer.gpu.Buffer are concrete implementations which makes things cleaner. It also requires implementations to be specific about which type of Buffer they intend on using which I think is a good thing.