xarray-contrib / cupy-xarray

Interface for using cupy in xarray, providing convenience accessors.
https://cupy-xarray.readthedocs.io
Apache License 2.0
68 stars 13 forks source link

Let users decide which cupy to use #66

Open relativityhd opened 2 days ago

relativityhd commented 2 days ago

As of right now, there are three pip install packages which one can use to install cupy: cupy which will build cupy from source and the two prebuild versions cupy-cuda12x and cupy-cuda11x. This library only allows for cupy to be installed. This clashes with other libraries, e.g. cucim, which only allows installing with prebuild options. Also, having only the source building cupy option available introduces the risk of build-failures on systems which would work fine with the prebuild ones and costs time when installing this library.

Therefore, I would like to present two possible solutions.

First: Remove cupy dependency completely and change the installation instructions that the user must cupy by themselves.

Second: Remove cupy dependency from the dependency list and introduce 3 new optional dependencies:

pyproject.toml:

...
dependencies = [
    "cupy",
    "xarray>=2024.02.0",
]

[project.optional-dependencies]
test = [
    "dask",
    "pytest",
]
source = [
    "cupy"
]
cuda12 = [
    "cupy-cuda12x",
]
cuda11 = [
    "cupy-cuda11x",
]

...

That way, a user can install this library with e.g. pip install cupy-xarray[cuda12]

I would welcome any feedback and other ideas on this. If you make a decision about this, I would be happy to implement it and open a pull request.

jacobtomlinson commented 2 days ago

This is a very good point! I would personally lean towards option 1, but we should definitely add some runtime checking and a helpful error message if cupy is missing.

msarahan commented 2 days ago

Extras are kind of annoying because there's no default value. https://discuss.python.org/t/adding-a-default-extra-require-environment/4898

I have no opinion on what the best way forward on this particular library might be, but there is an effort towards expanding the metadata that can be captured and used for package selection. https://discuss.python.org/t/selecting-variant-wheels-according-to-a-semi-static-specification/53446/111 is one proposal. That has some implementation in the repos at https://github.com/wheel-next. That proposal depends on the concept of index priority as a way of avoiding complicating the current solver implementations. Because of that, we've had to also try to get index priority to be better accepted. https://discuss.python.org/t/pep-766-handling-multiple-indexes-index-priority/71589

It's highly unlikely that any of this will help solve your concern in the short or even medium term, but I hope you'll get involved in the upstream issues and proposals that I have linked if you are so inclined.

dcherian commented 2 days ago

Like Jacob, I lean towards (1) because it's less maintenance work. We should still import cupy and raise a nice error saying that cupy must be manually installed.