cupy / cupy

NumPy & SciPy for GPU
https://cupy.dev
MIT License
8k stars 792 forks source link

[Tracker] Implement all `scipy.*` APIs in CuPy #6324

Open kmaehashi opened 2 years ago

kmaehashi commented 2 years ago

Implement GPU version of scipy.* functions in cupyx.scipy.* namespace.

This is a tracker issue that summarizes the implementation status of each SciPy public module in CuPy. See the comparison table for details.

Legends

List of Modules

Note: Several modules in SciPy such as scipy.cluster.hierarchy accepts CuPy ndarrays as inputs through the Array API standard.

Starter Task

If you are new to CuPy and wondering where to start, here are some good first things to try. They are relatively simple and independent.

Steps to Contribute

  1. Fork and star :star: the CuPy repository :wink:

  2. Pick a function you want to work on from any of the modules listed above. You can find the function in the SciPy API Reference to understand what should be implemented. If you want to work on new modules (🥚), first discuss with us on this issue.

  3. Implement a function in your branch. If you need help, join Gitter or just ask for help in this issue.

  4. Don't forget to write a test code!

  5. Build CuPy and run tests to confirm that the function runs fine: pip install -e . && pytest tests/cupyx_tests/scipy_tests/PATH_TO_YOUR_TEST See the Contribution Guide for details.

  6. Submit a pull-request to the main branch.

Note that you will need a GPU environment to develop CuPy.

See also: NumPy API Tracker Issue #6078

leofang commented 2 years ago

cc: @cjnolet @quasiben

rgommers commented 2 years ago

This looks great, thanks @kmaehashi!

Miscellaneous routines (scipy.misc)

Note that we're quite likely deprecating this module in SciPy 1.9.0 (see https://github.com/scipy/scipy/issues/15608), so I suggest not working on that.

kmaehashi commented 2 years ago

Thanks for the info, @rgommers! Updated the list.

cjnolet commented 2 years ago

@kmaehashi, RAFT contains hierarchical clustering code as well: https://github.com/cupy/cupy/issues/3434.

RAFT also contains functions for computing contingency table and sampling from various distributions. Wondering if those might be useful here as well.

kmaehashi commented 2 years ago

Thanks @cjnolet! Added a link to the issue.

Pranavchiku commented 10 months ago

This is brilliant and interesting! I am Pranav, developer @lfortran, we are trying to compile scipy using LFortran, we are very close, almost got scipy.special.specfun working. I wish to ask, if we (cupy contributors) are translating fortran implementation present in SciPy to make it use cupy apis, right?

kmaehashi commented 10 months ago

Hi @Pranavchiku! Yes, we are rewriting SciPy code into CUDA C/C++. You'll get the idea by exploring our code base under cupyx/scipy: https://github.com/cupy/cupy/tree/main/cupyx/scipy/special

Pranavchiku commented 10 months ago

Note that you will need a GPU environment to develop CuPy.

Do we need explicit GPU environment, or integrated GPUs work? By integrated I mean Apple GPU that is available with mac m1, m2, etc. chips.

kmaehashi commented 10 months ago

Discrete GPUs are required, currently we support CUDA and HIP (for AMD). https://docs.cupy.dev/en/latest/overview.html

Zhyrek commented 8 months ago

Hey, I need non-batched BLAS functions for what I'm working on, so I will be working on implementing those in the near future. Just wondering if there are any thoughts on a best-practices for implementing these.

Roughly how I imagine using these functions (collapsible code section): ```py @jit.rawkernel(device=True) def square(a): matrix2 = cupyx.jit.local_memory(float, (2, 2)) #I'll also try implementing multi-dimensional local/shared arrays matrix2[0][0] = a matrix2[1][1] = a return cupyx.scipy.linalg.det(matrix2) #can be used in device functions @jit.rawkernel() def elementwise_2_x_squared(x, size): #computes 2*x^2, toy example which uses linalg functions tid = jit.grid(1) ntid = jit.gridsize(1) matrix = cupyx.jit.local_memory(float, (2, 2)) for i in range(tid, size, ntid): stored_val = x[i] x[i] = square(stored_val) matrix[0][0] = stored_val matrix[1][1] = stored_val x[i] += cupyx.scipy.linalg.det(matrix) #can be used inside raw kernel c = cp.array([1., 2.]) elementwise_2_x_squared((256,), (256,), (c, 2)) print(c) #should return [2., 8.] ```

The main use case I have is roughly: I have a large array of scalar values (could be hundreds of millions). For each value, I need to generate a (potentially not small) matrix, solve it, and then use the result to compute a scalar output value. For batched code, I would need to store these matrices altogether (potentially hundreds of GB memory - not feasible), while for non-batched, dynamically generated matrices, the memory requirements would just be the scalar values.

rcremese commented 8 months ago

Hello, I'm quite new to the open-source community but I would like to contribute to this project. Is anybody on the integration of scipy.constants module ?

ev-br commented 3 months ago

It seems possible to mark scipy.interpolate, scipy.signal and scipy.signal.windows as :shipit: now?

ev-br commented 3 months ago

Also, I wonder if the comparison table (super-useful BTW!) , https://docs.cupy.dev/en/stable/reference/comparison.html#scipy-cupy-apis is updated automatically on a release, or is it manual?

Would be great to remove SciPy deprecated items (signal.cmplx_sort, windows from scipy.signal namespace, linalg.tri{l,u} etc)

Pranavchiku commented 3 months ago

Hey all, I was always willing to contribute here and I have my first PR https://github.com/cupy/cupy/pull/8305 opened that ports scipy.stats.kstat. I do not have access to GPUs hence have not tested locally, I followed boxcox_llf and it was fun and easy :)

kmaehashi commented 3 months ago

It seems possible to mark scipy.interpolate, scipy.signal and scipy.signal.windows as :shipit: now?

Good catch! Updated the list :shipit:

Also, I wonder if the comparison table (super-useful BTW!) , https://docs.cupy.dev/en/stable/reference/comparison.html#scipy-cupy-apis is updated automatically on a release, or is it manual?

latest page is updated automatically for each commit merged into main branch. stable page is updated automatically for each stable (v13.*) release.

Would be great to remove SciPy deprecated items (signal.cmplx_sort, windows from scipy.signal namespace, linalg.tri{l,u} etc)

The page is generated with this tool. Deprecated items are currently listed in footnotes section to describe why we don't implement them. https://github.com/cupy/cupy/blob/1aa70f0de8c78afcfa503837d9baee3d80af4bc8/docs/source/_comparison_generator.py#L206-L207

kmaehashi commented 3 months ago

Hey all, I was always willing to contribute here and I have my first PR #8305 opened that ports scipy.stats.kstat. I do not have access to GPUs hence have not tested locally, I followed boxcox_llf and it was fun and easy :)

Hi @Pranavchiku, thank you for your interest in contributing! We, however, would like to kindly ask all contributors to build and test the branch before submitting a pull request. I understand that there are situations where access to GPUs is difficult but in that case please consider using offerings such as Google Colab.

ev-br commented 2 months ago

Thanks @kmaehashi A couple of comments, now that I looked at this tracker a bit more:

kmaehashi commented 2 months ago

Thanks for the suggestions @ev-br!

  • With KDTree and Delaunay implemented, maybe scipy.spatial is fair to be marked as :shipit:, too. Submodules (spatial.transform and spatial.distance might be a separate story, don't have an opinion)

Absolutely, marked :shipit:!

  • scipy.linalg.{blas, lapack} are unlikely to be implemented at all, correct?

Can't really say they are unlikely at all, as we have non-public (undocumented) cupyx.lapack module. I think we can say they're considered low-priority, though.

  • even more so for scipy.linalg.cython_{blas,lapack}. Maybe even add an "out of scope" type symbol, ❌ or some such?

Sounds great, marked ❌.

  • scipy.io --- is there an appetite for ever adding functionality for reading from matlab to CuPy? I suppose using scipy.io followed by cupy.asarray is the way, is it?

I began to wonder if I could delete it from the list, as the audience is very limited compared to NumPy's I/O routines (e.g., cupy.load which we implemented by wrapping numpy.load).

  • Several SciPy subpackages support CuPy array inputs via Array API, e.g. scipy.cluster. It might be nice to show these too here?

Nice, added a link to the Array API support docs!