rapidsai / build-planning

Tracking for RAPIDS-wide build tasks
https://github.com/rapidsai

NumPy 2.0 support #38

Open jakirkham opened 2 months ago

jakirkham commented 2 months ago

NumPy 2.0 is coming out soon ( https://github.com/numpy/numpy/issues/24300 ). NumPy 2.0.0rc1 packages for conda & wheels came out 2 weeks back ( https://github.com/numpy/numpy/issues/24300#issuecomment-2030603395 )

Ecosystem support for NumPy 2.0 is being tracked in issue: https://github.com/numpy/numpy/issues/26191

Also conda-forge is discussing how to support NumPy 2.0: https://github.com/conda-forge/conda-forge.github.io/issues/1997

When building against NumPy 2.0, the default settings make it possible to build packages that are compatible with both NumPy 1 & 2. By default, NumPy targets the oldest NumPy version that was built for the Python version being targeted


Developed the following list by installing RAPIDS 24.04 and inspecting which packages used NumPy. Specifically ran the commands below

conda install -n base conda-tree -y
conda create -n rapids-24.04 -c rapidsai -c conda-forge -c nvidia rapids=24.04 python=3.11 cuda-version=12.2 -y
conda tree -n rapids-24.04 whoneeds numpy

This generated a list of dependencies. Some of these were RAPIDS packages themselves. So removed those from the list. Also dropped some indirect dependencies of RAPIDS. Admittedly this can get a little subjective. Though tried to capture a sufficiently complete, though not overly detailed, picture
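
For environments that are not managed by conda, a rough analogue of the `whoneeds` query can be sketched with the standard library. This is only an illustration, not the method used to build the list: the helper names `requirement_name` and `who_needs` are hypothetical, and the scan only sees direct requirements declared in installed package metadata (including ones that apply only to optional extras).

```python
import re
from importlib import metadata


def requirement_name(req):
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
    if not match:
        return ""
    # Normalize runs of "-", "_" and "." to "-" and lowercase the name.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def who_needs(target="numpy"):
    """List installed distributions that declare a direct requirement on `target`."""
    target = re.sub(r"[-_.]+", "-", target).lower()
    dependents = set()
    for dist in metadata.distributions():
        for req in dist.requires or []:
            if requirement_name(req) == target:
                name = dist.metadata["Name"]
                if name:
                    dependents.add(name)
                break
    return sorted(dependents)


if __name__ == "__main__":
    for name in who_needs("numpy"):
        print(name)
```

The output depends entirely on what is installed, so it will not match the conda-based list exactly.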


From this, have built the table below

Some versions have question marks if...

Blank entries mean no information is known about those fields at this time

| Package | Supported | Released | Version | Upstream issue/PR |
| --- | --- | --- | --- | --- |
| Arrow | Y | Y | 16.0.0 | https://github.com/apache/arrow/issues/39532 |
| Bokeh | Y | Y | 3.4.1 | https://github.com/bokeh/bokeh/issues/13835 |
| Branca | Y | Y | 0.7.2 | https://github.com/python-visualization/branca/pull/163 |
| CuPy | Y | Y | 13.2.0 | https://github.com/cupy/cupy/issues/8306 |
| Dask | Y | Y | 2024.5.1 | https://github.com/dask/dask/issues/11066 |
| Datashader | Y | Y | 0.16.2 | https://github.com/holoviz/datashader/issues/1324 |
| folium | Y | Y? | 0.16.0? | https://github.com/python-visualization/folium/issues/1937 |
| GDAL | Y | Y | 3.9.0 | https://github.com/OSGeo/gdal/issues/9751 |
| HoloViews | Y | Y | 1.19.0 | https://github.com/holoviz/holoviews/pull/5979 & https://github.com/holoviz/holoviews/pull/6238 |
| Hypothesis | Y | Y | 6.100.2 | https://github.com/HypothesisWorks/hypothesis/pull/3955 |
| imagecodecs | Y | Y | 2024.6.1 | https://github.com/cgohlke/imagecodecs/issues/100 |
| imageio | Y | Y | 2.34.2 | https://github.com/imageio/imageio/issues/1077 |
| mapclassify | Y | Y | 2.6.1? | https://github.com/pysal/mapclassify/pull/188 |
| Matplotlib | Y | Y | 3.8.4 | https://github.com/matplotlib/matplotlib/issues/26778 |
| Numba | Y | Y | 0.60.0 | https://github.com/numba/numba/issues/9544 |
| Pandas | Y | Y | 2.2.2 | https://github.com/pandas-dev/pandas/issues/55519 |
| PyTorch | Y | Y | 2.3.0 | https://github.com/pytorch/pytorch/issues/107302 |
| PyWavelets | Y | Y | 1.6.0 | https://github.com/PyWavelets/pywt/pull/731 |
| scikit-image | Y | Y | 0.23.1 | https://github.com/scikit-image/scikit-image/issues/7282 |
| scikit-learn | Y | Y | 1.4.2 | https://github.com/scikit-learn/scikit-learn/issues/27075 |
| SciPy | Y | Y | 1.13.0 | https://github.com/scipy/scipy/pull/20375 |
| Shapely | Y | Y | 2.0.4? | https://github.com/shapely/shapely/issues/1972 |
| TensorFlow | | | | https://github.com/tensorflow/tensorflow/issues/67291 |
| tifffile | Y | Y | 2024.4.24 | https://github.com/cgohlke/tifffile/issues/252 |
| treelite | Y | Y | 4.2.1 | https://github.com/dmlc/treelite/issues/560 |
| Xarray | Y | Y | 2024.06.0 | https://github.com/pydata/xarray/issues/8844 |
| XGBoost | Y | Y | 2.1.0 | https://github.com/dmlc/xgboost/issues/10221 |
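
One way the table could be used is to check whether an existing environment already has releases at least as new as the listed ones. The sketch below is only an illustration under two assumptions: that the Version column is read as the first release with NumPy 2 support, and that the PyPI distribution names in `MIN_NUMPY2_VERSIONS` (a hypothetical name, as are the helpers) correspond to the table rows. Only a handful of rows are included, and pre-release suffixes such as `rc1` are ignored by the simplified comparison.

```python
import re
from importlib import metadata

# Versions copied from the table; the distribution names are assumptions.
MIN_NUMPY2_VERSIONS = {
    "cupy-cuda12x": "13.2.0",
    "dask": "2024.5.1",
    "numba": "0.60.0",
    "pandas": "2.2.2",
    "scipy": "1.13.0",
    "xgboost": "2.1.0",
}


def release_tuple(version):
    """Leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def is_at_least(installed, minimum):
    """Compare numeric release components, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MIN_NUMPY2_VERSIONS):
    """Map each name to True/False, or None when the package is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
            continue
        report[name] = is_at_least(installed, minimum)
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(name, "not installed" if ok is None else ok)
```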

Note to editors: Also attaching the CSV file used to generate this table (as editing Markdown tables can be tricky 😅). Would suggest making any changes in the CSV file locally (with Excel or other). Then you can use prettytable (available on PyPI & Conda-forge) to generate Markdown with this code. The resulting content can be copy-pasted above. Can drag and drop the CSV file into this textbox to attach it

prettytable code:

```python
import prettytable

# Read the CSV file into a PrettyTable
with open("rapids_numpy_pkgs.csv", "r") as fh:
    t = prettytable.from_csv(fh, delimiter=",", lineterminator="\n")

# Render the table as Markdown and write it out
t.set_style(prettytable.MARKDOWN)
with open("rapids_numpy_pkgs.md", "w") as fh2:
    fh2.write(str(t))
```
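
If prettytable is not available, a similar conversion can be sketched with only the standard library. This is an alternative illustration, not the code used for the table above; the function name `csv_to_markdown` is hypothetical, and the output is a plain pipe table without the column padding prettytable produces.

```python
import csv
import io


def csv_to_markdown(text):
    """Convert CSV text (first row = header) into a Markdown pipe table."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    width = len(header)

    def fmt(cells):
        # Pad short rows with blanks and escape literal pipes inside cells.
        cells = (list(cells) + [""] * width)[:width]
        return "| " + " | ".join(c.strip().replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "Package,Supported,Released,Version\nNumba,Y,Y,0.60.0\nTensorFlow,,,\n"
    print(csv_to_markdown(sample))
```

Blank CSV fields are kept as empty cells, which matches how unknown entries are shown in the table.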
jameslamb commented 1 month ago

Thanks @jakirkham ! I think this is a great approach.

I looked through this list of dependencies today and can't think of any others or a different approach to identify them. And I checked the statuses of all the not-yet-released ones and don't see any changes.

jakirkham commented 1 month ago

Went through the project list again earlier today and also now

Main change was that the GDAL release went out

Also Numba RCs are available

Dask may work with NumPy 2, but needs reconfirmation

Added a better issue link for TensorFlow

Tried to also split apart when upstream has fixes (like Dask or XGBoost) from whether they are released. Hopefully that gives a bit more visibility into the state of NumPy 2 support

rgommers commented 1 month ago

dask 2024.5.1, datashader 0.16.2, imagecodecs 2024.6.1, and treelite 4.2.1 all have numpy 2.0-compatible releases out now.

vyasr commented 1 month ago

Thanks Ralf!

jakirkham commented 1 month ago

Thanks for the reminder! 🙏

Have refreshed the table above

rgommers commented 3 weeks ago

Numba 0.60.0, Xarray 2024.6.0, and CuPy 13.2.0 were all released. So looks like things are mostly good here (a few rough edges left).

rgommers commented 1 week ago

Imageio and XGBoost were both released as well.

FirefoxMetzger commented 1 week ago

You can cross ImageIO off the list ... we now support numpy v2.0 as of ImageIO v2.34.2

jakirkham commented 1 week ago

Thanks all! 🙏

Have refreshed the list

Looks like we are down to TensorFlow. This is only needed in some cases. So think it makes sense to start doing this work at this point

seberg commented 4 days ago

It seems hdbscan is a dependency of cuml. I will push them to fix that, though; they probably just need to redo their wheel build (but for now they just added a numpy<2 pin instead). (Turns out there are some other issues, although they don't seem NumPy 2 related.)

xref https://github.com/scikit-learn-contrib/hdbscan/pull/644 (fix, but remaining issue)
xref https://github.com/scikit-learn-contrib/hdbscan/issues/642 (issue)