activeloopai / deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Mozilla Public License 2.0

Support for Numpy 2.0 #2881

Closed nvoxland-al closed 2 weeks ago

nvoxland-al commented 2 weeks ago

🚀 🚀 Pull Request

Impact

Description

NumPy 2.0 changes some of its behavior in ways that are not compatible with the existing deeplake code.

This PR updates deeplake to be compatible with both NumPy 1.x and 2.0.

Things to be aware of

NumPy removed support for can_cast() between Python types and NumPy types as part of NEP 50 (https://numpy.org/neps/nep-0050-scalar-promotion.html). With this PR, deeplake's cast checks still support comparing a Python sample against a dtype, but they follow the new NumPy standard of not taking the sample's value into account.
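To illustrate the idea of a value-independent cast check, here is a minimal sketch. It is not the code in this PR: the helper names `sample_dtype` and `can_cast_sample` are hypothetical, and the mapping of Python ints/floats to int64/float64 is an assumption taken from the description. It only passes dtypes to `np.can_cast`, which works on both NumPy 1.x and 2.0.

```python
import numpy as np


def sample_dtype(sample):
    """Map a sample to a NumPy dtype without inspecting its value.

    Hypothetical helper: Python scalars are assumed to map to the
    default 64-bit types (bool is checked first, since bool is a
    subclass of int).
    """
    if isinstance(sample, (np.ndarray, np.generic)):
        return sample.dtype
    if isinstance(sample, bool):
        return np.dtype(np.bool_)
    if isinstance(sample, int):
        return np.dtype(np.int64)
    if isinstance(sample, float):
        return np.dtype(np.float64)
    # Fall back to NumPy's own inference for lists and other objects.
    return np.asarray(sample).dtype


def can_cast_sample(sample, target_dtype):
    """Return True if the sample's dtype can be safely cast to target_dtype.

    Only dtypes are passed to np.can_cast, never the Python value itself.
    """
    return np.can_cast(sample_dtype(sample), np.dtype(target_dtype))


print(can_cast_sample(5, "int32"))            # Python int is treated as int64
print(can_cast_sample(np.int32(5), "int32"))  # already an int32 scalar
```

Under these assumptions the small value `5` is rejected for an int32 target just like a large one would be, because only the type is considered.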

Because Python ints/floats are treated as np.int64/np.float64, this means that you can no longer add a Python int to an int32 tensor; you must explicitly create it as an np.int32.

This better matches other codepaths deeplake already has, so it brings more consistency. However, there may be cases where this doesn't work with existing scripts.
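For existing scripts that pass plain Python ints, the explicit conversion described above might look like the following sketch. It uses only NumPy; the helper name `to_int32` is hypothetical and not part of deeplake, and the range check is an added precaution rather than something the PR prescribes.

```python
import numpy as np


def to_int32(value):
    """Convert a Python int to np.int32, failing loudly if it does not fit.

    Hypothetical helper for illustration; the explicit range check avoids
    relying on version-specific overflow behavior.
    """
    info = np.iinfo(np.int32)
    if not info.min <= value <= info.max:
        raise OverflowError(f"{value} does not fit in int32")
    return np.int32(value)


# A single sample, created explicitly as int32 instead of a Python int.
sample = to_int32(5)

# A batch of samples, with the dtype stated up front.
batch = np.asarray([1, 2, 3], dtype=np.int32)

print(sample.dtype, batch.dtype)
```

Values built this way already carry the int32 dtype, so a dtype-only cast check against an int32 tensor has nothing to reject.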

Things to worry about

Additional Context

sonarcloud[bot] commented 2 weeks ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

nvoxland-al commented 2 weeks ago

Closing this in favor of a NumPy < 2.0 version constraint (#2880) for now.

Will revisit this as demand for NumPy 2.0 increases.