activeloopai / deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Mozilla Public License 2.0
8.08k stars, 616 forks

Manage temp tensor files in memory rather than sending them to storage #2819

Open nvoxland-al opened 6 months ago

nvoxland-al commented 6 months ago

🚀 🚀 Pull Request

Impact

Description

With a large number of temp tensors, managing their metadata on disk becomes time-consuming. This PR avoids that overhead by keeping temp tensors in memory instead of sending them to storage.
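
To illustrate the idea, here is a minimal sketch of routing temp-tensor keys to an in-memory dict while all other keys go to the regular backing store. It is not the code in this PR: the `TempInMemoryStore` class, the `TEMP_PREFIX` marker, and the example keys are hypothetical names chosen for illustration, and a plain dict stands in for the real storage provider.

```python
from collections.abc import MutableMapping
from typing import Dict, Iterator

# Hypothetical marker for temp-tensor keys; Deep Lake's real naming may differ.
TEMP_PREFIX = "_temp_"


class TempInMemoryStore(MutableMapping):
    """Routes temp-tensor keys to a dict and everything else to `backing`."""

    def __init__(self, backing: MutableMapping) -> None:
        self.backing = backing  # stands in for disk or cloud storage
        self.temp: Dict[str, bytes] = {}  # deliberately unbounded in this sketch

    def _target(self, key: str) -> MutableMapping:
        # Pick the in-memory dict for temp keys, the backing store otherwise.
        return self.temp if key.startswith(TEMP_PREFIX) else self.backing

    def __getitem__(self, key: str) -> bytes:
        return self._target(key)[key]

    def __setitem__(self, key: str, value: bytes) -> None:
        self._target(key)[key] = value

    def __delitem__(self, key: str) -> None:
        del self._target(key)[key]

    def __iter__(self) -> Iterator[str]:
        yield from self.backing
        yield from self.temp

    def __len__(self) -> int:
        return len(self.backing) + len(self.temp)


backing: Dict[str, bytes] = {}  # a plain dict stands in for the real provider
store = TempInMemoryStore(backing)
store["_temp_labels/meta.json"] = b"{}"  # stays in memory only
store["images/meta.json"] = b"{}"  # written through to the backing store
```

In this sketch the temp key never reaches `backing`, so no storage-side metadata has to be maintained for it, while reads through `store` still see both kinds of keys.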

Things to be aware of

This PR does not attempt to limit the size of the temp tensor cache. Temp tensors are currently used only for class_labels, which should not hold large amounts of data.

nvoxland-al commented 6 months ago

This currently does not work with scheduler=processed. I am going to get feedback before looking at how to handle that better.

codecov[bot] commented 6 months ago

Codecov Report

Attention: Patch coverage is 96.03175%, with 5 lines in your changes missing coverage. Please review.

| Files | Patch % | Lines |
|---|---|---|
| deeplake/core/storage/provider.py | 94.44% | 3 Missing :warning: |
| deeplake/core/storage/local.py | 92.30% | 1 Missing :warning: |
| deeplake/core/storage/lru_cache.py | 90.90% | 1 Missing :warning: |


sonarcloud[bot] commented 5 months ago

Quality Gate passed

Issues
4 New issues
0 Accepted issues

Measures
0 Security Hotspots
95.4% Coverage on New Code
0.0% Duplication on New Code
