unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0
2.15k stars 130 forks

USearch v3: Text Indexes, Faster Views, Cleaner API and Docker #277

Closed ashvardanian closed 11 months ago

ashvardanian commented 1 year ago

The v3 release of USearch is bringing several new features, and provides an excellent opportunity to refactor existing ones!

Candidates:

var77 commented 1 year ago

Thank you for your work! It would also be great to have batch add functionality in the Rust bindings.

monatis commented 1 year ago

Thanks for the awesome work!

  • [ ] Docker image with a UCall

Will this support multiple indexes / collections? Like endpoints for creating a collection (vector index), listing available collections, searching in the given collection etc. all in a single Docker container.

ashvardanian commented 1 year ago

@monatis, yes, that's the plan 🤗

monatis commented 1 year ago

Super news. It's already great for embedded use but a dockerized service mode can cater for new use cases.

ashvardanian commented 12 months ago

I've released SimSIMD v2, which will be included in USearch v3. It brings several performance and versatility improvements and may soon feature Intel AMX support and more bitwise metrics.

philippemnoel commented 12 months ago

Thank you for your work! Do you have an estimate on when USearch v3 will be ready?

ashvardanian commented 12 months ago

Thank you, @philippemnoel! I plan to release it before the end of October.

Here is the progress so far:

Once we fix the build issues in UCall CI, all the pieces will fall into place, and the release will be easier to prepare. Feel free to help there 🤗

aehlke commented 11 months ago

Would be amazing to prioritize langchainJS integration. I can run it on web and mobile and desktop easily with that. You'll get attention more quickly from their many users.

ashvardanian commented 11 months ago

@aehlke, I believe USearch is already available in LangChainJS.

aehlke commented 11 months ago

Ah I meant for this new text search functionality - maybe there's nothing to be added on their side. Looking forward to this PR, thanks for the work.

ashvardanian commented 11 months ago

:tada: This PR is included in version 2.8.3 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

monatis commented 10 months ago

Hi @ashvardanian, it seems that this PR was merged without the v3 changes. Are you still planning to release them, and do you have a new ETA?

I was thinking of using USearch in an upcoming product but a stable file format might be important.

Not necessarily related to v3, but are you planning to support incrementally adding vectors to memory-mapped files? If not, what is the recommended procedure for adding new vectors after the initial upload?

aehlke commented 10 months ago

Is there an update on text index and related updates originally planned? Thanks

ashvardanian commented 10 months ago

Hello @monatis and @aehlke,

I hope you're both doing well. The past few weeks have been incredibly busy as we wrapped up our year-long projects. You might have come across some of these initiatives:

Collaborating with large organizations has its challenges, particularly in terms of pace. This has required us to adjust our timelines slightly. Nonetheless, progress on v3 is steady, along with other updates and integrations, including contributions to SciPy and a potential integration with Sklearn.

Beating existing solutions in vector and text search, clustering, dimension reduction, and external memory access is doable. But achieving that in one package under 10,000 lines of maintainable code compatible with every OS and hardware architecture is very tricky. I want the design to persist for years, so I'm taking the time to get it right rather than rushing.

Thank you for your continued support and patience! 🤗

aehlke commented 10 months ago

I appreciate the context, best of luck!