unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0

Documenting Memory Usage #244

Closed ashvardanian closed 11 months ago

ashvardanian commented 1 year ago

Describe what you are looking for

Recent benchmarks by @davvard suggest that, even for the same f32 vectors, USearch uses up to 4x less memory than FAISS. We should double-check those numbers and measure memory consumption on the same test dataset we used for the FAISS comparison on the front page.
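
One way to make such a measurement comparable across engines is to report the growth in peak resident memory while the index is built, relative to the raw size of the f32 vectors. The sketch below is only an illustration under stated assumptions: `build` is a hypothetical placeholder for whatever code constructs the USearch or FAISS index, the helper names are made up for this example, and the standard-library `resource` module is available on Unix-like systems only.

```python
import resource
import sys
from typing import Callable


def peak_rss_bytes() -> int:
    """Peak resident set size of the current process, in bytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    return peak if sys.platform == "darwin" else peak * 1024


def raw_vector_bytes(count: int, dimensions: int, bytes_per_scalar: int = 4) -> int:
    """Size of the vectors alone; 4 bytes per scalar corresponds to f32."""
    return count * dimensions * bytes_per_scalar


def measure_peak_growth(build: Callable[[], object]) -> tuple[object, int]:
    """Run `build` (a hypothetical index constructor) and return its result
    together with the growth in peak RSS observed while it ran."""
    before = peak_rss_bytes()
    index = build()  # keep a reference so the index is not freed early
    return index, peak_rss_bytes() - before


def overhead_ratio(measured_bytes: int, count: int, dimensions: int) -> float:
    """Measured memory divided by the raw f32 payload; 1.0 means no overhead."""
    return measured_bytes / raw_vector_bytes(count, dimensions)


if __name__ == "__main__":
    # Stand-in workload: a plain buffer the size of 100,000 vectors of 256 f32 values.
    count, dimensions = 100_000, 256
    _, growth = measure_peak_growth(lambda: bytearray(raw_vector_bytes(count, dimensions)))
    print(f"peak RSS growth: {growth / 2**20:.1f} MiB")
    print(f"overhead ratio:  {overhead_ratio(growth, count, dimensions):.2f}")
```

Peak RSS is a high-water mark, so the growth reads as zero if the process already peaked higher earlier. Running each engine in a fresh process on the same dataset avoids that and keeps the two numbers comparable.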

Is your feature request specific to a certain interface?

It applies to everything

patelprateek commented 1 year ago

@ashvardanian: 4x lower memory consumption looks amazing. I would like to try this in a production use case. Can you give some pointers on why the memory usage is reduced? Also, how does this compare with nmslib/hnswlib? And when you mention FAISS, which particular ANN algorithm are you comparing against (I assume FAISS's implementation of HNSW)?

ashvardanian commented 1 year ago

@patelprateek, you are right, we are comparing against IndexHNSW. The memory consumption is much lower because we don’t use any STL containers or default memory allocators. Everything is handcrafted for our HNSW :)

ashvardanian commented 11 months ago

@patelprateek, we’ve recently published a large study describing the tradeoff between speed, recall, and memory usage, among other things. When we align the engines along their recall curves, memory consumption isn’t better, but speed can be orders of magnitude higher.

It’s probably worth comparing with other engines as well, but we are currently focused more on features than on broader benchmarks. Don’t hesitate to reach out and share your use case with us on Discord 🤗