hora-search / hora

🚀 efficient approximate nearest neighbor search algorithm collections library written in Rust 🦀 .
http://horasearch.com/
Apache License 2.0

Possible to add sparse elements? #39

Open rob-p opened 2 years ago

rob-p commented 2 years ago

Thanks for the very nice library! I'm interested in using hora for nearest neighbor search in single-cell genomics. The data of interest consist of very high-dimensional points (D = 30,000), but for most points, most dimensions have the value 0. I'd therefore like to avoid densifying the elements before indexing them (it's not really feasible). Is there some way to provide a custom implementation of the relevant distance metrics for the indexed type, so that I don't have to insert a dense representation of the points into the index?
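To make the request concrete, here is a minimal sketch of the kind of sparse point and distance function in question. `SparseVec` and `squared_euclidean` are hypothetical names used only for illustration; they are not part of hora's API, and nothing here shows how (or whether) such a type could be plugged into a hora index. The sketch assumes each point stores only its non-zero entries as `(index, value)` pairs, sorted by index, with no duplicate indices.

```rust
/// Hypothetical sparse point (not a hora type): only non-zero entries are
/// stored, as (dimension index, value) pairs sorted by index.
#[derive(Debug, Clone)]
struct SparseVec {
    dim: usize,
    entries: Vec<(u32, f32)>,
}

impl SparseVec {
    /// Builds a sparse point, dropping explicit zeros and sorting by index.
    /// Assumes the caller passes no duplicate indices.
    fn new(dim: usize, mut entries: Vec<(u32, f32)>) -> Self {
        entries.retain(|&(_, v)| v != 0.0);
        entries.sort_by_key(|&(i, _)| i);
        SparseVec { dim, entries }
    }
}

/// Squared Euclidean distance computed by merging the two sorted entry
/// lists. Cost is proportional to the number of non-zeros, not to `dim`.
fn squared_euclidean(a: &SparseVec, b: &SparseVec) -> f32 {
    assert_eq!(a.dim, b.dim, "points must share the same dimensionality");
    let (mut i, mut j) = (0, 0);
    let mut sum = 0.0f32;
    while i < a.entries.len() && j < b.entries.len() {
        let (ia, va) = a.entries[i];
        let (ib, vb) = b.entries[j];
        if ia == ib {
            // Both points are non-zero in this dimension.
            sum += (va - vb) * (va - vb);
            i += 1;
            j += 1;
        } else if ia < ib {
            // Only `a` is non-zero here; `b` contributes an implicit 0.
            sum += va * va;
            i += 1;
        } else {
            // Only `b` is non-zero here; `a` contributes an implicit 0.
            sum += vb * vb;
            j += 1;
        }
    }
    // Whatever remains in either list has no counterpart in the other.
    sum += a.entries[i..].iter().map(|&(_, v)| v * v).sum::<f32>();
    sum += b.entries[j..].iter().map(|&(_, v)| v * v).sum::<f32>();
    sum
}
```

With D = 30,000 and only a handful of non-zero entries per point, a function like this touches just those entries, which is the behavior I'd like to get from the index without ever materializing the dense vectors.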

kacperlukawski commented 2 years ago

The project no longer seems to be maintained, but since we're doing something similar at Qdrant (https://github.com/qdrant/qdrant), I think I can answer that question. These tools are designed mainly to support neural embeddings, which are typically dense rather than sparse.