yungyuc closed 2 weeks ago
So far, I have found a Faiss method whose results match linear search 100%, which gives me a baseline for comparisons (this took more time than expected, as most Faiss methods generally carry an error margin of about 4-5%). All comparisons currently use cosine similarity.
I have designed a simple benchmark that runs searches under multiple parameter settings and checks whether all search results (top_k) are identical; this check can tolerate floating-point-level discrepancies.
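As a rough illustration of this kind of check, here is a minimal sketch. It assumes one reading of the description above: the benchmark compares the returned IDs rather than the raw scores, so tiny floating-point differences in the scores do not affect the outcome. The names `IdList` and `match_rate` are hypothetical and do not come from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type: the top_k IDs returned for one query, best match first.
using IdList = std::vector<std::int64_t>;

// Fraction of queries whose top_k ID lists are identical in both result sets.
// Only IDs are compared, so score differences at floating-point level are ignored.
inline double match_rate(const std::vector<IdList>& reference,
                         const std::vector<IdList>& candidate)
{
    assert(reference.size() == candidate.size());
    if (reference.empty()) {
        return 1.0;  // assumption of this sketch: no queries means nothing disagrees
    }
    std::size_t same = 0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        if (reference[i] == candidate[i]) {
            ++same;
        }
    }
    return static_cast<double>(same) / static_cast<double>(reference.size());
}
```

A rate of 1.0 would mean that every query returned the same IDs in the same order. Note that this sketch counts a reordering of tied scores as a mismatch, which may or may not be the desired behavior.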
I have also implemented the first version of my own method, but I didn't do anything particularly new: I simply re-implemented the Python linear search in C++ with the STL. The underlying data structure is still a vector.
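For reference, a linear search of this kind can be sketched roughly as follows. This is not the project's actual code: the names `Hit`, `cosine_similarity`, and `linear_search` are hypothetical, and storing the data as a `std::vector` of `std::vector<float>` is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type: (index of the stored vector, cosine similarity).
using Hit = std::pair<std::size_t, float>;

// Cosine similarity of two equal-length vectors.
// Returns 0 if either vector has zero norm (an assumption of this sketch).
inline float cosine_similarity(const std::vector<float>& a,
                               const std::vector<float>& b)
{
    assert(a.size() == b.size());
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += static_cast<double>(a[i]) * b[i];
        norm_a += static_cast<double>(a[i]) * a[i];
        norm_b += static_cast<double>(b[i]) * b[i];
    }
    if (norm_a == 0.0 || norm_b == 0.0) {
        return 0.0f;
    }
    return static_cast<float>(dot / (std::sqrt(norm_a) * std::sqrt(norm_b)));
}

// Scores every stored vector against the query and returns the top_k best hits,
// highest similarity first; ties are broken by the lower index.
inline std::vector<Hit> linear_search(const std::vector<std::vector<float>>& data,
                                      const std::vector<float>& query,
                                      std::size_t top_k)
{
    std::vector<Hit> hits;
    hits.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        hits.emplace_back(i, cosine_similarity(data[i], query));
    }
    const std::size_t k = std::min(top_k, hits.size());
    std::partial_sort(hits.begin(),
                      hits.begin() + static_cast<std::ptrdiff_t>(k),
                      hits.end(),
                      [](const Hit& x, const Hit& y) {
                          if (x.second != y.second) return x.second > y.second;
                          return x.first < y.first;
                      });
    hits.resize(k);
    return hits;
}
```

`std::partial_sort` orders only the first `top_k` entries, which avoids a full sort when `top_k` is much smaller than the number of stored vectors.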
Additionally, after briefly reviewing the official documentation, I found that Faiss does not seem to support the remove operation, so I plan to skip that part for now.
The next step should be to study Faiss’s methods more closely to understand what optimizations they’ve made and where they achieve their speed. Since I am due to present a paper in our lab soon, I think reviewing their research paper would be a good approach.
Sounds good. Reviewing what you did is a solid way to plan ahead. But it will be more productive to focus on creating issues. Actions matter.
Could you please create GitHub issues to track what you planned to do?