Open applecv3 opened 4 years ago
The plug-in uses pure cosine similarity or dot product to compare vectors. So the K nearest neighbors it returns are the exact K, not an approximation like LSH and other methods.
On Wed, Oct 28, 2020, 10:17 AM Seung notifications@github.com wrote:
Hi, I just want to know which type of KNN (like HNSW, LSH, and so forth) you built in this plugin.
Thank you for your answer! Let me ask a few more questions. By "pure cosine-similarity", do you mean a naive KNN search? Does it take O(N) time (where N is the number of documents explored when computing cosine similarity)? If so, I'm not sure how your plugin works faster than the others. I saw you mentioned that "I gained this substantial speed improvement by using the lucene index directly". Is that the whole secret(?) behind this plugin's speed?
Yes, it uses brute force to calculate cosine similarity, meaning O(n). It is not faster than hnswlib, faiss, etc. It is faster than other ES plugins that did the same brute-force calculations. The only difference was using the Lucene engine. You can see the code :-)
BTW - Amazon has an hnswlib implementation in their managed ES offering. It should be much faster than this, but it has limitations.
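To make the O(n) point concrete, here is a minimal sketch of exact brute-force top-K search by cosine similarity. It is plain Python, not the plugin's Java code; the function names (`cosine`, `exact_top_k`) and the toy `docs` dictionary are made up for illustration. Every document is scored once, so the cost grows linearly with the number of documents, and the result is the exact top K rather than an approximation.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(query, docs, k):
    """Score every document against the query and keep the k best.

    docs maps a (hypothetical) document id to its vector.
    One similarity computation per document: O(n) in the number of documents.
    """
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in docs.items())
    return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]


# Toy corpus, purely illustrative.
docs = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
    "d": [-1.0, 0.0],
}
top = exact_top_k([1.0, 0.1], docs, 2)
print(top)
```

Index-based methods such as HNSW avoid scoring every document, which is where their speed advantage comes from, at the price of approximate results.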
Thank you so much! I really appreciate it. Have a good day!
BTW, we use k-means with this plug-in in order to traverse only the clusters nearest to the input vector instead of the entire corpus.
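A rough sketch of that idea, in plain Python: cluster the document vectors with k-means ahead of time, then at query time pick the nearest centroid(s) and score only the documents assigned to them. How the clusters are actually stored and filtered alongside the plugin is not described in this thread, so everything here (the function names, the `n_probe` parameter, the fixed starting centroids) is an assumption for illustration. Note that restricting the search to the nearest clusters makes the result approximate, since a true neighbor may sit in a cluster that was skipped.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest_centroid(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def kmeans(vectors, centroids, iterations=10):
    """Plain Lloyd's algorithm, starting from the given centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            groups[nearest_centroid(v, centroids)].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def assign(docs, centroids):
    """Map each cluster index to the ids of the documents assigned to it."""
    clusters = {i: [] for i in range(len(centroids))}
    for doc_id, vec in docs.items():
        clusters[nearest_centroid(vec, centroids)].append(doc_id)
    return clusters


def candidate_ids(query, centroids, clusters, n_probe=1):
    """Ids of documents in the n_probe clusters nearest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(query, centroids[i]))
    return [doc_id for i in order[:n_probe] for doc_id in clusters[i]]


# Toy corpus and starting centroids, purely illustrative.
docs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
centroids = kmeans(list(docs.values()), [[1.0, 0.0], [0.0, 1.0]])
clusters = assign(docs, centroids)
candidates = candidate_ids([1.0, 0.2], centroids, clusters, n_probe=1)
print(candidates)
```

Only the candidate documents would then go through the brute-force cosine scoring, so the per-query cost drops from the whole corpus to the size of the probed clusters. Raising `n_probe` recovers more of the true neighbors at the cost of scanning more documents.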
@lior-k Hi,
What is the difference between this repo and the native ES vector scoring? Which one is faster?
Thanks
Never tested. This plugin existed long before the official support. If you do test the performance differences, please let us all know 🙏
@lior-k Can the algorithm used by the plug-in be configured? Using brute force to calculate cosine similarity is not suitable for high-performance scenarios.