milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

Add support for ScaNN index #2771

Open SaiKiranBurle opened 4 years ago

SaiKiranBurle commented 4 years ago

Is your feature request related to a problem? Please describe.
Google recently released a new algorithm for finding Approximate Nearest Neighbors called ScaNN (https://github.com/google-research/google-research/tree/master/scann). Their benchmarks show that it can perform significantly better than existing solutions such as Annoy, FAISS, and hnsw.

Describe the solution you'd like
Add ScaNN as a new index type for Milvus.

Describe alternatives you've considered
None

Additional context
The paper describing ScaNN and how it differs from existing approaches is at https://arxiv.org/pdf/1908.10396.pdf

JinHai-CN commented 4 years ago

@SaiKiranBurle Good, we are investigating this algorithm and considering integrating it into Milvus.

sa- commented 3 years ago

Google has a managed service that implements ScaNN in an alpha state btw.

SaiKiranBurle commented 3 years ago

@sa- Can you point me to that managed service?

sa- commented 3 years ago

Added you on LinkedIn as Samay Kapadia, I'll send a link through there

rostandkenne commented 3 years ago

@JinHai-CN @SaiKiranBurle @sa- I was working on a standalone ScaNN service but just realized that Faiss has released a version that outperforms ScaNN. https://github.com/facebookresearch/faiss/issues/1399

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

raynor08 commented 2 years ago

Hi, any update or progress on this?

xiaofan-luan commented 2 years ago

Let's keep it open. We are waiting for volunteers.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

xiaofan-luan commented 2 years ago

keep it open. Any progress on it?

filip-halt commented 1 year ago

Any status updates?

xiaofan-luan commented 1 year ago

Faiss ScaNN will be supported in 2.3

xiaofan-luan commented 1 year ago

/assign @liliu-z

vamossagar12 commented 1 year ago

Hello @xiaofan-luan, I was actually working on this some time ago. Here's a branch which has the changes: https://github.com/vamossagar12/knowhere/tree/MEP-15. I had to stop because I have an M1 laptop and the SCANN libraries were not building. It seems to be an open issue, as mentioned here: https://github.com/google-research/google-research/issues/1082.
It looks like we have switched to FAISS SCANN, which hopefully has better APIs to work against. I was using the ones provided by Google Research, which had only Python APIs, so I had to reverse engineer a lot to integrate them into the codebase. @liliu-z, if you need any help or assistance, let me know; I would still be happy to contribute. I would need to refresh my memory, but it should be a good exercise :D

xiaofan-luan commented 1 year ago

Thanks for the contribution @vamossagar12. Faiss ScaNN seems to be faster and, as you mentioned, it could work on M1 CPUs. We will release the Faiss ScaNN support soon! Cheers

vamossagar12 commented 1 year ago

Thanks @xiaofan-luan for the update. I am not sure if FAISS SCANN was available when I was working on this (or maybe it was and I didn't check). Nonetheless, I will leave this to the experts then :)

sualehasif commented 1 year ago

@xiaofan-luan do you know when 2.3 will be out? Is there a beta that we can test? Also, are there guides on how to performance-tune Milvus to get sub-10ms query times?

xiaofan-luan commented 1 year ago

Hi @sualehasif. From our tests, sub-10ms search is possible; the trick is to keep CPU usage low (< 50%) and use an HNSW index. 2.3 will be released sometime this March. We will be working actively on ScaNN and GPU support, so let's see if we can catch the train. If you can share more details, such as the dimension of the vector data, how many machines you have, and the data size, that will be super helpful.

BobLiu20 commented 1 year ago

@xiaofan-luan Any update on ScaNN?

liliu-z commented 1 year ago

Hi @BobLiu20, sorry about the delay. For now, we are targeting Q3 2023 for this.
/assign @chasingegg

liliu-z commented 7 months ago

SCANN is supported from Milvus 2.3.
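
For anyone arriving from search, here is a minimal sketch of what requesting a SCANN index might look like from pymilvus on Milvus 2.3 or later. The parameter names (`nlist`, `with_raw_data`, `nprobe`, `reorder_k`) reflect my understanding of the Milvus index documentation and should be checked against the docs for your version. The helper names, the collection name, the `embedding` field name, the default values, and the localhost address are all hypothetical and not taken from this thread.

```python
def scann_index_params(nlist=1024, metric_type="L2", with_raw_data=True):
    """Build the index-parameter dict for a SCANN index (values are illustrative)."""
    if nlist < 1:
        raise ValueError("nlist must be a positive integer")
    return {
        "index_type": "SCANN",
        "metric_type": metric_type,
        "params": {"nlist": nlist, "with_raw_data": with_raw_data},
    }


def scann_search_params(nprobe=16, reorder_k=100, metric_type="L2"):
    """Build the search-parameter dict to pass alongside a query (values are illustrative)."""
    return {
        "metric_type": metric_type,
        "params": {"nprobe": nprobe, "reorder_k": reorder_k},
    }


def build_scann_index(collection_name, field_name="embedding"):
    """Create a SCANN index on an existing collection.

    Assumes a Milvus server is reachable on localhost:19530 and that the
    collection already has a float-vector field called `field_name`.
    """
    # Imported lazily so the two helpers above work without pymilvus installed.
    from pymilvus import Collection, connections

    connections.connect(alias="default", host="localhost", port="19530")
    collection = Collection(collection_name)
    collection.create_index(field_name=field_name, index_params=scann_index_params())
    return collection
```

The two dict builders run on their own; `build_scann_index("my_collection")` needs a running Milvus instance. Raising `nprobe` or `reorder_k` generally trades query latency for recall, so treat the defaults here as starting points only.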