Open xiaofan-luan opened 1 year ago
Hi @xiaofan-luan, I am interested in contributing and would like to know how to help. Let me know how to get started.
Cool, man! I think @liliu-z could offer you some help.
Do you have any experience with C++ and any idea about the HNSW algorithm yet?
Hi, I took a C++ course in college and have primarily worked as a Java developer for about 3 years, so I feel I can onboard quickly and am comfortable working in C++. I am really new to this algorithm but am getting up to speed. I am going through this documentation: https://www.pinecone.io/learn/hnsw/ Feel free to point me to other resources. I hope this is not a dealbreaker!
Hi @noble-8, this is on our roadmap, so please feel free to make a PR for https://github.com/milvus-io/knowhere . I suggest we start with SQ8, which is easier to implement. You are also more than welcome to open another issue in Knowhere for more detailed discussion.
/assign @liliu-z
Sounds good. Will do!
@noble-8 any progress on it?
I could not make any progress. I will try again; however, feel free to reassign this if I do not make a commit.
Sure, and thanks for the interest anyway! I would also like to help if you are interested.
I just saw that https://github.com/milvus-io/knowhere is now archived. Is this issue still open?
Or should the PR be addressed to https://github.com/zilliztech/Knowhere instead from now on?
To summarize, to make sure that I understand this correctly: Knowhere already implements HNSW, which runs on top of high-dimensional vectors. Am I correct?
It has been archived and moved to https://github.com/zilliztech/knowhere, sorry for the misunderstanding.
You are correct, my man. We want to add quantization support for the HNSW index and integrate it with Milvus.
Thanks. How urgently do you folks need this? My C++ is rusty, so it may take a while (I have Copilot, so that helps).
But I love this challenge.
@xiaofan-luan if you folks have patience to spare, then assign this to me.
Edit: I tried to hack around and it seems that it's a bit too much for me to take on this time. I'll pick another good first issue to ramp up.
@xiaofan-luan I think this issue is not easy for beginners; it needs a lot of background knowledge 😅
Agreed, you might be correct.
SQ might be OK, though?
But true, it requires a full understanding of Milvus.
Let's remove the good first issue label.
I found that HNSW-SQ8 is available on Zilliz Cloud. Is that true?
Zilliz Cloud doesn't use HNSW. We have an internal index named Cardinal~
Can Milvus use an HNSW PQ index now? @xiaofan-luan
/assign @liliu-z
@liliu-z do we have a plan to support HNSW PQ and SQ indexes?
@xiaofan-luan Can you provide some guidance on where modifications are needed to support the HNSW PQ index?
NP, I think Li (@liliu-z) can help with that.
@liliu-z @xiaofan-luan emmm.... where is liliu-z
Sure, there are two ways to support HNSW + Quantization:
We are adopting the first way, so the work includes:
Here is an example PR for the first step. It supports SQ8 for HNSW on the Knowhere side.
@liliu-z it seems HNSW_PQ is not using faiss but using hnswlib after quantization...
We now prefer to use hnswlib rather than faiss for HNSW, so we need to backport the PQ and SQ features.
@liliu-z @xiaofan-luan I find it too hard to backport the PQ feature to hnswlib myself, but HNSW PQ is very beneficial and important for me, so I hope Milvus can support the HNSW PQ index as soon as possible.
@xiaofan-luan I think maybe Milvus could temporarily support the HNSW PQ index via faiss.
Yes, we are discussing with the faiss team the possibility of switching to faiss' HNSW. There are still some gaps, such as performance, features, and APIs. We will support PQ in hnswlib if we finally decide not to go with faiss.
Will keep this thread updated on any progress.
@liliu-z I get it, and I'd like to know when we can use the HNSW PQ index in Milvus.
@alexanderguzhva please help with this. We want to make sure faiss HNSW has similar performance to, and exactly the same functionality as, the current Knowhere implementation.
@xiaofan-luan @liliu-z Since Knowhere already supports the hnsw_sq index, when will Milvus support it?
@xiaofan-luan I'm already in the middle of deprecating hnswlib in favor of faiss; the work has been in progress for some time.
@xiaofan-luan Actually, I want to know when Milvus will support the hnsw_sq index, since Knowhere already supports it.
It has not been fully tested yet. We can try to ship it as a beta feature in the next release, which is 1-2 weeks from now. What do you think, @xiaofan-luan?
@liliu-z please make sure the Knowhere side is ready.
@tedxu could you assign someone to support HNSW PQ/SQ and do some testing?
Actually, an offline data pipeline can load the index faster and avoid the index training process. Is there a way for me to train the index offline and load it at container (standalone Milvus) startup? @xiaofan-luan
That might be our goal.
Using Milvus with more offline index nodes should help.
If you already build the index yourself, why not simply serve it with faiss or hnsw?
Is there an existing issue for this?
Is your feature request related to a problem? Please describe.
SQ8 and PQ are widely used in ANN search. If you want to understand more about quantization, Faiss is probably one of the best code bases to explore.
HNSW is the fastest index in the open-source world, so why not make it work together with SQ and PQ to accelerate it further?
Let me know if anyone is interested, and we can offer more help on it.
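For anyone getting up to speed, here is a rough illustration of what SQ8 does to the vectors an index stores. This is a minimal pure-Python sketch, not Knowhere or Faiss code: the function names (`train_sq8`, `encode_sq8`, `decode_sq8`, `l2_sq_to_code`) are hypothetical, and it assumes the simplest variant, where each dimension's observed min/max range is mapped uniformly onto 256 levels.

```python
from typing import List, Sequence, Tuple


def train_sq8(vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Learn a per-dimension minimum and step size from training vectors."""
    dim = len(vectors[0])
    mins = [min(v[d] for v in vectors) for d in range(dim)]
    maxs = [max(v[d] for v in vectors) for d in range(dim)]
    # 255 steps span the range; fall back to 1.0 when a dimension is constant
    # so that encoding never divides by zero.
    steps = [((hi - lo) / 255.0) or 1.0 for lo, hi in zip(mins, maxs)]
    return mins, steps


def encode_sq8(vec: Sequence[float], mins: Sequence[float], steps: Sequence[float]) -> bytes:
    """Quantize each float to one byte (0..255)."""
    codes = []
    for x, lo, step in zip(vec, mins, steps):
        q = round((x - lo) / step)
        codes.append(max(0, min(255, q)))  # clamp values outside the trained range
    return bytes(codes)


def decode_sq8(codes: bytes, mins: Sequence[float], steps: Sequence[float]) -> List[float]:
    """Reconstruct approximate floats from the one-byte codes."""
    return [lo + c * step for c, lo, step in zip(codes, mins, steps)]


def l2_sq_to_code(query: Sequence[float], codes: bytes,
                  mins: Sequence[float], steps: Sequence[float]) -> float:
    """Squared L2 distance between a float query and a quantized vector."""
    recon = decode_sq8(codes, mins, steps)
    return sum((q - x) ** 2 for q, x in zip(query, recon))
```

With 32-bit floats as input, each dimension shrinks to one byte, so the stored vectors take roughly a quarter of the memory. The cost is a reconstruction error of at most half a step per dimension for values inside the trained range. A graph index would keep its links unchanged and call something like `l2_sq_to_code` wherever it currently compares against a full-precision vector.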
Describe the solution you'd like.
No response
Describe an alternate solution.
No response
Anything else? (Additional Context)
No response