Open AGulshan opened 2 weeks ago
Hi @AGulshan, thanks for raising the request! We will plan this improvement. To help me better assess its priority, could you please share a bit more detail on how the sparse vector feature is used in your flow with the Milvus Kafka connector? For example, is this for hybrid search with BM25 or SPLADE?
@nianliuu can you help take a look at this feature request?
Sure, I will take a look and make a release soon. Thank you! @AGulshan
Hi, @codingjaguar, @nianliuu!
Thank you for your prompt response!
We are currently experimenting with both dense and sparse embeddings produced by the BGE-M3 model to optimize our search capabilities. We also plan to explore other approaches that use sparse vectors for hybrid search, possibly including BM25 or SPLADE in the future.
Adding SparseVector support to kafka-connect-milvus is essential for us to build and optimize our data pipelines, and it would give us much more flexibility in managing complex embeddings.
Your consideration of this upgrade is much appreciated!
Best regards, Gulshan
Thanks for the context! We are also launching more support for hybrid search in Milvus 2.5 (to be released by mid-November), such as full-text search capability equivalent to Elasticsearch's. Please feel free to email me with any questions you have during the experiment. My email is jiang.chen@zilliz.com
Hi,
I'm writing to request an upgrade of milvus-sdk-java to version 2.4.2 in the kafka-connect-milvus project. The newer version supports SparseVector, which is essential for efficiently managing sparse data.
Advantages of Upgrading:
- SparseVector Support: Key for handling sparse, high-dimensional data.
- Improved Features and Performance: New SDK versions typically offer enhanced performance, more features, and fixes.
Suggested Changes:
- Update the SDK version in pom.xml to 2.4.2.
- Test to ensure the new SDK version works well with the project.
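To make the testing step more concrete, here is a minimal sketch of how a sparse vector arriving in a Kafka record might be normalized before insertion. It rests on two assumptions that should be checked against the SDK documentation: that milvus-sdk-java 2.4.x represents a sparse float vector as a `SortedMap<Long, Float>` of dimension index to weight, and that the record carries the vector as a JSON-style map with string keys. The class name `SparseVectorSketch` and the helper `toSparseVector` are hypothetical and are not part of kafka-connect-milvus or the SDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class SparseVectorSketch {

    // Hypothetical helper: converts a JSON-style map (string index -> weight)
    // into an index-ordered map, dropping zero weights.
    // Assumption: the SDK accepts SortedMap<Long, Float> for sparse vectors.
    static SortedMap<Long, Float> toSparseVector(Map<String, ? extends Number> raw) {
        SortedMap<Long, Float> sparse = new TreeMap<>();
        for (Map.Entry<String, ? extends Number> entry : raw.entrySet()) {
            long index = Long.parseLong(entry.getKey());
            if (index < 0) {
                throw new IllegalArgumentException("Negative dimension index: " + index);
            }
            float weight = entry.getValue().floatValue();
            if (weight != 0.0f) {
                sparse.put(index, weight);
            }
        }
        return sparse;
    }

    public static void main(String[] args) {
        // Example payload as it might appear after JSON deserialization.
        Map<String, Double> raw = new HashMap<>();
        raw.put("4", 1.25);
        raw.put("1", 0.5);
        raw.put("7", 0.0); // zero weight, dropped

        System.out.println(toSparseVector(raw)); // prints {1=0.5, 4=1.25}
    }
}
```

Using a `TreeMap` keeps the indices in ascending order regardless of the order in which the record's entries arrive, so the same payload always yields the same vector.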
Upgrading can significantly improve the project's capability in processing sparse data types. I look forward to your thoughts and am ready to help with updating and testing the new version.
Thanks! Gulshan