In this PR, I implemented the same interface as `llm-chain-qdrant`, with some caveats:
- All collection manipulations (i.e. creation, partitioning, index creation, loading, etc.) are left to the user.
- The `add_text` and `add_documents` methods return `Vec<String>`, where the strings are UUIDs generated by the client in qdrant. Milvus has an `auto_id` FieldSchema, so we return the ids generated by Milvus. Arguably we should test that an id field exists :thinking: ?
- qdrant offers a direct payload field, whereas Milvus does not, so on insert I store the metadata and the document content in a VARCHAR field column. The content and metadata are put in a HashMap and sent to Milvus as a string. VARCHAR fields have a `max_length` limit of ~65000. I kept this approach so as not to break the API.
- `search_similarity` has the same issue with retrieving and parsing the metadata.
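
To make the VARCHAR caveat concrete, here is a minimal, dependency-free sketch of packing content and metadata into one string and rejecting payloads that would not fit. It is not the code in this PR: `pack_payload`, `escape_json`, `PayloadTooLong`, the `HashMap<String, String>` metadata type, the JSON layout, and the 65,535-byte limit are all assumptions made for illustration (the PR only states "~65000"). Real code would more likely use a crate such as `serde_json` for the encoding.

```rust
use std::collections::HashMap;

/// Assumed byte limit for the VARCHAR column (the PR only says "~65000").
pub const ASSUMED_MAX_LENGTH: usize = 65_535;

/// Hypothetical error returned when the packed payload exceeds the limit.
#[derive(Debug, PartialEq)]
pub struct PayloadTooLong {
    pub len: usize,
    pub max_length: usize,
}

/// Escape a string so it can be placed inside a JSON string literal.
pub fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical helper: pack content and metadata into a single JSON object
/// string, failing if its byte length exceeds `max_length`.
pub fn pack_payload(
    content: &str,
    metadata: &HashMap<String, String>,
    max_length: usize,
) -> Result<String, PayloadTooLong> {
    // Sort keys so the output is deterministic (HashMap order is not).
    let mut keys: Vec<&String> = metadata.keys().collect();
    keys.sort();

    let meta: Vec<String> = keys
        .iter()
        .map(|k| format!("\"{}\":\"{}\"", escape_json(k), escape_json(&metadata[*k])))
        .collect();

    let payload = format!(
        "{{\"content\":\"{}\",\"metadata\":{{{}}}}}",
        escape_json(content),
        meta.join(",")
    );

    // `String::len` counts bytes; this sketch assumes the limit is in bytes too.
    if payload.len() > max_length {
        return Err(PayloadTooLong { len: payload.len(), max_length });
    }
    Ok(payload)
}
```

Checking the size before the insert surfaces an oversized document as an explicit error on the client side, instead of relying on whatever Milvus does with a string that is too long.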
I think that rethinking the VectorStore trait would probably clean up this implementation (and others?) a bunch.
Solves #146.