Open amitkayal opened 1 year ago
Hello,
At the moment, we do not provide support for commercial Vector DBs. However, you can modify the db.py file to customize it according to your requirements.
Currently, our platform only supports the all-mpnet and CLIP models, as they are state-of-the-art embedding models. We are actively working towards integrating more diverse and flexible models.
We do have a rudimentary duplicate file detection mechanism in place. You can refer to the implementation in anything.py, specifically line 88, for more information.
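If you need a stricter file-level check, a common technique is to hash each file's contents and skip any file whose digest has already been seen. The sketch below assumes that approach; it is not necessarily what anything.py does, and `file_digest` and `SeenFiles` are hypothetical names introduced here for illustration.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never fully loaded in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


class SeenFiles:
    """Tracks content digests so that identical files are processed only once."""

    def __init__(self):
        self._digests = set()

    def is_new(self, path):
        # True the first time this content is seen, False on any repeat,
        # even if the repeat has a different file name.
        digest = file_digest(path)
        if digest in self._digests:
            return False
        self._digests.add(digest)
        return True
```

This only remembers digests for the lifetime of the process. To skip duplicates across runs, the digest set would need to be persisted, for example alongside the stored vectors.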
Contributions to augment our model collection / database would be greatly appreciated!
@amitkayal Any updates? Have you tried it with Pinecone? What's your use case?
Hi, I would like to use a commercial vector DB such as Pinecone to store the vectors, and I would also like to choose the model used for embedding generation. Does the project allow that flexibility? I would also like to know whether there is a file-level duplicate check, so that the same file does not get processed multiple times. Thanks