Immortalise / SearchAnything

A semantic local search engine powered by AI models.
MIT License
259 stars, 21 forks

Does it support any commercial Vector DB? #3

Open amitkayal opened 1 year ago

amitkayal commented 1 year ago

Hi, I would like to use a commercial vector DB such as Pinecone to store the vectors, and I would also like to choose the model used for embedding generation. Does the project allow that flexibility? I would also like to know whether there is a file-level duplicate check to ensure the same file does not get processed multiple times. Thanks

Immortalise commented 1 year ago

Hello,

At the moment, we do not provide support for commercial Vector DBs. However, you can modify the db.py file to customize it according to your requirements.
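As a rough illustration of what such a customization might look like, here is a minimal sketch of a pluggable store interface. It is not the actual contents of db.py: the names `VectorStore`, `InMemoryStore`, `upsert`, and `query` are hypothetical. A Pinecone-backed class would need to implement the same two methods using that vendor's client, which is not shown here.

```python
from __future__ import annotations

import math
from typing import Protocol, Sequence


class VectorStore(Protocol):
    """Hypothetical interface; NOT SearchAnything's real db.py API."""

    def upsert(self, item_id: str, vector: Sequence[float]) -> None: ...

    def query(self, vector: Sequence[float], top_k: int = 5) -> list[tuple[str, float]]: ...


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryStore:
    """Brute-force local backend, standing in for a hosted vector DB."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: Sequence[float]) -> None:
        # Re-inserting an existing id overwrites the stored vector.
        self._vectors[item_id] = list(vector)

    def query(self, vector: Sequence[float], top_k: int = 5) -> list[tuple[str, float]]:
        # Score every stored vector and return the top_k most similar ids.
        scored = [(item_id, _cosine(vector, v)) for item_id, v in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]
```

If the rest of the code only calls `upsert` and `query`, swapping the local backend for a commercial one would, under these assumptions, be limited to writing one new class.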

Currently, our platform only supports the all-mpnet and CLIP models, as they are state-of-the-art embedding models. We are actively working towards integrating more diverse and flexible models.

We do have a rudimentary duplicate file detection mechanism in place. You can refer to the implementation in anything.py, specifically line 88, for more information.
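For readers who want a stricter check, one common approach is to hash file contents and skip any file whose digest has already been seen. The sketch below illustrates that idea only; it is not the code in anything.py, and `file_digest` and `SeenFiles` are hypothetical names.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


class SeenFiles:
    """Tracks content digests so identical files are processed only once."""

    def __init__(self):
        self._digests = set()

    def is_new(self, path):
        # Files with identical bytes share a digest, regardless of name or location.
        digest = file_digest(path)
        if digest in self._digests:
            return False
        self._digests.add(digest)
        return True
```

Because the check is based on content rather than path, a renamed or copied file is treated as a duplicate. The set lives in memory only, so it would need to be persisted to survive restarts.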

Contributions to augment our model collection / database would be greatly appreciated!

davychxn commented 4 weeks ago

@amitkayal Any updates? Have you tried with Pinecone? What's your use case?