pkavumba / django-vectordb

A fast and scalable app that adds vector search capabilities to your Django applications. It offers low latency, fast search results, native Django integration, and automatic syncing between your models and the vector index. Incremental updates are also supported out of the box.
Apache License 2.0
68 stars · 6 forks

Split the embedding model into a standalone server #26

Open Drzhivago264 opened 3 months ago

Drzhivago264 commented 3 months ago

Hello. Currently, whenever the Django server starts, django-vectordb loads the embedding model into memory. Therefore, if I spawn multiple Django processes (using Daphne), the model is duplicated across all of them, wasting a lot of RAM.

For now, it is possible to run django-vectordb in a standalone Django server as an independent app that handles requests. However, this means the tables that need to be embedded must be defined in that standalone app, which is a pain for maintenance and updates. In addition, the main server then has to send a request just to run a simple query against those tables, even when vector search is not involved.

It would be really nice if I could avoid duplicating the model instance when running multiple processes.
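One common pattern for this is to move the embedding model behind a small HTTP service that all Django workers share, so the weights live in exactly one process. Below is a minimal sketch using only the Python standard library; the `embed` stub stands in for the real sentence-transformers model, and the `/embed` endpoint, payload shape, and port are illustrative assumptions, not part of django-vectordb:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def embed(text: str) -> list[float]:
    # Stub: a real server would call a loaded sentence-transformers
    # model here, once per process instead of once per Django worker.
    return [float(len(text)), float(sum(map(ord, text)) % 997)]


class EmbedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"text": "..."} and return its embedding.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"embedding": embed(payload["text"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def start_server(port: int = 0) -> HTTPServer:
    """Start the embedding service in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), EmbedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Each Django worker would then POST text to this single service instead of loading its own copy of the model, at the cost of one extra network hop per embedding call.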

pkavumba commented 3 months ago

Thanks for reporting the issue. If you already have a fix, I would be happy to review the PR. If not, I will work on one and include a flag to opt out of loading model weights at app initialization, for backward compatibility.

Drzhivago264 commented 3 months ago

I have been reading around, but it is not easy to share memory across multiple Django processes. I think what we could do is put the sentence-transformer class in something like Redis, but I am not quite sure whether PyTorch models can be stored in Redis.
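Storing the model object itself in Redis is unlikely to help: Redis holds bytes, so each process would still have to deserialize the full weights back into its own memory. Where Redis does fit well is caching the computed embedding vectors, keyed by a hash of the text, so repeated texts never hit the model twice. A sketch under that assumption, with `FakeRedis` standing in for the `get`/`set` subset of a real `redis.Redis` client:

```python
import hashlib
import json


class FakeRedis:
    """In-memory stand-in exposing the get/set subset of redis.Redis."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class EmbeddingCache:
    """Cache embedding vectors in a Redis-like store, keyed by text hash."""

    def __init__(self, store, embed_fn):
        self.store = store          # FakeRedis here; redis.Redis in production
        self.embed_fn = embed_fn    # the (expensive) model call
        self.misses = 0

    def _key(self, text: str) -> str:
        return "emb:" + hashlib.sha256(text.encode()).hexdigest()

    def get_embedding(self, text: str) -> list[float]:
        cached = self.store.get(self._key(text))
        if cached is not None:
            return json.loads(cached)
        self.misses += 1
        vector = self.embed_fn(text)
        self.store.set(self._key(text), json.dumps(vector).encode())
        return vector
```

This does not remove the per-process model copy on its own, but combined with a single embedding service it keeps the shared model from recomputing the same vectors.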