pkavumba / django-vectordb

A fast and scalable app that adds vector search capabilities to your Django applications. It offers low latency, fast search results, native Django integration, and automatic syncing between your models and the vector index. Incremental updates are also supported out of the box.
Apache License 2.0
68 stars · 6 forks

Split the embedding model into a standalone server #26

Open Drzhivago264 opened 3 months ago

Drzhivago264 commented 3 months ago

Hello. Currently, whenever the Django server starts, django-vectordb loads the embedding model into memory. Therefore, if I spawn multiple Django processes (using Daphne), the model is duplicated across all of them, wasting a lot of RAM.

For now, it is possible to run django-vectordb in a standalone Django server as an independent app that handles requests. However, this means the tables that need to be embedded must be defined in that standalone app, which is a pain for maintenance and updates. In addition, the main server then has to send a request just to run a simple query against those tables, even when vector search is not involved.

It would be really nice if I could avoid duplicating the model instance when running multiple processes.
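One common pattern for this is to move the embedding model behind a small HTTP service that all Django workers share, so the weights live in exactly one process. Below is a minimal sketch using only the Python standard library; the `embed` stub stands in for the real sentence-transformers model, and the `/embed` endpoint, payload shape, and port are illustrative assumptions, not part of django-vectordb:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def embed(text: str) -> list[float]:
    # Stub: a real server would call a loaded sentence-transformers
    # model here, once per process instead of once per Django worker.
    return [float(len(text)), float(sum(map(ord, text)) % 997)]


class EmbedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"text": "..."} and return its embedding.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"embedding": embed(payload["text"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def start_server(port: int = 0) -> HTTPServer:
    """Start the embedding service in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), EmbedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Each Django worker would then POST text to this single service instead of loading its own copy of the model, at the cost of one extra network hop per embedding call.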

pkavumba commented 3 months ago

Thanks for reporting the issue. If you already have a fix, I would be happy to review the PR. If not, I will work on one and include a flag to opt out of loading model weights at app initialization, for backward compatibility.

Drzhivago264 commented 3 months ago

I have been reading around, but it is not easy to share memory across multiple Django processes. I think what we could do is put the sentence-transformer class in something like Redis, but I am not quite sure whether PyTorch models can be stored in Redis.
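Storing the model object itself in Redis is unlikely to help: Redis holds bytes, so each process would still have to deserialize the full weights back into its own memory. Where Redis does fit well is caching the computed embedding vectors, keyed by a hash of the text, so repeated texts never hit the model twice. A sketch under that assumption, with `FakeRedis` standing in for the `get`/`set` subset of a real `redis.Redis` client:

```python
import hashlib
import json


class FakeRedis:
    """In-memory stand-in exposing the get/set subset of redis.Redis."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class EmbeddingCache:
    """Cache embedding vectors in a Redis-like store, keyed by text hash."""

    def __init__(self, store, embed_fn):
        self.store = store          # FakeRedis here; redis.Redis in production
        self.embed_fn = embed_fn    # the (expensive) model call
        self.misses = 0

    def _key(self, text: str) -> str:
        return "emb:" + hashlib.sha256(text.encode()).hexdigest()

    def get_embedding(self, text: str) -> list[float]:
        cached = self.store.get(self._key(text))
        if cached is not None:
            return json.loads(cached)
        self.misses += 1
        vector = self.embed_fn(text)
        self.store.set(self._key(text), json.dumps(vector).encode())
        return vector
```

This does not remove the per-process model copy on its own, but combined with a single embedding service it keeps the shared model from recomputing the same vectors.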