Drzhivago264 opened this issue 3 months ago
Thanks for reporting the issue. If you already have a fix for this, I would be happy to review the PR. If not, I will work on a fix and include a flag to opt out of loading model weights at app initialization, for backward compatibility.
I am reading around, but it is not easy to share memory across multiple Django processes. I think what we can do is put the sentence transformer class in something like Redis, but I am not quite sure whether we can put the PyTorch models in Redis.
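One caveat worth noting about the Redis idea: a byte store holds *serialized* weights, so every process that loads them still materializes its own full copy in its own address space. The sketch below illustrates this with `pickle` and a plain dict standing in for a Redis client (the real thing would use e.g. `redis.Redis` and `torch.save`/`torch.load`, which behave the same way in this respect).

```python
# Sketch of why a byte store like Redis does not deduplicate model RAM:
# the store holds serialized weights, and each load() call deserializes
# a private copy for that caller. Memory is NOT shared across processes.
import pickle

fake_redis = {}  # stand-in for a Redis instance: a key -> bytes store


def save_model(key, model):
    fake_redis[key] = pickle.dumps(model)  # torch.save works similarly


def load_model(key):
    # Every caller gets its own deserialized copy.
    return pickle.loads(fake_redis[key])


weights = {"layer1": [0.1, 0.2], "layer2": [0.3]}
save_model("embedder", weights)

copy_a = load_model("embedder")  # e.g. in worker process A
copy_b = load_model("embedder")  # e.g. in worker process B
```

So Redis would centralize the *storage* of the weights, but not reduce per-process RAM once each worker loads them.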
Hello. Currently, whenever the Django server starts, django-vectordb loads the embedding model into memory. Therefore, if I spawn multiple Django processes (using Daphne), the model is duplicated across all of them, which wastes a lot of RAM.
For now, it is possible to run django-vectordb as a standalone Django server that handles requests as an independent app. However, this means the tables that need to be embedded must be defined in the django-vectordb app, which is a pain to maintain and update. In addition, the main server then has to send a request even for simple queries against those tables that do not use vector search at all.
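A middle ground between duplicating the model per worker and moving whole tables into a separate app could be to run only the *embedding* step in one dedicated process behind a tiny HTTP endpoint, which the Django workers call when they need a vector. The `/embed` route and `fake_embed` below are invented for illustration (stdlib only); a real service would call `SentenceTransformer.encode` inside that single process.

```python
# Hedged sketch: a one-process embedding service. Django workers POST text
# and receive a vector back, so only this process holds the model in RAM.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_embed(text):
    # Placeholder for model.encode(text): a deterministic toy "vector".
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class EmbedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/embed":
            self.send_error(404)
            return
        length = int(self.headers["Content-Length"])
        text = json.loads(self.rfile.read(length))["text"]
        body = json.dumps({"vector": fake_embed(text)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def make_server(port=0):
    # port=0 lets the OS pick a free port; caller runs serve_forever().
    return HTTPServer(("127.0.0.1", port), EmbedHandler)
```

This keeps the application tables in the main project; only embedding traffic crosses the process boundary.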
It would be really nice if I could avoid the duplicated model instances when running multiple processes.