Closed ELind77 closed 3 years ago
The current Python interface is essentially a set of stateless functions, so you certainly don't need to (and should not) load and free models on every call. A model is supposed to be loaded once, after your long-running process starts, and freed before the process terminates. When the process exits all of its memory is released anyway, so you don't strictly have to call free at the very end of the process's life, but it is still the right thing to do in the code. If you keep loading models without freeing them, more and more memory will be used (the model may be loaded with memory mapping, but even so), so you should not do that. The same loaded model can be shared by multiple threads, so loading it twice does not make sense.
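The load-once, share-across-threads pattern described above can be sketched as follows. `load_model` and `tokenize` here are hypothetical stand-ins so the example is self-contained; with blingfire you would load the real model file once at startup and pass its handle to the model-aware functions on every request.

```python
import threading

# Hypothetical stand-in for loading a native model handle; with blingfire
# this would be a single load of the model file at process startup.
def load_model(path):
    return {"path": path}

MODEL = load_model("wbd.bin")  # loaded once, before any requests are served

def tokenize(text):
    # All worker threads read the same MODEL handle; nothing is loaded
    # or freed per request. (Stand-in for a real model-based tokenizer.)
    return text.split()

results = []
threads = [threading.Thread(target=lambda: results.append(tokenize("a b c")))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the handle is read-only after startup, no locking is needed around the calls themselves.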
You can create your own higher-level API based on classes and free the model in the destructor.
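A minimal sketch of such a class-based wrapper. The loader/freer callables are injected as stand-ins so the example is self-contained; with blingfire they would be the library's own load/free functions for model handles.

```python
class ModelHandle:
    """Owns a native model handle: load once, free exactly once.

    Hypothetical sketch: loader/freer stand in for a native library's
    model load/free pair.
    """

    def __init__(self, path, loader, freer):
        self._freer = freer
        self._handle = loader(path)

    @property
    def handle(self):
        return self._handle

    def close(self):
        # Idempotent: freeing the same native handle twice would be an
        # error in the native layer, so guard against double-free.
        if self._handle is not None:
            self._freer(self._handle)
            self._handle = None

    def __del__(self):
        # Destructor frees the model when the wrapper is garbage-collected.
        self.close()


freed = []
m = ModelHandle("wbd.bin", loader=lambda path: 42, freer=freed.append)
m.close()
m.close()  # second call is a no-op
```

Relying on `__del__` alone is fragile (finalization order at interpreter shutdown is not guaranteed), so exposing an explicit `close()` as above is the safer design.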
Ok, that's great then. I can handle the tokenizer model the same way I handle ML models.
I didn't realize they were memory mapped either, that means I get some benefit from the shared page cache across processes as well.
Thank you very much for answering my question.
Hi all,
I have a long-running process (a web service) that I'd like to use blingfire in. In all of the code examples I have seen, the model is freed right after it's called, and that makes me a little nervous that there might be some particular reason for it, since freeing an object manually is pretty unusual in Python. Usually, once something passes out of scope and its refcount drops to zero, it's freed by the garbage collector (though I don't really work with ctypes much, so maybe that's just how it works when you load a DLL). Is it OK for me to just never free the model? Will I get memory leaks?