alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Vosk has big model memory leak #1351

Closed. philipag closed this issue 1 year ago.

philipag commented 1 year ago

The following code leaks most of the memory allocated:

    var model = new Vosk.Model("..... vosk-model-en-us-0.22-lgraph");
    model.Dispose();

On an Android phone the model allocates about 350 MB and Dispose() only frees about 100 MB of that, so there is a 250 MB leak!

I am using the following adb loop from the Windows command prompt to measure memory usage, watching "TOTAL PSS":

    for /l %x in (1, 1, 1000000) do (adb shell dumpsys meminfo com.soundjar;sleep 1)

nshmyrev commented 1 year ago

Dispose doesn't release memory to the OS; it only frees it for reuse within the process.

philipag commented 1 year ago

So these are mallocs that sit in the C runtime heap and are never returned to the OS? Maybe such large buffers should be allocated differently in order to avoid permanently growing the C heap? On a small model this is "only" a few hundred megs but on a large model this is a lot of memory.

philipag commented 1 year ago

It seems that after some time more memory is returned to the OS. So perhaps the C heap implementation does eventually return some unused blocks to the OS? Don't know what heap implementation Vosk for Android is linked with but it seems it's capable of returning memory back to the OS. Perhaps Vosk can just tell it to do so after disposing of a model?

This link might be useful depending on the malloc implementation being used:

https://stackoverflow.com/questions/2215259/will-malloc-implementations-return-free-ed-memory-back-to-the-system
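The idea of asking the allocator to hand free memory back after a model is disposed can be sketched in C as follows. This is only an illustration under stated assumptions: it targets Linux with glibc, where `malloc_trim` is available as an extension, and `load_and_dispose_fake_model` is a hypothetical stand-in for the model's many native allocations, not anything from Vosk. Whether Vosk's Android build could do something equivalent depends on the allocator it is linked with.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <malloc.h>   /* malloc_trim is a glibc extension */

/* Resident set size of this process in bytes, read from /proc/self/statm
   (Linux only). Returns 0 if the file cannot be read. */
static long resident_bytes(void) {
    long total_pages = 0, resident_pages = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f) return 0;
    if (fscanf(f, "%ld %ld", &total_pages, &resident_pages) != 2)
        resident_pages = 0;
    fclose(f);
    return resident_pages * sysconf(_SC_PAGESIZE);
}

/* Hypothetical stand-in for loading and disposing a model:
   many small heap blocks are allocated, touched, then freed. */
static void load_and_dispose_fake_model(size_t blocks, size_t block_size) {
    char **p = malloc(blocks * sizeof *p);
    if (!p) return;
    for (size_t i = 0; i < blocks; i++) {
        p[i] = malloc(block_size);
        if (p[i]) memset(p[i], 1, block_size);  /* touch so pages become resident */
    }
    for (size_t i = 0; i < blocks; i++) free(p[i]);
    free(p);
}

/* Ask the allocator to return free heap pages to the OS.
   Returns how many bytes the resident set shrank (can be 0 or negative). */
static long trim_heap_and_report(void) {
    long before = resident_bytes();
#ifdef __GLIBC__
    malloc_trim(0);   /* no-op on other C libraries in this sketch */
#endif
    return before - resident_bytes();
}
```

Calling `load_and_dispose_fake_model(20000, 1024)` followed by `trim_heap_and_report()` shows how much, if anything, the explicit trim gives back. glibc may already have trimmed the top of the heap during `free`, so the reported number can be small.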

nshmyrev commented 1 year ago

We use the standard OS malloc. We might link against tcmalloc at some point, but in general this is standard OS behavior.

philipag commented 1 year ago

On Android 11+ this means the Scudo allocator is used, which by default does return memory to the OS after some time. We can probably consider this issue closed, although it would be nice if, e.g. on Windows, a separate heap were allocated for the Model so that it could be returned to the OS completely upon Dispose(). Something similar could be done on Linux, Android and iOS, but maybe that's asking too much...
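A minimal sketch of the separate-heap idea on a POSIX system, assuming a simple bump allocator is enough for illustration: all of a model's memory comes from one anonymous `mmap` region, and a single `munmap` returns the whole region to the OS. The `model_arena` type and the `arena_*` functions are hypothetical names, not part of Vosk. On Windows the corresponding primitives would be `HeapCreate` and `HeapDestroy`.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* One private region that backs every allocation of a single model. */
typedef struct {
    unsigned char *base;   /* start of the mapped region */
    size_t capacity;       /* size of the mapping in bytes */
    size_t used;           /* bytes handed out so far */
} model_arena;

/* Reserve the region directly from the OS, bypassing malloc. Returns 0 on success. */
static int arena_init(model_arena *a, size_t capacity) {
    void *p = mmap(NULL, capacity, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    a->base = p;
    a->capacity = capacity;
    a->used = 0;
    return 0;
}

/* Bump allocation with 16-byte alignment; individual blocks are never freed.
   Returns NULL when the arena is exhausted. */
static void *arena_alloc(model_arena *a, size_t size) {
    size_t aligned = (size + 15) & ~(size_t)15;
    if (aligned < size || aligned > a->capacity - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Unmap the whole region, returning all of it to the OS in one call. */
static int arena_destroy(model_arena *a) {
    int rc = munmap(a->base, a->capacity);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
    return rc;
}
```

The trade-off is that every allocation made while loading the model would have to be routed through the arena, which is intrusive for a library built on third-party code that calls `malloc` directly.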

nshmyrev commented 1 year ago

We'd better focus on much more compact models. It is possible to get very good accuracy with just 40 MB in memory (as Picovoice does).