alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Vosk in .NET exhibits unusual memory collection which makes server temporarily unresponsive. #998

Closed phillipsawyer closed 2 years ago

phillipsawyer commented 2 years ago

Hello,

During production pilot tests of the Vosk .NET library over the last two weeks, we have seen unusual behavior in how Vosk collects memory. When Vosk appears to collect memory, a 4-core server becomes unresponsive for around 20 seconds (labeled in orange) and cannot process requests.

The code:
1) Initialize 20x VoskRecognizer using vosk-model-small-en-us-0.15.
2) Stream audio using AcceptWaveform() 200ms at a time, generally with a concurrency of between 0-5.
3) After every stream (between 10-120s), Reset() the VoskRecognizer and return it to a recognizer pool to be used again. (New recognizers are never initialized.)

image

Expected behavior: Less aggressive memory collection that does not cause the server to become unresponsive. Something similar to this, which is the result of a 40-minute stress test on my Windows development workstation (constantly running between 0 and 30 concurrent streams): image

Environment: Vosk NuGet package v0.3.38 on .NET 6, running on Ubuntu 20.04 (LTS) x64 on a 4 vCPU / 8 GB instance. The behavior (frequency and length of unresponsiveness) is the same on a 2 vCPU / 4 GB instance. The model is extremely quick in both cases.

Changing the .NET garbage collection latency mode from Interactive to SustainedLowLatency had no noticeable effect on the behavior. The GC mode is ServerGC.
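
To help rule the .NET garbage collector in or out, the GC flavor can also be switched through environment variables before the server starts. The following is only a sketch: it assumes the server is launched from the same shell, and MyVoskServer.dll is a placeholder name, not something from this setup. DOTNET_gcServer and DOTNET_gcConcurrent are the documented .NET runtime settings.

```shell
# Sketch: switch from server GC to workstation GC for one test run.
# Assumption: the server is started from this shell session.
export DOTNET_gcServer=0      # 0 = workstation GC, 1 = server GC
export DOTNET_gcConcurrent=1  # 1 = background (concurrent) GC enabled

# Placeholder launch command, adjust to the real application:
# dotnet MyVoskServer.dll
```

If the 20-second stalls look the same under workstation GC, that would suggest the cause is more likely in native memory handling than in the managed heap.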

If this leads to a dead end, we can try rewriting to use vosk-server.

Thank you for your helpful generosity and for your fantastic project.

nshmyrev commented 2 years ago

Hello

You can try to run the same with tcmalloc, it should be better. Something like

LD_PRELOAD="/usr/lib/libtcmalloc.so" python3 asr_server.py

phillipsawyer commented 2 years ago

Thank you for your response.

sudo apt-get -y install google-perftools
sudo ln -s /usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4 /usr/lib/libtcmalloc_minimal.so
echo "LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so" | sudo tee -a /etc/environment

This is what I have done to evaluate.
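
It may be worth confirming that the preload actually reached the server process. Variables in /etc/environment are typically applied by PAM at login, so a process started as a systemd service may not see them. A minimal sketch of a check against the process memory map follows; lib_in_maps is a helper name made up for this example, and MyVoskServer is a placeholder process name.

```shell
#!/bin/sh
# Report whether a shared library appears in a process memory map.
# Usage: lib_in_maps <maps_file> <library_substring>
# Exit status 0 means the library is mapped, non-zero means it is not.
lib_in_maps() {
    maps_file=$1
    lib=$2
    # -q: no output, -F: fixed string, so dots in names are not regex
    grep -qF -- "$lib" "$maps_file"
}

# Hypothetical usage (MyVoskServer is a placeholder process name):
# pid=$(pgrep -f MyVoskServer | head -n 1)
# if lib_in_maps "/proc/$pid/maps" libtcmalloc; then
#     echo "tcmalloc is loaded"
# else
#     echo "tcmalloc is NOT loaded"
# fi
```

If the library does not show up, setting LD_PRELOAD directly in the environment that launches the server would be the next thing to try.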

nshmyrev commented 2 years ago

It could be just export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4

And do you see a difference?

phillipsawyer commented 2 years ago

Unfortunately it does not seem to make a difference. It is very strange, since I'm not sure why a .NET server / Ubuntu would even become unresponsive for 20 seconds when a single library frees up its memory.

The frequency of freezes is proportional to how much the Vosk library is used: the more Vosk is used, the more often memory is cleared, and the more often freezes occur. Here Vosk was in use only 2% of the time, so the server froze only once in two days. It is always unresponsive for around 20s:

image

Is this a known issue in Kaldi / Vosk?

phillipsawyer commented 2 years ago

In reality this is a duplicate of https://github.com/alphacep/vosk-api/issues/569

It was surprisingly regular and did not seem like a crash.