alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

Memory leak #126

Open dijiagui opened 6 years ago

dijiagui commented 6 years ago

Hello, I am stress-testing this project. I ran it in Docker with one master and 40 workers, then sent audio streams in real time, 30 concurrently. I collected data for seven days and found that everything else was normal, but memory usage grew by 1 GB per day. Most of the increase is in worker.py. I have been trying to find out why, but I haven't dealt with this kind of problem before.
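One way to start narrowing down growth like this is to compare two `tracemalloc` snapshots taken some time apart inside the Python process. The following is a minimal sketch, not code from this project: the `top_growth` helper and the `leaked` list are hypothetical stand-ins. Note that `tracemalloc` only sees allocations made through Python's allocator, so a leak inside native Kaldi or GStreamer code would not show up here.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size grew between two snapshots."""
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Hypothetical stand-in for a leak: objects that are kept alive indefinitely.
leaked = [bytes(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()
for stat in top_growth(before, after):
    print(stat)
```

In a long-running worker, the two snapshots would be taken minutes or hours apart, for example before and after a batch of decoded files. If nothing in the output grows while the process's resident memory does, the leak is likely outside Python-managed memory.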

alumae commented 6 years ago

OK, thank you for reporting it.

antho-rousseau commented 6 years ago

Hello Tanel, any intel on that? I face the same issue.

alumae commented 6 years ago

Hmm, to be honest, I am surprised the memory leak is so small (1 GB per day across 40 workers doesn't seem like very much). It's probably very difficult to track down. I recommend just restarting the workers periodically.
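As an illustration of the periodic-restart workaround, here is a minimal supervisor sketch. It is an assumption about how one might do it, not something shipped with this project: `run_with_restarts` and the placeholder worker command are hypothetical, and a real setup would substitute the actual worker command line.

```python
import subprocess
import sys


def run_with_restarts(cmd, max_lifetime_s, cycles):
    """Run cmd `cycles` times, replacing the process once it exceeds max_lifetime_s.

    Returns how many cycles ended in a forced restart.
    """
    forced = 0
    for _ in range(cycles):
        proc = subprocess.Popen(cmd)
        try:
            # Wait for the worker to exit on its own, up to the lifetime limit.
            proc.wait(timeout=max_lifetime_s)
        except subprocess.TimeoutExpired:
            # Lifetime exceeded: stop the worker so the OS reclaims its memory.
            proc.terminate()
            proc.wait()
            forced += 1
    return forced


if __name__ == "__main__":
    # Hypothetical placeholder for the real worker command line.
    worker_cmd = [sys.executable, "-c", "import time; time.sleep(3600)"]
    print(run_with_restarts(worker_cmd, max_lifetime_s=0.5, cycles=2))
```

This sketch terminates the worker regardless of what it is doing, so a request in progress would be cut off. In practice one would probably restart only idle workers, or stagger restarts so that enough workers stay available.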

antho-rousseau commented 6 years ago

Actually, it's much more than that in my experience, and it seems to depend on the number of files decoded. I saw a leak of around 50 GB with 30 workers decoding about 35K files over a few days.

ericiper commented 5 years ago

Hi Alumae, I'm facing the same issue as antho-rousseau. It'd be nice if you could take a look! E.