McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr

Memory leakage #84

Closed: xhzhu0628 closed this issue 3 months ago

xhzhu0628 commented 3 months ago

Haha, it's still me. I deployed the service on Unraid and replaced the graphics card with a Tesla P4. The program successfully started running!

However, the program seems to have a memory leak. When I add multiple tasks to the queue at once, it consumes a large amount of memory and crashes the service. With "--memory=5G" set on the Docker container, a single file runs fine (though I can see a significant peak in system memory usage), but with multiple files the log shows several "Model was purged, need to re-create" messages, followed by an error saying "died with <Signals.SIGKILL: 9>", and the service stops.

[screenshot]
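For reference, this is roughly how I'm capping the container; the image name and other flags are illustrative, and only the --memory=5g part matters here. The SIGKILL above is presumably the kernel OOM killer terminating the process once the container hits that cap.

```bash
# Illustrative docker run; only the --memory cap is the relevant part.
docker run -d \
  --name subgen \
  --gpus all \
  --memory=5g \
  mccloud/subgen
```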

McCloudS commented 3 months ago

Yeah, CONCURRENT_TRANSCRIPTIONS doesn't really do anything right now. I need to fix queuing and threading. I've had better luck with WHISPER_THREADS.
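For anyone reading along, both of these are just container environment variables, so they can be set on the docker run line; something like the following (values are examples, not recommendations):

```bash
# Illustrative: tune WHISPER_THREADS rather than relying on
# CONCURRENT_TRANSCRIPTIONS, which is currently a no-op.
docker run -d \
  --name subgen \
  -e WHISPER_THREADS=4 \
  -e CONCURRENT_TRANSCRIPTIONS=2 \
  mccloud/subgen
```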

I've never had a leak or OOM issues with the container on the P4. The only time I've had it crash was trying to use a large model with >60 min files. I've easily had 100+ items in the 'queue' (loosely speaking, since it doesn't quite work) with 5+ items transcribing at the same time without any issues.

Are you somehow memory constrained on your machine? I'm running 32 GB and have run the exact container configuration you have for months without issue.

xhzhu0628 commented 3 months ago

I'm currently running Unraid with 16 GB of memory, but only 7 GB is available. I'm using the medium model with >120 min files.

McCloudS commented 3 months ago

I'm not sure. This is what I see with about a dozen Mediums queued up in Bazarr.

[screenshots]

xhzhu0628 commented 3 months ago

Finally, I found that there was a peak in RAM usage when loading large media files, even during GPU transcription. I solved this by enabling Swap Memory in Unraid.
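I enabled swap on the Unraid side, but for anyone else hitting this, the generic Linux equivalent looks roughly like the following (path and size are illustrative):

```bash
# Create and enable an 8 GB swap file (illustrative path/size).
fallocate -l 8G /mnt/swapfile
chmod 600 /mnt/swapfile
mkswap /mnt/swapfile
swapon /mnt/swapfile
```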

By the way, I fixed the queuing by using threading.Thread() and have opened a PR.
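Roughly the shape of the change, as a minimal sketch (not the exact PR code): a fixed pool of worker threads pulls jobs off a queue, so only a bounded number of transcriptions run at once.

```python
import queue
import threading

# Hypothetical cap on simultaneous transcriptions.
CONCURRENT_TRANSCRIPTIONS = 2

job_queue: "queue.Queue[str]" = queue.Queue()

def transcribe(path: str) -> None:
    # Placeholder for the real Whisper call in subgen.
    print(f"transcribing {path}")

def worker() -> None:
    while True:
        path = job_queue.get()
        try:
            transcribe(path)
        finally:
            job_queue.task_done()

# Start the bounded worker pool.
for _ in range(CONCURRENT_TRANSCRIPTIONS):
    threading.Thread(target=worker, daemon=True).start()

# Incoming files just get enqueued and are picked up as workers free up.
for media in ["a.mkv", "b.mkv", "c.mkv"]:
    job_queue.put(media)

job_queue.join()  # block until every queued job has finished
```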