Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Limit transcription CPU threads #6467

Open JohnXLivingston opened 2 months ago

JohnXLivingston commented 2 months ago

Describe the problem to be solved

v6.2.0-RC1 comes with an incredible feature: automatic subtitles generation.

But the models that are used can use a lot of CPU and RAM. I did not see any option to limit their usage (as we can with video transcoding).

Describe the solution you would like

Would it be possible to add some options?

Chocobozzz commented 1 month ago

I don't think we can limit RAM, but for CPU, yes — there is a threads option:

  --threads THREADS     number of threads used for CPU inference (default: 0)
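
To illustrate how such a setting could be honored, here is a minimal sketch of resolving a configured thread count against the host's cores. `resolve_threads` is a hypothetical helper, not PeerTube code; the "0 means auto" semantics are taken from the CLI default shown above:

```python
import os

def resolve_threads(configured: int) -> int:
    """Resolve a configured thread count against the host CPU count.

    0 (the CLI default above) means "auto": use every available core.
    Any positive value is capped at the number of cores.
    """
    cores = os.cpu_count() or 1
    if configured <= 0:
        return cores
    return min(configured, cores)
```

The resolved value could then be forwarded as the `--threads` argument when spawning the transcription process.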

lutangar commented 1 month ago

RAM usage largely depends on model size, since the model must be loaded into RAM (multiplied by the number of runners, since models aren't shared in memory).

For example, with whisper-ctranslate2 (which uses faster-whisper, which is CPU-friendly), the models tend to be larger than the ones provided for openai-whisper.

Of course, transcript quality will get worse the further you decrease the size.

https://huggingface.co/Systran
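
To make the runner-multiplication point concrete, here is a rough back-of-the-envelope sketch. The sizes below are illustrative placeholders, not measured figures for the Systran models:

```python
# Illustrative model sizes in MB — placeholders, not measured figures.
MODEL_SIZE_MB = {
    "tiny": 75,
    "small": 500,
    "large": 3000,
}

def estimate_ram_mb(model: str, runners: int) -> int:
    # Each runner loads its own copy of the model, so RAM scales
    # linearly with the runner count (models aren't shared in memory).
    return MODEL_SIZE_MB[model] * runners
```

So even a mid-sized model multiplied across several runners can dominate a small server's memory, which is why model choice matters as much as the thread count.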

JohnXLivingston commented 1 month ago

Thanks for the information @lutangar ! Good to know.