collabora / WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Parallel request to the server #188

Open alpcansoydas opened 7 months ago

alpcansoydas commented 7 months ago

For example, say the server side is deployed. Can the server handle multiple parallel transcription requests? How many requests can it handle? Will there be any performance issues? It may be a basic question, but I'd like to know. Thanks :)

cjpais commented 7 months ago

Yes, it can. On an RTX 4080 I am able to get 4 parallel streams without issue. 4 streams is the default value for max_clients in the TranscriptionServer class. You can specify more or less for your particular application.
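
As a rough sketch (not from this thread), raising that limit could look like the following. It follows the server usage shown in the repo README; the exact name and location of the max_clients setting may differ between versions, so check your installed whisper_live.

```python
# Hedged sketch: start the server with a higher parallel-client limit.
# max_clients is the attribute referenced above (default 4); verify the
# name against your installed version of whisper_live.
from whisper_live.server import TranscriptionServer

server = TranscriptionServer()
server.max_clients = 8            # allow up to 8 parallel streams
server.run("0.0.0.0", port=9090)  # host/port as in the README example
```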

With 4 parallel streams I see minimal performance impact to my eye, but I have not benchmarked it.
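
If you want to measure it yourself, one option is to launch several clients against the same server and time them. A hedged sketch below; the TranscriptionClient arguments follow the repo README and may differ between versions, and "audio.wav" is a placeholder path.

```python
# Hedged sketch: run 4 clients in parallel against one server for a
# rough end-to-end timing. Adjust host, port, and the audio path.
import time
from multiprocessing import Process

from whisper_live.client import TranscriptionClient


def run_client(audio_path: str) -> None:
    client = TranscriptionClient(
        "localhost",
        9090,
        lang="en",
        translate=False,
        model="small",
        use_vad=False,
    )
    client(audio_path)  # stream the file and receive transcriptions


if __name__ == "__main__":
    start = time.time()
    procs = [Process(target=run_client, args=("audio.wav",)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(f"4 parallel streams finished in {time.time() - start:.1f}s")
```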

One note: each stream loads into VRAM, so eventually VRAM will be the limiting factor.
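
Not part of WhisperLive, but if you want to see how close you are to that limit, you can poll GPU memory with pynvml (from the nvidia-ml-py package) while clients connect. A minimal sketch:

```python
# Hedged sketch: watch VRAM usage on GPU 0 while streams are added
# (pip install nvidia-ml-py). Stop with Ctrl+C.
import time

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU index 0
try:
    while True:
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"VRAM used: {mem.used / 1024**2:.0f} MiB / {mem.total / 1024**2:.0f} MiB")
        time.sleep(1)
except KeyboardInterrupt:
    pynvml.nvmlShutdown()
```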