Open alpcansoydas opened 7 months ago
Yes, it can. On an RTX 4080 I am able to run 4 parallel streams without issue. 4 streams is the default value for max_clients in the TranscriptionServer class. You can specify more or fewer for your particular application.
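To illustrate what a client cap like max_clients does, here is a minimal sketch of a connection gate. It is not the project's actual code: ClientGate, try_add, and remove are hypothetical names made up for this example, and the real TranscriptionServer may enforce its limit differently.

```python
import threading


class ClientGate:
    """Hypothetical stand-in for a server-side client limit.

    This is NOT the real TranscriptionServer; it only shows the idea
    of refusing new streams once a configured cap is reached.
    """

    def __init__(self, max_clients=4):
        self.max_clients = max_clients
        self._active = set()
        self._lock = threading.Lock()

    def try_add(self, client_id):
        """Return True if the client is admitted, False if the server is full."""
        with self._lock:
            if client_id in self._active:
                return True  # already connected, nothing to do
            if len(self._active) >= self.max_clients:
                return False  # cap reached, caller should reject or queue
            self._active.add(client_id)
            return True

    def remove(self, client_id):
        """Free a slot when a client disconnects."""
        with self._lock:
            self._active.discard(client_id)

    def active_count(self):
        with self._lock:
            return len(self._active)
```

With the default cap of 4, a fifth call to try_add returns False until one of the earlier clients is removed.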
With 4 parallel streams I see minimal performance impact to my eye, but I have not benchmarked it.
One note: each stream loads into VRAM, so eventually the number of streams you can run will be limited by the VRAM available on your GPU.
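As a rough way to reason about that VRAM ceiling, here is a back-of-the-envelope sketch. The function name, the per-stream figure, and the reserved headroom are all assumptions for illustration; I have not measured how much VRAM a stream actually uses, so check your own numbers with a tool like nvidia-smi.

```python
def max_streams_by_vram(total_vram_gb, per_stream_gb, reserved_gb=1.0):
    """Estimate how many streams fit in VRAM.

    total_vram_gb: VRAM on the card (an RTX 4080 has 16 GB).
    per_stream_gb: VRAM used per stream. ASSUMPTION: measure this
                   yourself, it depends on model size and settings.
    reserved_gb:   headroom left for the driver and other processes
                   (also an assumed value).
    """
    if per_stream_gb <= 0:
        raise ValueError("per_stream_gb must be positive")
    usable = total_vram_gb - reserved_gb
    if usable <= 0:
        return 0
    return int(usable // per_stream_gb)


# Placeholder example: 3 GB per stream is a made-up figure, not a measurement.
estimate = max_streams_by_vram(total_vram_gb=16, per_stream_gb=3.0)
```

If the estimate comes out lower than the max_clients value you set, VRAM will be the limit you hit first.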
For example, say the server side is deployed. Can it handle multiple parallel transcription requests? How many requests can it handle? Will there be any performance issues? It may be a basic question, but I want to know. Thanks :)