danijel3 / KaldiWebrtcServer

Python server for communicating with Kaldi from the browser using WebRTC
Apache License 2.0

gpu decoding #2

Open lalimili6 opened 5 years ago

lalimili6 commented 5 years ago

Hi. Kaldi now supports GPU decoding (https://github.com/kaldi-asr/kaldi/pull/3114). Is it possible to configure the TCP port to use the GPU? What is your opinion? How can the TCP decoding be changed to use GPU decoding? Best regards

danijel3 commented 4 years ago

Sorry for the late response. I was on vacation and then forgot about this message.

Personally, I don't see a reason to do realtime decoding on the GPU. If you were decoding offline data, you could possibly get a speedup by processing a file faster than realtime. With online recognition (e.g. from a microphone) you can't really go faster than realtime, so the GPU would be severely underutilized. In fact, you can easily cram more than one conversation onto a single CPU core and it will work fine.

Am I missing anything?
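To make the utilization argument above concrete, here is a toy back-of-the-envelope calculation. The real-time factor (RTF) value is an illustrative assumption, not a measurement from this project:

```python
# Illustrative numbers only (assumed, not measured): an online CPU decoder
# with a real-time factor (RTF) of 0.3 spends 0.3 s of compute per 1 s of
# audio, so one core can interleave roughly 1 / RTF concurrent realtime
# streams before falling behind realtime.
rtf_cpu = 0.3            # assumed RTF of one online decoder thread
streams_per_core = int(1 / rtf_cpu)
cores = 4

print(f"~{streams_per_core} realtime streams per core, "
      f"~{cores * streams_per_core} on a {cores}-core machine")
```

This is why a realtime stream, unlike an offline batch job, cannot soak up a whole GPU: the audio arrives at a fixed rate, and the decoder spends most of its wall-clock time waiting for more samples.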

lalimili6 commented 4 years ago

My idea is that the GPU has more computation capacity and can decode more audio streams in realtime. TCP decoding can decode only one stream per TCP port, and a server has about 6000 usable TCP ports, which means TCP decoding can serve about 6000 queries at once. I guess GPU decoding could probably handle more queries than that.

best regards

danijel3 commented 4 years ago

That's an interesting hypothesis, but I'm not sure we're there yet. The current cudadecoder exists only in a "batched" version. I don't think anyone has tried to make one that processes many realtime streams on a single GPU.

As for the port limitation, I think 6000 threads in parallel is plenty for one server, or am I mistaken?
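The one-decoder-per-TCP-port layout being discussed can be sketched as launching one worker process per port. The decoder binary name below matches Kaldi's online TCP server, but the port range and model paths are placeholder assumptions:

```python
# Sketch of the one-decoder-per-port layout: each worker process owns one
# TCP port and serves one realtime audio stream at a time, so concurrency
# equals the number of workers. BASE_PORT and the model paths are
# illustrative placeholders, not values from this repo.
BASE_PORT = 5050

def worker_commands(num_workers, model_dir="model"):
    """Return one decoder command line per worker, each bound to its own port."""
    return [
        [
            "online2-tcp-nnet3-decode-faster",     # Kaldi's online TCP decoder
            f"--port-num={BASE_PORT + i}",         # one port per worker
            f"{model_dir}/final.mdl",
            f"{model_dir}/HCLG.fst",
            f"{model_dir}/words.txt",
        ]
        for i in range(num_workers)
    ]

# Four workers -> four concurrent realtime streams on ports 5050-5053.
for cmd in worker_commands(4):
    print(" ".join(cmd))
```

In practice each command would be started with `subprocess.Popen` and supervised; the point is that the scaling limit is the number of worker processes (and the cores behind them), long before the port space itself runs out.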

dpny518 commented 4 years ago

I don't think 6000 concurrent streams are possible; concurrency is usually limited by CPU cores, so 4 cores means roughly 4 concurrent streams. A GPU can increase concurrency, but only a batched version exists for now. This pull request will be a game changer:

https://github.com/kaldi-asr/kaldi/pull/3568