Yes @pedrohenriqp, on a Colab GPU, just running `diarization = pipeline("audio.wav")` takes more than 9 minutes.
Please let me know if you come across any solution.
Because this repo is written by a group of novice programmers (just elementary AI students), there are tons of useless polymorphism implementations and complicated class inheritance that make debugging hard.
Anyway, I spent a week refactoring this repo and made an "ahead of real time" version that runs fully on GPU.
I actually wonder how they can call the module "Pipeline" when it's neither modular nor real-time. Most of the code is written in NumPy, and it works as an example of how to use SpeechBrain in your own project, nothing more.
@WilledgeR, can your "ahead of real time" version be found somewhere? I also ran into the problem that the speaker diarization does not use the available GPU resources.
Best,
Lutharsanen
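
For what it's worth, here is a minimal sketch of explicitly moving the pipeline onto the GPU (assuming a recent pyannote.audio where `Pipeline.to()` is available; the checkpoint name and token are placeholders):

```python
# Sketch: load a pretrained diarization pipeline and move it to the GPU.
# Without the explicit .to(...) call, inference may silently stay on the
# CPU, which would explain both the slow runtime and the idle GPU.
import torch
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization",  # placeholder checkpoint name
    use_auth_token="YOUR_HF_TOKEN",  # placeholder token
)
pipeline.to(torch.device("cuda"))

diarization = pipeline("audio.wav")
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```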
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi guys, I have a question about inference time when using the pipeline for speaker diarization. I trained a model with my own data, and I am using the pipeline to make predictions on a new .wav file. A 12-minute conversation recording takes about 14 minutes of inference.
Is that expected? If so, do you have any tips to speed up pipeline inference?
I am running inference with a snippet along the lines of the sketch below. Is there a better way to run inference on many files without a for loop?
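
Something along these lines (a sketch, since paths, the trained-checkpoint location, and the output format are placeholders for my actual setup):

```python
# Sketch: run a trained diarization pipeline over every .wav in a folder.
# The pipeline processes one file per call, so a plain loop is the usual
# pattern; "path/to/config.yaml" and "recordings" are placeholders.
from pathlib import Path

import torch
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("path/to/config.yaml")  # my trained model
pipeline.to(torch.device("cuda"))  # keep inference on the GPU

for wav in sorted(Path("recordings").glob("*.wav")):
    diarization = pipeline(str(wav))
    # write one RTTM file per input file
    with open(wav.with_suffix(".rttm"), "w") as rttm:
        diarization.write_rttm(rttm)
```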
Thanks