Open picheny-nyu opened 6 months ago
Thank you for your issue. You might want to check the FAQ if you haven't done so already.
Feel free to close this issue if you found an answer in the FAQ.
If your issue is a feature request, please read this first and update your request accordingly, if needed.
If your issue is a bug report, please provide a minimum reproducible example as a link to a self-contained Google Colab notebook containing everything needed to reproduce the bug:
Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).
Companies relying on pyannote.audio in production may contact me via email regarding:
This is an automated reply, generated by FAQtory
Not in 3.x, no.
I am considering adding back the option but cannot provide an ETA though.
Can you say more about your use case?
Using the output to identify sections of speech in parent-toddler conversations, to be transcribed as input for unsupervised speech recognition fine-tuning. I figure it's better to miss questionable segments than to train on false alarms.
I would then use pyannote/segmentation for this purpose, wrapped in a voice activity detection pipeline that comes with onset/offset thresholds:
https://huggingface.co/pyannote/segmentation#voice-activity-detection
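To illustrate what those onset/offset thresholds do, here is a minimal, self-contained sketch of hysteresis binarization (not pyannote's actual implementation; the function name and the example probabilities are made up for illustration). Speech "turns on" only when the frame-level probability rises above `onset` and "turns off" only when it falls below `offset`, so raising `onset` makes the detector more conservative: fewer false alarms at the cost of more misses.

```python
def binarize(probs, onset=0.5, offset=0.5):
    """Turn frame-level speech probabilities into (start, end) frame regions
    using onset/offset hysteresis thresholds."""
    regions = []
    active = False
    start = 0
    for i, p in enumerate(probs):
        if not active and p >= onset:
            # speech starts only once the probability clears the onset threshold
            active, start = True, i
        elif active and p < offset:
            # speech ends only once the probability drops below the offset threshold
            regions.append((start, i))
            active = False
    if active:
        regions.append((start, len(probs)))
    return regions

probs = [0.1, 0.2, 0.9, 0.95, 0.6, 0.3, 0.1, 0.8, 0.4]

# Permissive thresholds keep borderline frames...
print(binarize(probs, onset=0.5, offset=0.5))  # → [(2, 5), (7, 8)]
# ...while stricter ones trade misses for fewer false alarms.
print(binarize(probs, onset=0.9, offset=0.7))  # → [(2, 4)]
```

In the actual pipeline, these correspond to the `onset` and `offset` hyper-parameters you pass when instantiating the voice activity detection pipeline linked above.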
Thanks. I do need diarization, though - I want to process the adult and toddler speech separately. Would you suggest I just use an older version of the diarization pipeline that still uses VAD?
Yes.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I have a diarization application in which I prefer to have fewer false alarms at the expense of more misses. Can this be controlled during fine-tuning?
Thanks Michael