Open HyunjunA opened 9 months ago
Do you have any instructions or example code for how to run the model directly in HF transformers? Transformers.js aims to be a JS port of the Python library, and may not be suitable for custom use-cases and libraries like this (pyannote).
However, if it can be run in transformers, then it's a good candidate for adding support here too! 🤗
It seems that speaker turn detection was recently added to whisper.cpp. You can find all the information in this repository: https://github.com/akashmjn/tinydiarize
The PRs that add support to whisper.cpp appear very small, and the new models are 100% compatible, as they are fine-tuned versions that don't use new tokens.
Name of the feature
In general, the feature you want added should be supported by HuggingFace's transformers library:
Model: pyannote/speaker-diarization https://huggingface.co/pyannote/speaker-diarization
Reason for request
Why is it important that we add this feature? What is your intended use case? Remember, we are more likely to add support for models/pipelines/tasks that are popular (e.g., many downloads), or contain functionality that does not exist (e.g., new input type).
Incorporating a speaker diarization model into web apps will enable us to offer advanced audio analysis features like speaker-change-detection, voice-activity-detection, and overlapped-speech-detection. This will set our app apart in a crowded market.
These features not only add functionality but also significantly improve user engagement by providing a more interactive and insightful experience.
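To make the three features concrete, here is a minimal sketch of how they could be derived once a diarization model has produced speaker-labelled segments. It assumes the output has already been reduced to a plain list of `(start, end, speaker)` tuples; that tuple format, the function names, and the `SPEAKER_00`-style labels are hypothetical choices for illustration, not pyannote's actual return type or API.

```python
from typing import List, Tuple

# Hypothetical segment format: (start_sec, end_sec, speaker_label)
Segment = Tuple[float, float, str]


def speaker_changes(segments: List[Segment]) -> List[float]:
    """Start times at which the speaker differs from the previous segment's."""
    ordered = sorted(segments)
    return [cur[0] for prev, cur in zip(ordered, ordered[1:]) if cur[2] != prev[2]]


def speech_regions(segments: List[Segment]) -> List[Tuple[float, float]]:
    """Voice activity: merge touching or overlapping segments, ignoring speaker."""
    merged: List[Tuple[float, float]] = []
    for start, end, _ in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlapped_speech(segments: List[Segment]) -> List[Tuple[float, float]]:
    """Pairwise regions where two different speakers are active at once.

    With three or more simultaneous speakers the returned regions may
    themselves overlap; they are not merged here.
    """
    ordered = sorted(segments)
    overlaps: List[Tuple[float, float]] = []
    for i, (_, end1, spk1) in enumerate(ordered):
        for start2, end2, spk2 in ordered[i + 1:]:
            if start2 >= end1:
                break  # sorted by start, so no later segment can overlap
            if spk1 != spk2:
                overlaps.append((start2, min(end1, end2)))
    return overlaps


segments = [
    (0.0, 4.0, "SPEAKER_00"),
    (3.5, 7.0, "SPEAKER_01"),
    (8.0, 10.0, "SPEAKER_00"),
]
print(speaker_changes(segments))
print(speech_regions(segments))
print(overlapped_speech(segments))
```

In a web app, the same post-processing could be done on the client once the model output is available in a comparable form.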
Additional context
Add any other context or screenshots about the feature request here.