JSchmie / ScrAIbe

Tool for automatic transcription and speaker diarization based on whisper and pyannote.
https://jschmie.github.io/ScrAIbe/
GNU General Public License v3.0

Speaker Recognition #108

Closed: DrRaja closed this issue 1 month ago

DrRaja commented 3 months ago

Thank you for a great package! Do you plan to support speaker recognition? That is, given a folder of voice samples for known speakers, the tool would assign each speaker a name rather than a placeholder label.

Thanks

JSchmie commented 2 months ago

Hello @DrRaja, thanks for the suggestion. 😊

Currently, there are no plans to include this feature because it can be quite challenging to implement robustly, depending on the use case. However, here is an example of how to handle a simple scenario where the speakers introduce themselves at the beginning:

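```python
import spacy
from scraibe import Transcript

# Small English spaCy pipeline; install it first with
# `python -m spacy download en_core_web_sm`.
nlp = spacy.load("en_core_web_sm")

transcript = Transcript.from_json("example.json")  # output of your transcription

speaker_names = {}  # maps a diarization label to the first name found

for segment in transcript.transcript.values():
    label = segment["speakers"]
    if label in speaker_names:
        continue  # keep only the first name found for each speaker

    # Run named-entity recognition on this segment's text.
    # Note: the first PERSON entity may be someone the speaker mentions,
    # not the speaker, so this only suits recordings with clear introductions.
    for ent in nlp(segment["text"]).ents:
        if ent.label_ == "PERSON":
            speaker_names[label] = ent.text
            break

# Replace the placeholder labels with the detected names.
transcript.annotate(**speaker_names)

print(transcript)
```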
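
If the first PERSON entity turns out to be too loose (for example, when a speaker greets someone else by name before introducing themselves), a stricter variant is to accept only explicit self-introductions. The following is a minimal sketch of that idea, not part of ScrAIbe: the function `names_from_introductions`, the pattern `INTRO_PATTERN`, and the sample data are hypothetical, and the segment layout (`"speakers"` and `"text"` keys) is assumed to match the transcript dictionary used above.

```python
import re

# Hypothetical introduction phrases; extend them for your own recordings.
# The phrase is matched case-insensitively, the name must be capitalized.
INTRO_PATTERN = re.compile(
    r"\b(?i:my name is|i am|i'm)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)?)"
)


def names_from_introductions(segments):
    """Map each speaker label to the first name they introduce themselves with."""
    names = {}
    for segment in segments.values():
        label = segment["speakers"]
        if label in names:
            continue  # keep only the first introduction per speaker
        match = INTRO_PATTERN.search(segment["text"])
        if match:
            names[label] = match.group(1)
    return names


# Made-up sample data in the assumed transcript layout.
example = {
    "0": {"speakers": "SPEAKER_00",
          "text": "Hello everyone, my name is Anna and I will moderate."},
    "1": {"speakers": "SPEAKER_01",
          "text": "Thanks Anna. I'm Ben Carter, nice to meet you."},
    "2": {"speakers": "SPEAKER_00",
          "text": "Great, Ben, let's start."},
}

print(names_from_introductions(example))
```

The resulting dictionary could then be passed to `transcript.annotate(**names)` in the same way as above. This is still a heuristic: it depends on capitalization in the transcribed text and will miss introductions phrased differently.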

Hope this helps! 😊