vivekuppal / transcribe

Transcribe is a real-time transcription, conversation, and language learning platform. It provides live transcripts from the microphone and speaker, generates a suggested conversation response using OpenAI's GPT API, and reads the responses aloud, simulating a live conversation in English or another language.
https://abhinavuppal1.github.io/
MIT License
179 stars 41 forks

[Feature] Diarization support? #139

Closed DawgZter closed 8 months ago

DawgZter commented 8 months ago

It could be interesting to add diarization support, so the LLM has the context of multiple speakers and knows which parts were spoken by whom. Deepgram already supports this in its transcription, so maybe it's just a matter of changing the settings on the API call? https://developers.deepgram.com/docs/diarization
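
For reference, a rough sketch of what "changing the settings" could look like at the HTTP level. It assumes Deepgram's REST endpoint `https://api.deepgram.com/v1/listen` and the `diarize` query parameter described on the linked docs page; `build_listen_url` is a hypothetical helper and is not part of Transcribe. It only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed Deepgram transcription endpoint (see the linked docs page).
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(diarize=True, punctuate=True):
    """Return the request URL with diarization toggled via query parameters.

    Hypothetical helper for illustration only.
    """
    params = {
        # Query values are sent as lowercase "true"/"false" strings.
        "diarize": str(diarize).lower(),
        "punctuate": str(punctuate).lower(),
    }
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_listen_url())
```

The actual request would also need an `Authorization` header with a Deepgram API key and the audio payload, which are left out here.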

vivekuppal commented 8 months ago

Getting diarization from Deepgram is fairly simple, like you said: it is a matter of providing the correct config settings. The code would need to be adjusted to go from the current labels (You, Speaker) to You, Speaker-1, Speaker-2, ..., Speaker-N.
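
A minimal sketch of the relabeling step, in case it helps whoever picks this up. It assumes that, with diarization enabled, each word in Deepgram's response carries a zero-based integer `speaker` field alongside `word` (and optionally `punctuated_word`). The names `speaker_label` and `words_to_turns` are hypothetical and do not exist in the current code.

```python
from itertools import groupby


def speaker_label(speaker_index):
    """Map a zero-based speaker index to a Speaker-N label (Speaker-1, Speaker-2, ...)."""
    return f"Speaker-{speaker_index + 1}"


def words_to_turns(words):
    """Group consecutive words by speaker into (label, text) turns.

    `words` is assumed to be a list of dicts shaped like the per-word
    entries in a diarized Deepgram response.
    """
    turns = []
    # groupby only merges adjacent items, so a speaker who talks again
    # later gets a new turn instead of being merged into an earlier one.
    for speaker, group in groupby(words, key=lambda w: w.get("speaker", 0)):
        text = " ".join(w.get("punctuated_word") or w["word"] for w in group)
        turns.append((speaker_label(speaker), text))
    return turns


if __name__ == "__main__":
    sample = [
        {"word": "hello", "punctuated_word": "Hello,", "speaker": 0},
        {"word": "there", "punctuated_word": "there.", "speaker": 0},
        {"word": "hi", "punctuated_word": "Hi!", "speaker": 1},
        {"word": "thanks", "speaker": 0},
    ]
    for label, text in words_to_turns(sample):
        print(f"{label}: {text}")
```

The microphone side would keep its existing You label; only the speaker-side transcript would go through a step like this.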

I would be happy to help anyone if they want to take this on.

vivekuppal commented 7 months ago

Deepgram has been upgraded to version 3.1.x, in case it helps in any way. Diarization has been added to an example.