Support for speaker diarization (recognizing different voices), Google

BingLingGroup / autosub

Command-line utility to transcribe/translate from video/audio/subtitles to subtitles

GNU General Public License v2.0


Support for speaker diarization (recognizing different voices), Google #125

Closed: webberian closed this issue 4 years ago

webberian commented 4 years ago

I could not find this in your documentation. Does autosub support speaker diarization with Google Speech V2? More information: https://cloud.google.com/speech-to-text/docs/multiple-voices#speech-diarization-python

(This idea is taken from issue https://github.com/agermanidis/autosub/issues/113)

BingLingGroup commented 4 years ago

Google Speech V2 doesn't support JSON config requests; I have tried it. It's just a free API with no documentation available, so this is a limitation of the API rather than an issue in autosub. BTW, you can try modifying the Google Cloud Speech-to-Text JSON request config to get the speaker diarization result.
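
For anyone trying the suggestion above, here is a minimal sketch of what such a config might look like. It assumes the Cloud Speech-to-Text v1 REST field names (`diarizationConfig`, `enableSpeakerDiarization`, `minSpeakerCount`, `maxSpeakerCount`, and `speakerTag` on each word in the response). The helper names `build_diarization_config` and `group_words_by_speaker` and the sample response are hypothetical, written only for illustration. How the config is handed to autosub is not shown here, and nothing below calls the API.

```python
import json


def build_diarization_config(language_code="en-US", min_speakers=2, max_speakers=2):
    """Build a recognition config dict with speaker diarization enabled.

    Field names are assumed to follow the Cloud Speech-to-Text v1 REST API.
    """
    return {
        "languageCode": language_code,
        "diarizationConfig": {
            "enableSpeakerDiarization": True,
            "minSpeakerCount": min_speakers,
            "maxSpeakerCount": max_speakers,
        },
    }


def group_words_by_speaker(response):
    """Collapse the word list of the last result into (speaker_tag, text) turns.

    Assumes the last result carries the full word list with speaker tags.
    """
    results = response.get("results", [])
    if not results:
        return []
    words = results[-1]["alternatives"][0].get("words", [])
    turns = []  # each entry: (speaker_tag, [words spoken in this turn])
    for w in words:
        tag = w.get("speakerTag", 0)
        if turns and turns[-1][0] == tag:
            turns[-1][1].append(w["word"])
        else:
            turns.append((tag, [w["word"]]))
    return [(tag, " ".join(ws)) for tag, ws in turns]


# Hypothetical response fragment, made up for this example only.
sample_response = {
    "results": [
        {
            "alternatives": [
                {
                    "words": [
                        {"word": "hello", "speakerTag": 1},
                        {"word": "there", "speakerTag": 1},
                        {"word": "hi", "speakerTag": 2},
                    ]
                }
            ]
        }
    ]
}

if __name__ == "__main__":
    # Print the config as JSON, e.g. to save it to a file.
    print(json.dumps(build_diarization_config(), indent=2))
    for tag, text in group_words_by_speaker(sample_response):
        print(f"Speaker {tag}: {text}")
```

Grouping consecutive words by speaker tag gives one line per speaker turn, which could then be mapped onto subtitle lines by hand or by a script.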