How to change the language recognition of Deepgram API? I want it to recognize Chinese instead of English. I tried to modify the language in DeepgramSTTModel in the transcriber_models.py file, but it still only recognizes English. #189

vivekuppal / transcribe

Transcribe is a real-time transcription, conversation, and language learning platform. It provides live transcripts from the microphone and speaker. It generates a suggested conversation response using OpenAI's GPT API. It will read out the responses, simulating a real live conversation in English or another language.

MIT License

194 stars, 46 forks

    def __init__(self, stt_model_config: dict):
        # Check for api_key
        if stt_model_config["api_key"] is None:
            raise Exception("Attempt to create Deepgram STT Model without an api key.")  # pylint: disable=W0719
        # self.lang = 'en-US'
        self.lang = 'zh-CN'
        print('[INFO] Using Deepgram API for transcription.')
        self.audio_model = DeepgramClient(stt_model_config["api_key"])

The configuration is not clear from the issue description. Are you using command line parameters or override.yaml to select Deepgram?

The observation is correct: Deepgram currently does not recognize any language other than English.

I believe the following change will resolve the issue.

Add the line detect_language=True to the options here: https://github.com/vivekuppal/transcribe/blob/f25f0874eaa298079e2bff0fd2e58ddec389cc08/sdk/transcriber_models.py#L311

The method will look like this with the additional option of detecting the language.

    def get_transcription(self, wav_file_path: str):
        """Get text using STT
        """
        try:
            with open(wav_file_path, "rb") as audio_file:
                buffer_data = audio_file.read()

            payload: FileSource = {
                "buffer": buffer_data
                }

            options = PrerecordedOptions(
                model="nova",
                smart_format=True,
                utterances=True,
                punctuate=True,
                paragraphs=True,
                detect_language=True)  # added: ask Deepgram to detect the spoken language

            response = self.audio_model.listen.prerecorded.v("1").transcribe_file(payload, options)
            # This is not necessary and just a debugging aid
            with open('logs/deep.json', mode='a', encoding='utf-8') as deep_log:
                deep_log.write(response.to_json(indent=4))

            return response
        except Exception as exception:
            print(exception)

        return None
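To check whether the added option takes effect, the response can be inspected for the language Deepgram reports. The following is a minimal sketch, not part of the repository: the helper name summarize_deepgram_response is hypothetical, and it assumes the response JSON carries results.channels[].detected_language and results.channels[].alternatives[].transcript. Note that logs/deep.json is opened in append mode, so the helper expects the JSON of a single response (for example the string returned by response.to_json()), not the whole log file.

```python
import json
from typing import List, Optional, Tuple


def summarize_deepgram_response(response_json: str) -> List[Tuple[Optional[str], str]]:
    """Return (detected_language, transcript) for each channel of one response.

    Hypothetical helper. The field names below are assumptions about the
    shape of a Deepgram prerecorded response.
    """
    data = json.loads(response_json)
    summary = []
    for channel in data.get("results", {}).get("channels", []):
        # Use the first alternative if there is one, else an empty placeholder.
        alternatives = channel.get("alternatives") or [{}]
        summary.append(
            (channel.get("detected_language"), alternatives[0].get("transcript", ""))
        )
    return summary


# Hand-written sample in the assumed response shape, not real API output.
sample = json.dumps(
    {
        "results": {
            "channels": [
                {
                    "detected_language": "zh",
                    "alternatives": [{"transcript": "你好"}],
                }
            ]
        }
    }
)
print(summarize_deepgram_response(sample))
```

If the reported language is still English for Chinese audio, the option is probably not reaching the request.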

This should resolve the issue.
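If the audio is always expected to be Chinese, an alternative to detection would be to pass the language explicitly, so that the self.lang value set in __init__ is actually used. This is only a sketch under assumptions: build_option_kwargs is a hypothetical helper, it assumes PrerecordedOptions accepts a language keyword, and whether the "nova" model supports zh-CN is not confirmed in this thread.

```python
from typing import Optional


def build_option_kwargs(lang: Optional[str] = None) -> dict:
    """Build keyword arguments for PrerecordedOptions (hypothetical helper).

    With a language code, request that language explicitly (assumes the
    options class accepts a `language` keyword). Without one, fall back to
    detect_language=True as suggested above.
    """
    kwargs = {
        "model": "nova",
        "smart_format": True,
        "utterances": True,
        "punctuate": True,
        "paragraphs": True,
    }
    if lang:
        kwargs["language"] = lang
    else:
        kwargs["detect_language"] = True
    return kwargs


print(build_option_kwargs("zh-CN"))
# Inside get_transcription this could be used as:
#     options = PrerecordedOptions(**build_option_kwargs(self.lang))
```

The helper sets only one of the two options at a time, on the assumption that an explicit language and automatic detection are not meant to be combined.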
