Closed: zhy844694805 closed this issue 2 weeks ago
This sounds very interesting. May I ask about your current LLM setup? We should investigate. I'm curious: what is the best TTS solution for Chinese?
The best Chinese TTS is ChatTTS: https://github.com/lenML/ChatTTS-Forge. Its API supports OpenAI-compatible formats. The LLM I use is a locally trained Qwen1.5-14b. As for the prompt, I haven't changed anything.
Alright, let's get back to the original topic. I use a locally deployed whisper-v3, and after integrating it with Amica, the transcription always comes out in English. In my testing, only the single-language version, whisper-v2-zh, transcribes in Chinese. So I am now fairly sure that the transcription language setting in Amica's STT part automatically translates everything to English.
I'll try to identify, and think of some automatic way of detecting this and fixing it.
https://github.com/semperai/amica/blob/master/src/features/openaiWhisper/openaiWhisper.ts

```ts
// Request body
const formData = new FormData();
formData.append('file', file);
formData.append('model', config('openai_whisper_model'));
formData.append('language', 'en');
if (prompt) {
  formData.append('prompt', prompt);
}
```
According to this code, the input audio file will indeed be transcribed into English. The key is the line `formData.append('language', 'en')`, which explicitly specifies English (`'en'`) as the target language for transcription. Therefore, regardless of the original language of the audio file, the OpenAI Whisper API will attempt to transcribe it into English text.
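A minimal sketch of one possible fix (this is not the actual Amica code): make the language a parameter instead of hardcoding `'en'`, and omit the field entirely when no language is set, since the OpenAI Whisper API auto-detects the spoken language when `language` is absent. A real patch would read the value from a new, hypothetical setting such as `config('openai_whisper_language')`.

```typescript
// Build the Whisper transcription request body with a configurable
// language. Passing no language omits the field so the Whisper API
// auto-detects the spoken language; passing 'zh' forces Chinese output.
function buildTranscriptionForm(
  file: Blob,
  model: string,
  language?: string,
  prompt?: string,
): FormData {
  const formData = new FormData();
  formData.append('file', file);
  formData.append('model', model);
  // Only send `language` when the user has configured one.
  if (language) {
    formData.append('language', language);
  }
  if (prompt) {
    formData.append('prompt', prompt);
  }
  return formData;
}
```

With this shape, a Chinese user would pass `'zh'`, and users who leave the setting empty would get Whisper's automatic language detection instead of forced English.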
I have added my locally deployed faster-large-v3 STT, and when I call it directly (e.g. via curl) it can transcribe Chinese. However, when I call the API through the Amica project, it transcribes the Chinese input directly into English. I would like to know where I can disable this setting that forces transcription into English.
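Until the hardcoded `'en'` is made configurable upstream, one workaround is to call the OpenAI-compatible transcription endpoint of the local server directly, forcing Chinese. The sketch below assumes a faster-whisper server exposing `/v1/audio/transcriptions`; the base URL and model name are placeholders for your own setup, not values taken from Amica.

```typescript
// Build a transcription request that forces Chinese ('zh') against a
// locally deployed, OpenAI-compatible STT server. URL and model name
// below are placeholders: adjust them to your deployment.
function buildChineseSttRequest(file: Blob): { url: string; body: FormData } {
  const body = new FormData();
  body.append('file', file, 'audio.wav');
  body.append('model', 'large-v3');   // placeholder local model name
  body.append('language', 'zh');      // force Chinese transcription
  return { url: 'http://localhost:8000/v1/audio/transcriptions', body };
}

// Send the request and return the transcribed text.
async function transcribeChinese(file: Blob): Promise<string> {
  const { url, body } = buildChineseSttRequest(file);
  const res = await fetch(url, { method: 'POST', body });
  if (!res.ok) {
    throw new Error(`STT request failed: ${res.status}`);
  }
  const data = await res.json();
  return data.text as string;
}
```

Testing with this kind of direct call is also a quick way to confirm whether the English output comes from Amica's request (the `language` field) rather than from the STT server itself.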