Gan-Xing closed this issue 1 week ago
I have also tested with French audio, and the issue persists. Here is the curl
command and the result:
curl -X POST "http://localhost:8000/v1/audio/transcriptions" \
-F "file=@/mnt/raid1/backup/boncourage.mp3" \
-F "model=Systran/faster-distil-whisper-large-v3" \
-F "language=fr"
The response:
{"text":"You've got the bantraille. All, my frere."}
The French audio is also being translated to English instead of being transcribed.
Attachment: boncourage.mp3.zip
I found a solution to the problem. The model Systran/faster-distil-whisper-large-v3
does not support Chinese or French; it only supports English. Here is a successful transcription using a model that supports Chinese:
curl -X POST "http://172.16.2.68:8000/v1/audio/transcriptions" \
-F "file=@test.mp3" \
-F "model=Systran/faster-whisper-large-v2" \
-F "language=zh" \
-F "response_format=json" \
-F "temperature=0"
The response:
{"text":"这是一段录音测试用来进行语音转文字的测试"}
You can find available models that support different languages at Systran on Hugging Face.
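The workaround above can also be scripted on the client side. This is only a minimal sketch, not part of faster-whisper-server: the helper name `build_transcription_form` and the rule "use the distil model only for English" are assumptions drawn from this thread, and the two model names are the ones used in the curl commands above.

```python
# Hypothetical client-side helper; not part of faster-whisper-server.
# Model names are taken from the curl commands in this thread.
ENGLISH_ONLY_MODEL = "Systran/faster-distil-whisper-large-v3"
MULTILINGUAL_MODEL = "Systran/faster-whisper-large-v2"


def build_transcription_form(language: str) -> dict[str, str]:
    """Return the non-file form fields for POST /v1/audio/transcriptions."""
    lang = language.strip().lower()
    # Assumption from this thread: the distil model only handles English,
    # so any other language falls back to the multilingual model.
    model = ENGLISH_ONLY_MODEL if lang == "en" else MULTILINGUAL_MODEL
    return {
        "model": model,
        "language": lang,
        "response_format": "json",
        "temperature": "0",
    }


fields = build_transcription_form("zh")
print(fields["model"])  # Systran/faster-whisper-large-v2
```

The returned fields correspond to the `-F` arguments in the curl command; the audio file itself would still be attached as the `file` field.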
Thanks for such a detailed issue. As you already discovered, distil models only support English.
It looks like the supported languages can be found in the README.md of each model. What I'll end up doing here is adding a check on the transcription route that ensures the model supports the requested language; if it doesn't, a 4xx
response will be returned. This will give users immediate feedback, letting them know when what they are trying to do is not possible.
Again, I greatly appreciate you taking the time to create and follow up on the issue. If you have any feature requests, please let me know.
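For illustration, the check described above might look roughly like the following. This is a sketch under stated assumptions, not the server's actual implementation: the `SUPPORTED_LANGUAGES` table, the `check_language` function, the `UnsupportedLanguageError` class and the choice of status code 422 are all hypothetical (the comment only says "4xx"), and a real server would need to obtain the language list per model, for example from its README.md.

```python
from typing import Optional

# Assumed data for illustration; only the distil entry is backed by this thread.
SUPPORTED_LANGUAGES = {
    "Systran/faster-distil-whisper-large-v3": {"en"},
}


class UnsupportedLanguageError(Exception):
    """Hypothetical error that a route handler could map to a 4xx response."""

    status_code = 422  # assumption: any 4xx code would fit the description


def check_language(model: str, language: Optional[str]) -> None:
    """Raise if the model is known not to support the requested language."""
    supported = SUPPORTED_LANGUAGES.get(model)
    # No language requested, or no information about the model: allow it.
    if language is None or supported is None:
        return
    if language not in supported:
        raise UnsupportedLanguageError(
            f"Model {model!r} does not support language {language!r}; "
            f"supported: {sorted(supported)}"
        )


try:
    check_language("Systran/faster-distil-whisper-large-v3", "fr")
except UnsupportedLanguageError as err:
    print(err.status_code, err)
```

With a check like this, the French request from earlier in the thread would fail fast with an explicit error instead of returning English text.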
Description
I am using fedirz/faster-whisper-server to transcribe Chinese audio files, but the output is incorrectly translated to English. I only need transcription, not translation.
Environment
fedirz/faster-whisper-server:latest-cuda
Logs
Current Behavior
The transcription output is being translated to English, despite the audio being in Chinese. Here is the curl command and the output:
Expected Behavior
The transcription output should be in Chinese, as the input audio is in Chinese and I have specified the language as zh.
Steps to Reproduce
Use a curl command to send a Chinese audio file (test.mp3) for transcription.
Additional Context
I am using a Chinese audio file that I recorded myself. I only need transcription, not translation.
Request
How can I ensure that the server only transcribes the audio and does not translate it? Is there any additional configuration or parameter that I need to set?
Thank you for your assistance!
Attachment: test.mp3.zip