SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

I can't use language='zh' when I use large-v3 #778

Open wntg opened 3 months ago

wntg commented 3 months ago

warning: The current model is English-only but the language parameter is set to 'zh'; using 'en' instead.

trungkienbkhn commented 3 months ago

@wntg, hello. It seems you are using an English-only model such as tiny.en or small.en. Could you try again with a multilingual model? For example:

from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda")
segments, info = model.transcribe("audio.mp3", language="zh")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

wntg commented 3 months ago

I used large-v3:

model_size = "large-v3"
model = WhisperModel(model_size, device="cuda", compute_type="float16")

trungkienbkhn commented 3 months ago

That is odd, because "large-v3" is a multilingual model. This warning is emitted only when the loaded model reports itself as English-only (see the condition linked here). Could you share your full code and attach the audio file?
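
To make the condition behind the warning concrete, here is a minimal sketch of the kind of check that would produce it. It is a simplified stand-in written for illustration, not the library's actual code: the function name `resolve_language` and the `is_multilingual` argument are hypothetical, and only the warning text is taken from the report above.

```python
import logging

logger = logging.getLogger("language_check_sketch")


def resolve_language(language, is_multilingual):
    """Return the language a transcription call would end up using.

    Hypothetical helper: a simplified stand-in for the check discussed
    in this thread, not faster-whisper's implementation.
    """
    if language is None:
        # No language requested: an English-only model can only use "en".
        # For a multilingual model, None means "leave it to detection".
        return None if is_multilingual else "en"

    if not is_multilingual and language != "en":
        # This is the branch that produces the warning from the report.
        logger.warning(
            "The current model is English-only but the language parameter "
            "is set to '%s'; using 'en' instead.",
            language,
        )
        return "en"

    # Multilingual model, or an explicit "en": keep the request as-is.
    return language


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(resolve_language("zh", is_multilingual=True))
    print(resolve_language("zh", is_multilingual=False))
```

If the real check works along these lines, the warning with "large-v3" would mean the loaded model is being reported as English-only. Assuming the wrapped CTranslate2 model exposes an `is_multilingual` property, printing `model.model.is_multilingual` right after loading may help confirm that.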