SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License
11.39k stars, 951 forks

When testing Chinese, there are no punctuation marks in the results! #662

Open: Yaodada12 opened 7 months ago

Yaodada12 commented 7 months ago

I tried both faster-whisper-v2 and faster-whisper-v3, and the Chinese results contain no punctuation marks.

from faster_whisper import WhisperModel

model = WhisperModel("large-v3")

segments, info = model.transcribe("zh_audio.mp3")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

trungkienbkhn commented 7 months ago

@Yaodada12, hello. In my test, large-v3 gave poor quality and no punctuation, while large-v2 gave quite good quality. I then added the option condition_on_previous_text=False with the large-v3 model, and the quality improved a lot. Can you try again with this option? My code:

model = WhisperModel('large-v3', device='cuda')
segments, info = model.transcribe('zh.m4a', word_timestamps=True, condition_on_previous_text=False)
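
To check whether the option really changes the amount of punctuation, a rough sketch like the one below could compare the two settings on the same file. The helper names punctuation_ratio, transcribe_text and compare are hypothetical and not part of faster-whisper; only WhisperModel, transcribe() and condition_on_previous_text come from the library, and the audio file name is simply the one used above. Treat it as a quick diagnostic under those assumptions, not as a proper quality metric.

```python
# Hypothetical helpers for comparing punctuation between two transcription runs.
# Only WhisperModel and transcribe() are faster-whisper APIs; everything else
# here is illustrative.

# Common full-width Chinese marks plus their ASCII counterparts.
PUNCTUATION_MARKS = set("，。！？；：、,.!?;:")


def punctuation_ratio(text):
    """Return the fraction of characters in `text` that are punctuation marks."""
    stripped = text.strip()
    if not stripped:
        return 0.0
    return sum(ch in PUNCTUATION_MARKS for ch in stripped) / len(stripped)


def transcribe_text(model_name, audio_path, **options):
    """Transcribe `audio_path` and return the concatenated segment text."""
    # Imported lazily so punctuation_ratio() can be used without the library.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name)
    segments, _info = model.transcribe(audio_path, **options)
    return "".join(segment.text for segment in segments)


def compare(audio_path, model_name="large-v3"):
    """Print the punctuation ratio with and without conditioning on previous text."""
    for options in ({}, {"condition_on_previous_text": False}):
        text = transcribe_text(model_name, audio_path, **options)
        print(options, round(punctuation_ratio(text), 3))


# Example call (needs faster-whisper installed and a local audio file):
# compare("zh.m4a")
```

A ratio close to zero on a long recording suggests the punctuation was dropped; it says nothing about whether the punctuation that does appear is correct.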

Yaodada12 commented 7 months ago

Thanks, I will try.

hscspring commented 5 months ago

Same issue here. I use large-v2 for ZH.

mru4913 commented 4 months ago

@hscspring I cannot even transcribe 'zh'