Open ExtReMLapin opened 8 months ago
Is that model working with vanilla Whisper and word_timestamps=True?
yes
Yes, gave it a try with:
import whisper
# Load model
model = whisper.load_model("./models/whisper-large-v3-french-distil-dec16/original_model.pt")
# Transcribe
result = model.transcribe("./tmp0p6z2kmk_short.wav", language="fr", word_timestamps=True)
print(result)
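For reference, when `word_timestamps=True` does work, each entry in `result["segments"]` carries a `words` list with per-word `start`/`end` times. A minimal sketch of pulling those out, using a hand-written sample dict that mimics the shape openai-whisper returns (the values are illustrative, not real model output):

```python
def flatten_words(result):
    """Collect (word, start, end) tuples from all segments of a Whisper result."""
    words = []
    for segment in result.get("segments", []):
        for w in segment.get("words", []):
            # openai-whisper keeps a leading space on each word token
            words.append((w["word"].strip(), w["start"], w["end"]))
    return words

# Sample result shaped like Whisper's output with word_timestamps=True
sample = {
    "text": " Bonjour tout le monde",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.2,
         "words": [
             {"word": " Bonjour", "start": 0.0, "end": 0.5},
             {"word": " tout", "start": 0.5, "end": 0.7},
             {"word": " le", "start": 0.7, "end": 0.8},
             {"word": " monde", "start": 0.8, "end": 1.2},
         ]},
    ],
}

for word, start, end in flatten_words(sample):
    print(f"{start:5.2f}-{end:5.2f}  {word}")
```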
@ExtReMLapin, hello. I encountered the same error as you when running with word_timestamps=True. You can see this comment.
My error came from the alignment_heads field in the model's config.json file. Can you re-check this file in your finetuned model?
For exact error, you can check this comment.
Besides, you should add condition_on_previous_text=False to improve the transcription quality.
Hope it's helpful to you.
Hi @ExtReMLapin ,
Just fixed the issue here, thanks to @Jeronymous!
By the way, for this version, it's true that condition_on_previous_text=False will yield better performance for long-form sequential decoding, as pointed out by @trungkienbkhn.
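Putting the thread's advice together, the transcribe options can be collected in one place. Only the option names and values come from the discussion above; the paths in the usage comment are placeholders from the earlier snippet:

```python
# Options suggested in this thread for long-form French transcription.
TRANSCRIBE_OPTS = {
    "language": "fr",
    "word_timestamps": True,               # word-level timings
    "condition_on_previous_text": False,   # better long-form sequential decoding
}

# Usage (requires openai-whisper and a local checkpoint):
# import whisper
# model = whisper.load_model("./models/whisper-large-v3-french-distil-dec16/original_model.pt")
# result = model.transcribe("audio.wav", **TRANSCRIBE_OPTS)
```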
Hello. When using this finetuned version of distil-whisper and trying to use word_timestamps=True, it crashes when starting the transcription; no issue with word_timestamps=False.
It's a CRASH, not a Python error: it straight-up exits the Python process, no crash message, nothing, just byebye amigo hasta la vista.