BjoernRave opened this issue 7 months ago (status: Open)
1. Call whisper with return_timestamps: "word"
2. Inspect the output
Could you please provide a link to the audio file tested?
Hey @xenova, really big thanks for this awesome project. I also have the wrong-timestamps issue. From my tests, it looks like changing the stride param fixes it, but maybe it's a deeper issue.
Whisper web with only
return_timestamps: "word",
Whisper web with word-level timestamps and a fixed value of stride_length_s=3,
set in worker.js (line 160) instead of the original isDistilWhisper ? 3 : 5:
stride_length_s: 3, //isDistilWhisper ? 3 : 5,
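For anyone who wants to try the same workaround outside whisper-web, here is a minimal sketch of an options object with the stride pinned to 3 seconds. The helper name buildWordTimestampOptions, the 30 second chunk length, and the sanity check are assumptions for illustration only. The three option keys are the ones discussed in this thread.

```javascript
// Hypothetical helper, not part of whisper-web or transformers.js.
// Builds the options object with a fixed stride, mirroring the change above.
function buildWordTimestampOptions({ chunkLengthS = 30, strideLengthS = 3 } = {}) {
  // Assumed sanity check: the stride overlaps both sides of a chunk,
  // so a chunk has to be longer than twice the stride.
  if (strideLengthS * 2 >= chunkLengthS) {
    throw new RangeError("stride_length_s must be less than half of chunk_length_s");
  }
  return {
    return_timestamps: "word",      // word-level timestamps
    chunk_length_s: chunkLengthS,   // assumed default of 30, adjust for your model
    stride_length_s: strideLengthS, // fixed value instead of isDistilWhisper ? 3 : 5
  };
}

const options = buildWordTimestampOptions();
console.log(options.stride_length_s); // 3
```

The resulting object would then be passed as the second argument when calling the transcription pipeline. Whether a stride of 3 fixes the timestamps for every model and audio file is untested here.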
Codesandbox link with the changes that fix the timestamps
Attaching the audio file I tested with: output.wav.zip
Thanks and have a great day!
System Info
"@xenova/transformers": "^2.14.0",
MacBook with M2 chip, macOS Sonoma
Node.js: 20.11.0
Environment/Platform
Description
I am running whisper like this:
However the returned word-level timestamps are all equal to the total duration of the audio file.
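To make this symptom easy to check programmatically, here is a rough sketch that flags output where every word timestamp has collapsed onto the audio duration. It assumes the output has the shape { text, chunks: [{ text, timestamp: [start, end] }] }. The helper name timestampsCollapsed, the tolerance, and the sample data are made up for illustration and are not taken from a real run.

```javascript
// Hypothetical helper: returns true when every word-level timestamp
// (both start and end) equals the audio duration within a small tolerance.
function timestampsCollapsed(output, audioDurationS, toleranceS = 0.05) {
  const chunks = output?.chunks ?? [];
  if (chunks.length === 0) return false; // nothing to judge
  return chunks.every(({ timestamp }) =>
    Array.isArray(timestamp) &&
    timestamp.every(
      (t) => typeof t === "number" && Math.abs(t - audioDurationS) <= toleranceS
    )
  );
}

// Illustrative data only.
const broken = {
  text: " hello world",
  chunks: [
    { text: " hello", timestamp: [4.2, 4.2] },
    { text: " world", timestamp: [4.2, 4.2] },
  ],
};
const healthy = {
  text: " hello world",
  chunks: [
    { text: " hello", timestamp: [0.0, 0.4] },
    { text: " world", timestamp: [0.4, 0.9] },
  ],
};

console.log(timestampsCollapsed(broken, 4.2));  // true
console.log(timestampsCollapsed(healthy, 4.2)); // false
```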
During the run, my console also gets flooded with logs like this:
There is a related PR in the Python project: https://github.com/huggingface/transformers/pull/25607
Reproduction
return_timestamps: "word"