linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
GNU Affero General Public License v3.0

End time of one word is the start time of the next one #71

Closed konradipipan closed 1 year ago

konradipipan commented 1 year ago

From what I can see, every word's start time is the end time of the preceding one. Is there any way to get more precise timestamps, so that the values differ from one another?

Jeronymous commented 1 year ago

Currently the word end time can be different from the next word start time when:

You can try the option detect_disfluencies, which might also add some gaps between words. This option is described in the README.
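
To check whether the option changes anything on your audio, here is a minimal sketch that measures the gap between each word's end time and the next word's start time. It assumes the result layout shown in the README (a "segments" list whose items carry a "words" list with "text", "start" and "end"), and that detect_disfluencies is accepted as a keyword argument of transcribe. The helper names word_gaps and transcribe_and_report are hypothetical and not part of the library.

```python
def word_gaps(result):
    """Return (previous word, next word, gap in seconds) for consecutive words.

    Assumes `result` looks like {"segments": [{"words": [{"text", "start", "end"}, ...]}, ...]}.
    """
    # Flatten the words of all segments into a single chronological list.
    words = [w for seg in result.get("segments", []) for w in seg.get("words", [])]
    return [
        (prev["text"], nxt["text"], round(nxt["start"] - prev["end"], 3))
        for prev, nxt in zip(words, words[1:])
    ]


def transcribe_and_report(audio_path, model_name="tiny"):
    """Hypothetical wrapper: transcribe with detect_disfluencies and return the gaps."""
    # Imported lazily so word_gaps() can be used without the library installed.
    import whisper_timestamped as whisper

    audio = whisper.load_audio(audio_path)
    model = whisper.load_model(model_name, device="cpu")
    # Assumption: detect_disfluencies is passed as a keyword argument here.
    result = whisper.transcribe(model, audio, detect_disfluencies=True)
    return word_gaps(result)
```

A gap of 0.0 means the two words share a boundary, which is the behaviour described in the question. Any positive value is a pause between the words.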