linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
GNU Affero General Public License v3.0

Feature Request: option to align words with the vowel of the first syllable rather than the first consonant. #103

Open JeromeNelsonC opened 1 year ago

JeromeNelsonC commented 1 year ago

You guys are making great progress, but I have a suggestion that could make this really good. The title says it all: can we have an option to align each word with the vowel of its first syllable, rather than with the space before the word or its first consonant? Psychologically, we start making sense of a word when its first syllable is emitted, because a syllable is always made up of at least a consonant and a vowel. Done this way, I think the subtitles would have more impact.
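
To make the request concrete, here is a rough post-processing sketch of the idea, not anything the library provides. It assumes word entries shaped like dicts with `text`, `start`, and `end` keys, and it guesses the vowel onset from spelling alone by giving each letter an equal share of the word's duration. The helper names (`first_vowel_index`, `shift_to_first_vowel`) are hypothetical, and the vowel set is a Latin-alphabet simplification; a real implementation would presumably need phoneme-level alignment.

```python
# Assumption: orthographic vowels only (Latin alphabet), with "y" treated as a vowel.
VOWELS = set("aeiouyAEIOUY")


def first_vowel_index(word):
    """Return the index of the first vowel letter, or 0 if none is found."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return i
    return 0


def shift_to_first_vowel(words):
    """Return copies of the word entries with 'start' moved toward the first vowel."""
    shifted = []
    for w in words:
        text = w["text"].strip()
        duration = w["end"] - w["start"]
        # Hypothetical heuristic: each letter takes an equal share of the duration.
        offset = duration * first_vowel_index(text) / len(text) if text else 0.0
        shifted.append({**w, "start": round(w["start"] + offset, 3)})
    return shifted


# Made-up example data, shaped like word-level timestamp entries.
words = [
    {"text": "strong", "start": 1.00, "end": 1.60},
    {"text": "apple", "start": 1.70, "end": 2.10},
]
result = shift_to_first_vowel(words)
for w in result:
    print(w["text"], w["start"], w["end"])
```

With this heuristic, "strong" would start halfway through its span (after "str"), while "apple" keeps its original start because it already begins with a vowel.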