jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License
1.59k stars 177 forks

Using forced alignment #198

AmirBraham closed 1 year ago

AmirBraham commented 1 year ago

Hello, if I have the original transcript (a txt file), is there any way to improve the generated timestamps using forced alignment or any other technique? I tried playing with the initial prompt parameter, but I can't see any real improvement. Thanks for any help!

jianfch commented 1 year ago

You can try to use whisper.timing.find_alignment().
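For anyone following along, here is a rough sketch of how that function might be called on a single 30-second window. It assumes the openai-whisper package and its `find_alignment(model, tokenizer, text_tokens, mel, num_frames)` signature. The helper names `prepare_text` and `align_first_30s`, the `base` model, and the leading-space normalization are illustrative choices, not something from this thread.

```python
def prepare_text(text: str) -> str:
    """Collapse whitespace and prepend a space (hypothetical helper).

    Whisper's word tokens usually carry a leading space, so the transcript
    is normalized the same way before encoding. This is an assumption.
    """
    return " " + " ".join(text.split())


def align_first_30s(audio_path: str, text: str,
                    model_name: str = "base", language: str = "en"):
    """Align `text` to the first 30 seconds of `audio_path` (hypothetical helper)."""
    # Imported lazily so prepare_text() works without openai-whisper installed.
    import whisper
    from whisper.audio import HOP_LENGTH, N_FRAMES
    from whisper.timing import find_alignment
    from whisper.tokenizer import get_tokenizer

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(audio_path)

    # Count of real (unpadded) mel frames, capped at one 30-second window.
    num_frames = min(len(audio) // HOP_LENGTH, N_FRAMES)

    # The encoder expects exactly one padded or trimmed 30-second window.
    mel = whisper.log_mel_spectrogram(
        whisper.pad_or_trim(audio), n_mels=model.dims.n_mels
    ).to(model.device)

    tokenizer = get_tokenizer(
        model.is_multilingual, language=language, task="transcribe"
    )
    text_tokens = tokenizer.encode(prepare_text(text))

    # Returns a list of word timings, in seconds relative to the window start.
    return find_alignment(model, tokenizer, text_tokens, mel, num_frames)


if __name__ == "__main__":
    for w in align_first_30s("audio.mp3", "Text spoken in the first 30 seconds."):
        print(f"{w.start:6.2f} {w.end:6.2f} {w.word}")
```

Longer audio would need to be split into windows, with the transcript split to match, which this sketch does not attempt.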

AmirBraham commented 1 year ago

@jianfch do you have some example code? I'm not sure which params work best for the tokenizer and text_tokens.

jianfch commented 1 year ago

Try https://github.com/jianfch/stable-ts#alignment with the new update.
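A minimal sketch of what the linked alignment section describes, assuming a stable-ts version that provides `model.align()` and `result.to_srt_vtt()`. The helper names `read_transcript` and `align_to_srt`, the file names, and the `base` model are placeholders for illustration.

```python
from pathlib import Path


def read_transcript(path: str) -> str:
    """Read the plain-text transcript and trim surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()


def align_to_srt(audio_path: str, transcript_path: str, output_path: str,
                 model_name: str = "base", language: str = "en"):
    """Align an existing transcript to audio and save subtitles (hypothetical helper)."""
    # Imported lazily so read_transcript() works without stable-ts installed.
    import stable_whisper

    model = stable_whisper.load_model(model_name)
    # align() takes the known text instead of transcribing the audio,
    # so the words come from the transcript and only the timing is predicted.
    result = model.align(audio_path, read_transcript(transcript_path),
                         language=language)
    result.to_srt_vtt(output_path)
    return result


if __name__ == "__main__":
    align_to_srt("audio.mp3", "text.txt", "audio.srt")
```

Because the words come from the transcript, the output quality depends on how closely the text matches what is actually spoken.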

arvet333 commented 1 year ago

Is there an ability to use alignment via CLI? If so, where can I find the commands?

jianfch commented 1 year ago

> Is there an ability to use alignment via CLI? If so, where can I find the commands?

It was added in c90ff06bc55694034994010e05b5fc2f50070b03.

stable-ts audio.mp3 --align text.txt --language en