Dadangdut33 / Speech-Translate

A realtime speech transcription and translation application using Whisper OpenAI and free translation API. Interface made using Tkinter. Code written fully in Python.
MIT License
423 stars, 55 forks

[BUG] Subtitles lose sync #27

Closed: k566o closed this issue 8 months ago

k566o commented 1 year ago

I have compared your version to another Whisper-based tool called StoryToolkitAI.

I think yours gets the subtitle segmentation right, while StoryToolkitAI produces large paragraphs. The problem with your version is that it loses sync, maybe due to prolonged loud noise mixed with some speech (I have seen this phenomenon with other Whisper versions).

In this example, StoryToolkitAI is the large font and Speech-Translate is the small font. Go to around 2 minutes to see where your version loses sync and starts showing subtitles before the words have been said, while StoryToolkitAI keeps sync. The translate option is being used for both, as well as Large dictionary V2. Russian is the language being translated: https://www.youtube.com/watch?v=S8e80gE8YVk

Dadangdut33 commented 1 year ago

Hey, thanks for letting me know. This is actually a known problem with Whisper, and there are some discussions about it:

https://github.com/openai/whisper/discussions/89
https://github.com/openai/whisper/discussions/435

Based on those discussions, you can play around with the model's parameters, such as setting --condition_on_previous_text False
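For anyone calling Whisper from Python rather than the command line, the same setting can be passed as a keyword argument to `transcribe()`. The following is only a minimal sketch, not how Speech-Translate itself wires it up: the helper names `build_transcribe_options` and `transcribe_file` are made up for illustration, and the `large-v2` model name and `ru` language code are assumptions based on the report above. Turning conditioning off may reduce runaway drift, at the cost of less consistent wording between segments.

```python
def build_transcribe_options(language="ru", translate=True):
    """Build keyword arguments for whisper's model.transcribe().

    Hypothetical helper for illustration; the defaults mirror the bug
    report (Russian audio translated to English) and are assumptions.
    """
    return {
        "language": language,
        "task": "translate" if translate else "transcribe",
        # Do not feed the previous window's text back in as a prompt,
        # so one bad window is less likely to derail the following ones.
        "condition_on_previous_text": False,
    }


def transcribe_file(audio_path, model_name="large-v2"):
    """Transcribe one file and return whisper's list of segments.

    Each segment is a dict that includes "start", "end" and "text".
    """
    # Imported lazily so the options helper works without whisper installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_transcribe_options())
    return result["segments"]
```

Calling `transcribe_file("clip.mp4")` (a placeholder path) and printing each segment's `start` and `end` next to its `text` is a quick way to check whether the timestamps still run ahead of the speech.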

Someone also seems to have made a library for that. I will add it to the code here if possible, thanks :)

k566o commented 1 year ago

Brilliant, can't wait!

Dadangdut33 commented 8 months ago

Fixed in the 1.3.0 release with the implementation of stable whisper.
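For readers who want to try stable whisper outside the app, here is a rough sketch using the stable-ts package (imported as `stable_whisper`). It is not the code from the 1.3.0 release: the function names, the `large-v2` model and the `ru` language code are assumptions chosen to match the report above.

```python
from pathlib import Path


def srt_path_for(audio_path):
    """Return the path of an .srt file next to the given media file."""
    return str(Path(audio_path).with_suffix(".srt"))


def transcribe_to_srt(audio_path, model_name="large-v2", language="ru"):
    """Transcribe with stable-ts and write segment-level subtitles.

    Hypothetical helper for illustration; model and language defaults
    are assumptions, not settings taken from Speech-Translate.
    """
    # Imported lazily so srt_path_for works without stable-ts installed.
    import stable_whisper  # pip install stable-ts

    model = stable_whisper.load_model(model_name)
    result = model.transcribe(audio_path, language=language, task="translate")

    out_path = srt_path_for(audio_path)
    # word_level=False writes one cue per segment instead of per word.
    result.to_srt_vtt(out_path, word_level=False)
    return out_path
```

Running `transcribe_to_srt("clip.mp4")` (a placeholder path) should leave a `clip.srt` next to the input, which can then be loaded over the video to compare timing against the earlier output.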