tmoroney / auto-subs

Generate subtitles using OpenAI Whisper in the DaVinci Resolve editing software.
MIT License

Timing of single-word subtitles is inaccurate #28

Open Xander1125 opened 5 months ago

Xander1125 commented 5 months ago

When using one word per line, the subtitles are typically off by 1 to 7 frames, with some off by a few seconds. By the end of a 1-minute clip, the subtitles are off by several seconds in total. The only solution seems to be manually adjusting 80% or more of the subtitle positions, and because the offsets aren't consistent (i.e., not all off by the same number of frames), it is very time-consuming to fix.

Are there any best practices for improving the timing? Does it have to do with the project frame rate, or the settings of the initial audio export by the script?
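
On the frame rate question, here is a minimal sketch of how word timestamps in seconds could be snapped to timeline frames. It is not taken from the plugin; the names `to_frame` and `words_to_frames` are hypothetical, and the `(text, start, end)` word format is an assumption. Each word is converted from its absolute timestamp rather than by adding up rounded durations, so rounding error cannot build up over the clip. It also shows that a nominal rate of 24 fps and an NTSC-style rate of 24000/1001 fps already disagree by a frame after one minute, which may or may not be related to the drift described above.

```python
import math
from fractions import Fraction


def to_frame(seconds, fps):
    """Nearest frame index for an absolute time in seconds (halves round up)."""
    return math.floor(seconds * fps + 0.5)


def words_to_frames(words, fps):
    """Convert (text, start_s, end_s) tuples to (text, start_frame, end_frame).

    Every word is converted from its own absolute timestamp, so a rounding
    error on one word never shifts the words that follow it.
    """
    framed = []
    for text, start, end in words:
        start_frame = to_frame(start, fps)
        # Keep every subtitle at least one frame long.
        end_frame = max(to_frame(end, fps), start_frame + 1)
        framed.append((text, start_frame, end_frame))
    return framed


if __name__ == "__main__":
    # Hypothetical word timings, for illustration only.
    words = [("hello", 0.0, 0.42), ("world", 0.42, 1.0)]
    print(words_to_frames(words, 24))

    # Nominal 24 fps vs. an NTSC-style 24000/1001 fps, one minute in.
    print(to_frame(60.0, 24), to_frame(60.0, Fraction(24000, 1001)))
```

If the project runs at a fractional rate such as 23.976, passing the exact `Fraction(24000, 1001)` instead of a rounded float keeps the conversion from introducing its own error.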

Superpotato99 commented 5 months ago

Yeah, the timings are not great, even with the "improve timings" option checked. This plugin is still a massive improvement over the default subtitles, though!

Xander1125 commented 5 months ago

The timings seem to improve as the model size increases. If I use another program altogether that allows the large model to be used, I get consistent results that are accurate to within a couple of frames of where the words should be, which is an acceptable tolerance.

Can we get an update that allows the large Whisper model to be used?
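
For reference, a rough sketch of what requesting word-level timestamps from the large model could look like, assuming the `openai-whisper` Python package is installed. This is not how the plugin is known to call Whisper. The helper names `flatten_words` and `transcribe_words` are hypothetical, while `whisper.load_model` and the `word_timestamps=True` option of `transcribe` are part of that package.

```python
def flatten_words(result):
    """Collect (word, start_s, end_s) tuples from a Whisper-style result dict."""
    words = []
    for segment in result.get("segments", []):
        # "words" is only present when word_timestamps=True was requested.
        for w in segment.get("words", []):
            words.append((w["word"].strip(), w["start"], w["end"]))
    return words


def transcribe_words(audio_path, model_name="large"):
    """Transcribe an audio file and return per-word timings in seconds."""
    import whisper  # openai-whisper; imported here so flatten_words works without it

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, word_timestamps=True)
    return flatten_words(result)
```

The large model needs considerably more memory and time than the smaller ones, so making the model size a user setting would probably be preferable to hard-coding it.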

Currently I am getting good results (meaning no or negligible timing or word touch-ups required) by using a combination of three tools, and it should be more or less easy to combine them into this tool, which already has Resolve integration.