chidiwilliams / buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
https://chidiwilliams.github.io/buzz
MIT License
12.64k stars, 948 forks

Subtitles generated for a 1.5 hour long video, the timeline is inaccurate #955

Open guangxuanliu opened 1 month ago

guangxuanliu commented 1 month ago

When transcribing a 1.5-hour-long video, the generated subtitles have an inaccurate timeline and do not match the audio.

Even when using Whisper Large-v3, the situation remains the same.

What adjustments do I need to make so that the generated subtitles are more accurate?

Operating system: Windows 10
Software version: Buzz 1.1.0

guangxuanliu commented 1 month ago

In addition, Buzz performs well when transcribing short videos.

raivisdejus commented 1 month ago

Some ideas that may help in the short term are here: https://github.com/chidiwilliams/buzz/discussions/946

Work on a longer-term solution is in progress.

guangxuanliu commented 1 month ago

OK, thanks for your reply and advice. I hope the new version can solve this problem.