mli / autocut

Cut videos with a text editor
Apache License 2.0
6.6k stars, 659 forks

Is there an option to split at every single character? #84

Open chenmiaomiao opened 1 year ago

chenmiaomiao commented 1 year ago

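I only started making videos recently, and I honestly don't know whether splitting at every character is the right approach. But my videos contain small mistakes that I'd like to remove with as little effort as possible. Can the model be set to split per character or per word?

If not, I have a rough idea: first run recognition at the normal segment length, then break the sentences apart, and finally match the broken-up pieces against the audio to recover their timestamps. I'd quite like to implement this, but I'm not sure whether it's necessary.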
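
The three steps above can be sketched roughly as follows. This is only an illustration, not anything from autocut: the `Piece` class and `split_segment` function are hypothetical names, and the last step (matching pieces against the audio) is replaced here by a much cruder stand-in that gives each character an equal share of the segment's duration. A real implementation would need proper alignment against the audio.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    text: str
    start: float  # seconds
    end: float    # seconds


def split_segment(text: str, start: float, end: float) -> list[Piece]:
    """Split one recognized segment into single characters with estimated times.

    Assumption: every non-space character takes an equal share of the
    segment's duration. This is linear interpolation, not audio alignment.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return []
    step = (end - start) / len(chars)
    return [
        Piece(c, start + i * step, start + (i + 1) * step)
        for i, c in enumerate(chars)
    ]


# Hypothetical segment: three characters recognized between 1.0 s and 2.5 s.
pieces = split_segment("剪视频", 1.0, 2.5)
for p in pieces:
    print(f"{p.start:.2f}-{p.end:.2f} {p.text}")
```

Equal spacing will drift wherever the speaker pauses or speeds up, so it is probably only useful as a first guess before a real alignment pass.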

Jonham commented 1 year ago

Whisper's newly released API supports word-level timestamps.
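
For reference, a minimal sketch of reading word-level timings, assuming the comment refers to the `word_timestamps=True` option of `transcribe` in the open-source `openai-whisper` package. The helper names `flatten_words` and `transcribe_words` are made up for illustration, and the sample data is hand-written rather than real model output.

```python
def flatten_words(result: dict) -> list[tuple[float, float, str]]:
    """Collect (start, end, word) tuples from a Whisper-style result dict."""
    words = []
    for segment in result.get("segments", []):
        # "words" is only present when word timestamps were requested.
        for w in segment.get("words", []):
            words.append((w["start"], w["end"], w["word"].strip()))
    return words


def transcribe_words(path: str, model_name: str = "base"):
    """Run Whisper on a file and return per-word timings (needs openai-whisper)."""
    import whisper  # imported lazily so flatten_words works without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path, word_timestamps=True)
    return flatten_words(result)


# Hand-written sample in the same shape as a transcription result.
sample = {
    "segments": [
        {
            "words": [
                {"word": " cut", "start": 0.0, "end": 0.4},
                {"word": " video", "start": 0.4, "end": 0.9},
            ]
        }
    ]
}
print(flatten_words(sample))
```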

momobobe commented 1 year ago

https://github.com/linto-ai/whisper-timestamped has already implemented this, so it just needs a contributor to adapt it for autocut. @mli @yihong0618 @zcf0508

Also see https://github.com/m-bain/whisperX for reference.