Closed: kirayomato closed this issue 3 months ago
sentence_timestamp=True
But why are the lengths different? I want to add punctuation with another model, and that needs the word-level timestamps. If the lengths differ, it won't work.
Punctuation does not count.
I know that, and I already strip the punctuation:
import re
import string
from zhon.hanzi import punctuation
punctuation_zh = punctuation
punctuation_en = string.punctuation
punctuation_str = punctuation_zh + punctuation_en
res = funasr()
text = res[0]['text']
timestamp = res[0]['timestamp']
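# Escape the punctuation so characters such as ], ^ and - are taken literally inside the character class.
raw_text = re.sub('[' + re.escape(punctuation_str) + ']', '', text)
But len(raw_text) != len(timestamp).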
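One possible explanation for the remaining mismatch, which nobody in this thread confirms: the timestamps may be emitted per token rather than per character. In that case each Chinese character counts once, each English word counts once, and spaces count for nothing, so `len(raw_text)` overshoots whenever the text contains spaces or Latin words. The sketch below tokenizes under that assumption and pairs tokens with timestamps. `split_tokens` and `align` are hypothetical helper names, and the `[start, end]` pairs in the usage note are made-up values, not FunASR output.

```python
import re

# One token per CJK character, or per run of Latin letters/digits
# (with an optional apostrophe suffix such as "don't").
# Punctuation and whitespace match neither branch, so they are skipped.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[\u4e00-\u9fff]")


def split_tokens(text):
    """Return the tokens of `text` under the per-token assumption above."""
    return TOKEN_RE.findall(text)


def align(text, timestamps):
    """Pair each token with its timestamp; fail loudly if the counts differ."""
    tokens = split_tokens(text)
    if len(tokens) != len(timestamps):
        raise ValueError(
            f"{len(tokens)} tokens but {len(timestamps)} timestamps"
        )
    return list(zip(tokens, timestamps))
```

For example, `align("你好，hello world！", [[0, 200], [200, 400], [400, 800], [800, 1200]])` pairs the four tokens `你`, `好`, `hello`, `world` with the four made-up timestamps. If `align` still raises on real output, the tokenization rule assumed here does not match what the model produces, and the counts in the error message show by how much.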
Notice: In order to resolve issues more efficiently, please raise issues following the template and fill in the details.
❓ Questions and Help
Before asking:
What is your question?
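I am trying to use funasr to generate subtitles for my video, but I found that the length of the recognized text and the length of the timestamp list are not the same. How can I match the text to the timestamps?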
Code
What have you tried?
What's your environment?
Install method (pip, source): pip