modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com
Other
4.46k stars, 493 forks

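Speaker diarization: with Xinwen Lianbo (CCTV news broadcast) audio, the male and female anchors' "Good evening" cannot be separated into two speakers #1853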

Closed wuhongsheng closed 1 day ago

wuhongsheng commented 4 days ago

[image]

xxxzsgxxx commented 3 days ago

Could you show which model you are using?

[image]

java668 commented 1 day ago

How can you tell which of SPEAKER_0 and SPEAKER_2 is the man and which is the woman?

wuhongsheng commented 1 day ago

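Found the cause. The CAM++ speaker recognition step merges the speaker IDs of short sentences; lowering that threshold works around the problem. [image]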
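
For anyone landing here later, this is a toy sketch of why a short-sentence merge step can collapse two anchors into one speaker. It is not FunASR's or CAM++'s actual code: the `Segment` type, the `merge_short_segments` function, the `min_duration` parameter, and the "inherit the previous speaker" rule are all assumptions made up for illustration, since the thread does not show the real parameter name or merge logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: int   # speaker ID assigned by the clustering step


def merge_short_segments(segments: List[Segment], min_duration: float) -> List[Segment]:
    """Hypothetical merge rule: a segment shorter than min_duration
    inherits the speaker ID of the segment before it."""
    merged: List[Segment] = []
    for seg in segments:
        duration = seg.end - seg.start
        if merged and duration < min_duration:
            # Too short to trust on its own: reuse the previous speaker ID.
            seg = Segment(seg.start, seg.end, merged[-1].speaker)
        merged.append(seg)
    return merged


# Two short greetings from different anchors, then a longer sentence.
segments = [
    Segment(0.0, 0.8, 0),   # anchor A: "Good evening"
    Segment(0.9, 1.7, 1),   # anchor B: "Good evening"
    Segment(1.8, 6.0, 0),   # anchor A: longer sentence
]

high = [s.speaker for s in merge_short_segments(segments, min_duration=1.0)]
low = [s.speaker for s in merge_short_segments(segments, min_duration=0.5)]
print(high)  # anchor B's greeting is absorbed into speaker 0
print(low)   # both greetings keep their own speaker IDs
```

With the larger threshold the second greeting is shorter than the cutoff and gets relabeled, which matches the symptom in the issue. With the smaller threshold both greetings survive as separate speakers. The trade-off is presumably that a lower threshold also lets through more spurious speaker switches on very short segments.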