Question about SV recognition results

modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Apache License 2.0

Question about SV recognition results #35

Closed anonymous530 closed 10 months ago

anonymous530 commented 10 months ago

Problem description

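I am using SV for speaker verification. One audio clip contains speech; the other is almost silent (no speech). The score should fall below the 0.6 threshold, but it comes out above 0.6. Is there any way to see what the model based its decision on? Also, what is a suitable value for this threshold in general?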

Model used

damo/speech_campplus_sv_cn_cnceleb_16k

Result

{'score': 0.68535, 'text': 'yes'}
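For readers wondering how a result like this maps to a decision, here is a minimal sketch. It assumes the score is the cosine similarity between two speaker embeddings and that the answer is 'yes' when the score reaches the threshold; the thread does not confirm either point. The function names `cosine_score` and `verify` and the example vectors are hypothetical, and only the 0.6 threshold comes from the question above.

```python
import math


def cosine_score(emb_a, emb_b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embedding has zero norm")
    return dot / (norm_a * norm_b)


def verify(emb_a, emb_b, threshold=0.6):
    """Assumed decision rule: same speaker when score >= threshold."""
    score = cosine_score(emb_a, emb_b)
    return {"score": round(score, 5), "text": "yes" if score >= threshold else "no"}


# Hypothetical 2-D embeddings; real speaker embeddings have far more dimensions.
print(verify([3.0, 4.0], [4.0, 3.0]))
print(verify([3.0, 4.0], [4.0, 3.0], threshold=0.97))
```

Under this assumed rule the same pair of clips can flip between 'yes' and 'no' purely by moving the threshold, so a score slightly above 0.6 is a borderline outcome rather than strong evidence.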

yfchenlucky commented 10 months ago

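The right threshold depends on your test samples. You are using the academic model we provide on ModelScope; you could switch to the released industrial models and keep the default threshold for now, which should give better recognition performance. See: https://modelscope.cn/models/damo/speech_eres2net_sv_zh-cn_16k-common/summary or https://modelscope.cn/models/damo/speech_campplus_sv_zh-cn_16k-common/summary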
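One common way to pick a threshold from your own test samples is to score labeled same-speaker and different-speaker pairs and take the point where the false-accept and false-reject rates are closest (the equal error rate point). The sketch below illustrates that idea only; it is not the maintainers' procedure. The function name `eer_threshold` and all scores are made up for the example.

```python
def eer_threshold(target_scores, nontarget_scores):
    """Return (threshold, far, frr) at the point where FAR and FRR are closest.

    target_scores: scores of same-speaker trials.
    nontarget_scores: scores of different-speaker trials.
    A trial is accepted when its score >= threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one score of each kind")
    best = None
    # Every observed score is tried as a candidate threshold.
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, thr, far, frr)
    _, thr, far, frr = best
    return thr, far, frr


# Made-up scores for illustration only.
same_speaker = [0.9, 0.8, 0.7, 0.65]
different_speaker = [0.1, 0.2, 0.3, 0.68]
thr, far, frr = eer_threshold(same_speaker, different_speaker)
print(f"threshold={thr}, FAR={far}, FRR={frr}")
```

If the test set includes near-silent or noisy clips as different-speaker trials, their scores feed into the chosen threshold, which is one reason the suitable value varies with the data.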

anonymous530 commented 10 months ago

OK, I'll give it a try. Thanks.