This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
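Hello author: I purchased the 9.9 yuan model and tested the algorithm, and it works quite well. However, I came across two female-voice samples that turned out to be very hard to tell apart in testing:
Taking another random 1.3 s segment from each for comparison:
No matter how many 1.3 s segments I randomly sample from these two samples, the similarity stays very high (around 0.6-0.7). This is strange. Is there a good way to distinguish difficult corner cases like this? Thank you very much.
The two audio files are attached: 2samples-hard.zip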
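For reference, the random 1.3 s comparison described above is roughly the sketch below. It is not the project's code: `embed` is a hypothetical placeholder for whatever function turns a waveform into a speaker embedding, the 16 kHz sample rate and the number of trials are assumptions, and cosine similarity is assumed as the score.

```python
import numpy as np

def random_crop(wave, sample_rate, seconds, rng):
    """Return a random contiguous segment of the given duration."""
    n = int(round(seconds * sample_rate))
    if len(wave) <= n:
        return wave  # audio is shorter than the crop: use it whole
    start = rng.integers(0, len(wave) - n + 1)
    return wave[start:start + n]

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / max(denom, 1e-12))  # guard against zero norm

def crop_similarities(wave_a, wave_b, embed, sample_rate=16000,
                      seconds=1.3, trials=20, seed=0):
    """Score several random crop pairs from two recordings.

    `embed` is a hypothetical callable (waveform -> 1-D embedding);
    plug in the actual model's feature extraction here.
    """
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(trials):
        emb_a = embed(random_crop(wave_a, sample_rate, seconds, rng))
        emb_b = embed(random_crop(wave_b, sample_rate, seconds, rng))
        scores.append(cosine_similarity(emb_a, emb_b))
    return np.array(scores)
```

Looking at the mean and spread of the returned scores, rather than a single pair, shows whether the high similarity is consistent across crops or depends on which segments happen to be drawn.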