yeyupiaoling / VoiceprintRecognition-PaddlePaddle

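This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net and CAM++, and also supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC and Fbank.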
Apache License 2.0
218 stars 44 forks

When there is only one data in the audio_db, will crash #10

Closed fengzstrong closed 1 year ago

fengzstrong commented 1 year ago

Hi,

Great job!

I ran into a problem while using it. When audio_db contains only one entry (I registered just one WAV file), recognition fails with an error. Do you have any suggestions?

Please select a function, 0 to register audio into the voiceprint database, 1 to run voiceprint recognition: 1
Traceback (most recent call last):
  File "/usr/local/src/VoiceprintRecognition-PaddlePaddle/infer_recognition.py", line 51, in
    name = predictor.recognition(audio_data)
  File "/usr/local/src/VoiceprintRecognition-PaddlePaddle/ppvector/predict.py", line 300, in recognition
    name = self.__retrieval(np_feature=[feature])[0]
  File "/usr/local/src/VoiceprintRecognition-PaddlePaddle/ppvector/predict.py", line 160, in __retrieval
    similarity = cosine_similarity(self.audio_feature, feature.reshape(1, -1)).squeeze()
  File "/data1/bigdata/miniconda3/envs/paddlespeech/lib/python3.9/site-packages/sklearn/metrics/pairwise.py", line 1393, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
  File "/data1/bigdata/miniconda3/envs/paddlespeech/lib/python3.9/site-packages/sklearn/metrics/pairwise.py", line 155, in check_pairwise_arrays
    X = check_array(
  File "/data1/bigdata/miniconda3/envs/paddlespeech/lib/python3.9/site-packages/sklearn/utils/validation.py", line 902, in check_array
    raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[ -9.528404 20.831924 -7.2488213 -10.465841 3.648481 -27.02867 -21.46711 -7.2084546 -2.5436175 16.762877 -23.119123 -19.69215 19.306414 -29.709349 10.007728 9.711845 15.237292 -31.4283 13.326845 24.98609 -0.93558085 -13.328272 -1.4318435 -2.5589817 -14.899953 -10.004118 8.370364 -28.427952 -16.635942 17.125128 -21.187462 -13.563347 22.93637 0.2699321 -42.41188 -32.501728 -0.88023186 -17.82453 22.414608 -4.979337 -8.525855 23.49937 2.4326832 45.067253 -23.60708 -30.05538 1.6135166 -40.467884 6.419506 -22.83227 14.336002 -6.9231305 -2.9081142 -3.9221501 22.34546 15.799733 -31.135666 -11.1763735 -36.390778 20.186132 -1.3171989 3.5721273 -6.8223796 -0.87155807 0.6096292 -0.7906767 24.010586 -30.601904 42.77444 4.056578 8.387045 -32.088486 1.4989619 -16.874323 -14.909355 10.0754 26.727545 3.1605248 -7.187451 17.319191 -29.09326 -10.794649 16.416176 -39.16383 8.402718 -18.068346 -24.327047 -10.149664 32.352417
29.281029 23.015427 39.01204 -0.17142385 18.62544 -43.314125 -3.381555 -24.813742 -9.385822 -14.046436 -30.988276 -31.660748 -11.767179 12.45567 -14.2988205 -18.520817 -4.230826 7.557493 19.553474 30.747616 -22.761354 9.038652 30.985561 -43.875 -6.442013 -13.843025 -22.249443 -9.539998 -8.984698 -6.8686695 -10.849445 -3.2795053 -10.321569 20.782562 -33.36194 29.603746 5.8829403 -16.764643 33.195232 26.174906 -10.609048 -0.93014306 21.179016 3.711417 -30.199566 -3.0830624 -15.195986 3.977836 -22.489988 -32.214226 -38.9151 11.913775 5.5171957 -16.352848 -14.75191 -25.292871 12.144376 9.523496 31.895811 12.43035 25.481085 19.512949 2.1031296 28.158194 0.76135576 6.5883613 2.3754144 -9.671471 26.664793 15.48469 -17.29369 6.898872 11.445733 9.200221 -12.630247 -10.417504 0.6524699 -0.5872514 -9.257348 -18.316444 -13.710389 7.0560064 -28.609156 8.623157 -5.470796 -3.8638942 16.686884 -11.8199415 -30.546083 16.687752 9.732939 -27.76715 -24.231022 -3.1489227 -10.5073 -0.15581396 7.884112 -11.454353 -6.92933 20.55611 -7.311349 25.531921 -8.121537 ]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

yeyupiaoling commented 1 year ago

Print the shape of feature.reshape(1, -1) and check whether it is (1, 192). If it is not, you can convert it with feature[np.newaxis, :]. My guess is that your feature.reshape(1, -1) is not working.
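The shape check suggested here can be sketched with plain NumPy. The arrays below are hypothetical stand-ins (random 192-dimensional vectors), not data from the project. The traceback fails at `X = check_array(`, which suggests the 1D array may be the first argument (`self.audio_feature`) rather than the query, so the sketch prints both.

```python
import numpy as np

# Hypothetical stand-ins: a 192-dimensional query embedding, and a
# database holding a single registered embedding stored as a 1D array.
feature = np.random.rand(192).astype(np.float32)
audio_feature = np.random.rand(192).astype(np.float32)

# Both spellings turn a 1D vector into a single-row 2D matrix.
query_a = feature.reshape(1, -1)
query_b = feature[np.newaxis, :]
print(query_a.shape, query_b.shape)  # (1, 192) (1, 192)

# A 1D database array is what a 2D-only check would reject.
print(audio_feature.ndim, audio_feature.shape)  # 1 (192,)
```

If the query prints (1, 192) but the database array has `ndim == 1`, the reshape on the query is fine and the database side is the one that needs a second dimension.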

yeyupiaoling commented 1 year ago

The voiceprint database must contain at least 2 audio files.
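For readers who want to see why a single entry is a special case, here is a minimal sketch of a cosine-similarity lookup that tolerates a one-entry database. It is not the project's code: `cosine_scores` is a hypothetical helper written in plain NumPy, and it assumes the database is either one 1D embedding or a 2D matrix with one embedding per row.

```python
import numpy as np

def cosine_scores(db_features, query):
    """Cosine similarity between each database row and one query vector.

    Hypothetical helper: accepts a 1D array (one entry) or a 2D array
    (one entry per row) and always returns a 1D array of scores.
    """
    db = np.atleast_2d(np.asarray(db_features, dtype=np.float64))  # (n, d)
    q = np.asarray(query, dtype=np.float64).reshape(-1)            # (d,)
    # Guard against division by zero for all-zero vectors.
    db_norm = np.maximum(np.linalg.norm(db, axis=1), 1e-12)
    q_norm = max(np.linalg.norm(q), 1e-12)
    return (db @ q) / (db_norm * q_norm)

vec = np.arange(1.0, 193.0)                  # one 192-dimensional embedding
single = cosine_scores(vec, vec)             # database with one 1D entry
pair = cosine_scores(np.stack([vec, -vec]), vec)
print(single.shape, pair.shape)              # (1,) (2,)
```

`np.atleast_2d` promotes a lone 1D embedding to a one-row matrix, so the same code path handles one entry or many. Returning a 1D array also avoids calling `.squeeze()` on the result, which would collapse a one-element array to a scalar.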