Problem during testing - Githubissues

dongzhenguo2016 commented 4 years ago

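Hi, I downloaded the train-clean-100 dataset and kept only the data for three speakers: 254, 289, and 298. Each speaker has roughly 120 utterances, so I split them into training and test sets at a ratio of train:test = 4:1. I then ran train.py on the training data for these three speakers. Your program originally uses a while loop; I added a break inside it that triggers once the train loss drops below 1. Training finished quickly without any errors. But when I ran test_model.py on the test data for these three speakers, it failed with the error below. Do you know what is going on?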

Found checkpoint [checkpoints/model_28000_4.25404.h5]. Resume from here...
Found 0000064 files with 00003 different speakers.
Traceback (most recent call last):
  File "test_model.py", line 177, in <module>
    fm, tpr, acc, eer = eval_model(model, check_partial=True,gru_model=gru_model)
  File "test_model.py", line 115, in eval_model
    x, y_true = create_test_data(test_dir,check_partial)
  File "test_model.py", line 75, in create_test_data
    negative_files = libri[libri['speaker_id'] != unique_speakers[ii]].sample(n=num_neg, replace=False)
  File "/home/hdc/.local/lib/python3.7/site-packages/pandas/core/generic.py", line 4865, in sample
    locs = rs.choice(axis_length, size=n, replace=replace, p=weights)
  File "mtrand.pyx", line 1168, in mtrand.RandomState.choice
ValueError: Cannot take a larger sample than population when 'replace=False'

Thank you.

Walleclipse commented 4 years ago

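1) Your test set is too small, quite possibly less than three speakers' worth of data. As a result, when sampling, the number of utterances from other speakers is smaller than num_neg, so num_neg distinct negatives cannot be drawn. I strongly recommend adding more data.
2) If you do not want to add more data, you can try changing replace to True in the statement that raises the error, at line 58 of test_model.py: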
negative_files = libri[libri['speaker_id'] != unique_speakers[ii]].sample(n=num_neg, replace=True)
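
To illustrate both suggestions, here is a minimal sketch that reproduces the error on a toy table and falls back to sampling with replacement only when the pool of other-speaker utterances is too small. The helper name sample_negatives, the filename column, and the toy data are assumptions made for illustration; they are not taken from test_model.py. Only the speaker_id column and the sample(n=..., replace=...) call come from the traceback above.

```python
import pandas as pd


def sample_negatives(libri, speaker, num_neg, seed=None):
    """Pick num_neg rows whose speaker_id differs from `speaker`.

    Hypothetical helper: samples without replacement when enough rows
    exist, otherwise falls back to sampling with replacement, in which
    case the same utterance may appear more than once.
    """
    pool = libri[libri['speaker_id'] != speaker]
    if pool.empty:
        raise ValueError('no utterances from other speakers to sample from')
    need_replace = len(pool) < num_neg
    return pool.sample(n=num_neg, replace=need_replace, random_state=seed)


# Toy stand-in for the test-set table: 3 speakers, 2 files each.
libri = pd.DataFrame({
    'speaker_id': ['254', '254', '289', '289', '298', '298'],
    'filename': ['a.wav', 'b.wav', 'c.wav', 'd.wav', 'e.wav', 'f.wav'],
})

# Only 4 rows belong to other speakers, so asking for 10 distinct rows fails.
try:
    libri[libri['speaker_id'] != '254'].sample(n=10, replace=False)
except ValueError as err:
    print('replace=False:', err)

# With the fallback, 10 rows come back, drawn from at most 4 distinct files.
negatives = sample_negatives(libri, '254', num_neg=10, seed=0)
print(len(negatives), negatives['filename'].nunique())
```

Note that sampling with replacement repeats negative utterances, so the evaluation metrics are computed on fewer distinct comparisons than num_neg suggests. That is why adding more test data remains the preferable fix.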

dongzhenguo2016 commented 4 years ago

Thank you. I followed your suggestion and it works now. Best wishes for your studies.

Walleclipse / Deep_Speaker-speaker_recognition_system

Problem during testing #42