Walleclipse / Deep_Speaker-speaker_recognition_system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
Problems encountered when experimenting with the aishell dataset #46

Closed: LittleMaWen closed this issue 2 years ago

LittleMaWen commented 4 years ago

Sorry to bother you again! Last time I ran the experiment on the full train-clean-100 dataset, and now I want to run the program on the aishell Chinese corpus to see the results. The data has already been preprocessed following the audio samples in the code, but python train.py fails with the error below. Do you know the cause?

Found 0120418 files with 120418 different speakers.
Traceback (most recent call last):
  File "train.py", line 189, in <module>
    main()
  File "train.py", line 71, in main
    batch = stochastic_mini_batch(libri, batch_size=c.BATCH_SIZE, unique_speakers=unique_speakers)
  File "/home/dcase/mawen/SW/aishell/Deep_Speaker-speaker_recognition_system-master/random_batch.py", line 89, in stochastic_mini_batch
    mini_batch = MiniBatch(libri, batch_size,unique_speakers)
  File "/home/dcase/mawen/SW/aishell/Deep_Speaker-speaker_recognition_system-master/random_batch.py", line 45, in __init__
    two_different_speakers = np.random.choice(unique_speakers, size=2, replace=False)
  File "mtrand.pyx", line 1125, in mtrand.RandomState.choice
ValueError: 'a' cannot be empty unless no samples are taken

Walleclipse commented 4 years ago

Hello, the unique_speakers you get may be empty; please check it. You can take a look here or here.

LittleMaWen commented 4 years ago

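That is not the problem. I found the answer: it is the naming of the audio files in aishell. In the dataset you provide, each speaker is correctly matched to their utterances, but the file names produced by preprocessing aishell apparently do not let the code tell which utterances belong to which speaker. As the line "Found 0120418 files with 120418 different speakers." shows, every file is treated as its own speaker with a single utterance. I renamed roughly 500 utterances from three speakers to follow the naming format of your dataset, trained on those three speakers, and the error went away.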
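The naming issue described above can be checked before training with a short script. This is only a sketch under two assumptions that are not confirmed in this thread: that the speaker ID is taken as the part of the file name before the first "-" (as in LibriSpeech-style names such as 19-198-0001), and that AISHELL-1 utterance IDs look like BAC009S0002W0122, where S0002 is the speaker. The helper names speaker_id, librispeech_style_name and utterances_per_speaker are hypothetical and not part of the repository.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: the speaker ID is the part of the file name before the first "-",
# as in LibriSpeech-style names such as "19-198-0001.npy".
def speaker_id(filename):
    return Path(filename).name.split("-")[0]

# Assumption: AISHELL-1 utterance IDs look like "BAC009S0002W0122",
# where "S0002" identifies the speaker and "W0122" the utterance.
AISHELL_NAME = re.compile(r"^BAC\d+(S\d+)(W\d+)$")

def librispeech_style_name(filename):
    """Turn "BAC009S0002W0122.npy" into "S0002-W0122.npy"; leave other names alone."""
    path = Path(filename)
    match = AISHELL_NAME.match(path.stem)
    if match is None:
        return path.name
    return f"{match.group(1)}-{match.group(2)}{path.suffix}"

def utterances_per_speaker(filenames):
    """Count how many files map to each derived speaker ID."""
    return Counter(speaker_id(name) for name in filenames)

names = ["BAC009S0002W0122.npy", "BAC009S0002W0123.npy", "BAC009S0003W0121.npy"]
renamed = [librispeech_style_name(name) for name in names]

# With the original names every file becomes its own "speaker".
print(utterances_per_speaker(names))
# After renaming, utterances are grouped under S0002 and S0003.
print(utterances_per_speaker(renamed))
```

If the number of derived speakers equals the number of files, as in the "Found 0120418 files with 120418 different speakers." line, the file names most likely need to be adapted before training.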

yaoyao1206 commented 3 years ago

Hi, I ran into the same problem as you. Could you tell me specifically how you changed the naming format of the dataset?

Chloe-qiuyu commented 3 years ago

Hello! I also preprocessed the aishell dataset and ran into this problem during training. A lot of searching on Baidu has not solved it :( Could you tell me where the problem comes from?

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/2f592ba9440443f8863ede3d2d2b4927/zqy/Deep_Speaker-speaker_recognition_system-master/select_batch.py", line 98, in addstack
    feature, labels = preprocess(unique_speakers, spk_utt_dict, candidates)
  File "/2f592ba9440443f8863ede3d2d2b4927/zqy/Deep_Speaker-speaker_recognition_system-master/select_batch.py", line 73, in preprocess
    x = np.load(file)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 453, in load
    pickle_kwargs=pickle_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py", line 785, in read_array
    array.shape = shape
ValueError: cannot reshape array of size 0 into shape (419,64,1)

With the English dataset there are no problems at all.
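The "cannot reshape array of size 0" error comes from np.load itself, which suggests that at least one preprocessed .npy file is empty or was cut off while being written. A minimal sketch for locating such files is shown here; the function name find_unreadable_npy and the directory passed to it are hypothetical, and it only assumes that the features are stored as .npy files somewhere under one root folder.

```python
from pathlib import Path

import numpy as np

def find_unreadable_npy(root):
    """Return (path, reason) pairs for .npy files that are empty or fail to load."""
    bad = []
    for path in sorted(Path(root).rglob("*.npy")):
        # A zero-byte file cannot contain a valid .npy header.
        if path.stat().st_size == 0:
            bad.append((path, "empty file"))
            continue
        try:
            array = np.load(path)
        except (ValueError, EOFError, OSError) as err:
            # A truncated file typically fails here with a reshape error.
            bad.append((path, str(err)))
            continue
        if array.size == 0:
            bad.append((path, "array has no elements"))
    return bad

if __name__ == "__main__":
    # Hypothetical location of the preprocessed features; adjust as needed.
    for path, reason in find_unreadable_npy("audio/aishell-npy"):
        print(f"{path}: {reason}")
```

Any file reported this way would need to be regenerated or removed before training. Whether this is the actual cause in this thread is not confirmed.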

Walleclipse commented 3 years ago

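This is probably a problem with the data folder. You need to change the folder to the same layout as the Librispeech folder, that is, each speaker must be correctly matched to their utterances. Please look at audio/LibriSpeechSamples/train-clean-100. The first level is the speaker, for example 19, 26, 27. Inside each speaker folder are the utterances that belong to that speaker, for example, for speaker 19: 19/198.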
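To check whether a converted dataset follows this speaker-first layout, a small script can group the audio files by their top-level folder. This is a sketch, not code from the repository: the function name utterances_by_speaker_folder, the "*.wav" pattern and the example root path are assumptions.

```python
from collections import defaultdict
from pathlib import Path

def utterances_by_speaker_folder(root, pattern="*.wav"):
    """Map each top-level speaker folder under root to the audio files inside it."""
    root = Path(root)
    by_speaker = defaultdict(list)
    for path in sorted(root.rglob(pattern)):
        relative = path.relative_to(root)
        if len(relative.parts) < 2:
            # The file sits directly in root, not inside a speaker folder.
            continue
        by_speaker[relative.parts[0]].append(path)
    return dict(by_speaker)

if __name__ == "__main__":
    # Hypothetical root of the converted aishell data; adjust as needed.
    groups = utterances_by_speaker_folder("audio/aishell/train")
    for speaker, files in groups.items():
        print(speaker, len(files))
```

Each speaker folder should list more than one utterance. A result with one file per speaker, or with no speaker folders at all, points to the layout problem described above.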