Walleclipse / Deep_Speaker-speaker_recognition_system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
245 stars, 81 forks

Model training hangs #66

Closed. ZengHorace closed this issue 3 years ago

ZengHorace commented 4 years ago

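Hi, I have tried training the model several times. Each time, after a few hundred steps, it gets stuck at "beginning to select..........", that is, while selecting the best batch, and makes no further progress. What could cause this? Have you run into it before? I am using my own dataset. Training does not hang from the start; it only begins to hang after a few hundred steps.

    beginning to select..........
    select best batch time 0.0702s
    select_batch_time: 0.7750017642974854
    2020-04-28 15:55:15,745 [INFO] train.py/main | == Presenting step #348
    2020-04-28 15:55:16,176 [INFO] train.py/main | == Processed in 0.43s by the network, training loss = 0.7954046726226807.
    get batch time 5.48e-06s
    forward process time 0.684s
    beginning to select..........

It stops here.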

ZengHorace commented 4 years ago
    while True:
        speaker = anh_speakers[ii]
        inds = anchs_index_dict[speaker]
        np.random.shuffle(inds)
        anchor_index = inds[0]
        pinds = []
        for jj in range(1,len(inds)):
            if (hist_features[anchor_index] == hist_features[inds[jj]]).all():
                continue
            pinds.append(inds[jj])

        if len(pinds) >= 1:
            break

My understanding is that it gets stuck in an infinite loop here. Why would this loop never terminate?

Walleclipse commented 4 years ago

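Hello. If you are using the sample data I provided, there is very little of it, so the selection may not find enough pinds, and the loop never exits. The fix is to use the full LibriSpeech dataset. If you are already using all of the data, then select batch itself may simply be running very slowly. See issue 37 for reference.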
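The failure mode described above can be made visible by bounding the retry loop so that it raises an error instead of spinning forever. The following is only a sketch under stated assumptions: the helper name `pick_anchor_and_positives` and the `max_tries` parameter are hypothetical and not part of the repository, and `hist_features` is assumed to be indexable by utterance index and to return NumPy arrays.

```python
import numpy as np

def pick_anchor_and_positives(inds, hist_features, max_tries=10, rng=None):
    """Hypothetical helper: pick an anchor for one speaker plus its distinct positives.

    Mirrors the loop quoted in the thread, but gives up after `max_tries`
    shuffles instead of looping forever.
    """
    rng = rng or np.random.default_rng()
    for _ in range(max_tries):
        order = rng.permutation(np.asarray(inds))
        anchor_index = int(order[0])
        # Keep only utterances whose features differ from the anchor's.
        pinds = [int(j) for j in order[1:]
                 if not np.array_equal(hist_features[anchor_index], hist_features[j])]
        if pinds:
            return anchor_index, pinds
    raise RuntimeError(
        "no positive utterance differs from the anchor; "
        "the speaker has too few utterances or duplicated features"
    )
```

With such a guard, a speaker whose utterances all share identical features produces an explicit error that names the problem, rather than a silent hang at "beginning to select".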

ZengHorace commented 4 years ago
        speaker = anh_speakers[ii]
        inds = anchs_index_dict[speaker]
        np.random.shuffle(inds)
        anchor_index = inds[0]
        pinds = []
        for jj in range(1,len(inds)):
            if (hist_features[anchor_index] == hist_features[inds[jj]]).all():
                continue 
            pinds.append(inds[jj])

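As I understand this block of code, it picks a speaker, fetches several utterance features for that speaker, and then checks whether the first utterance's features are equal to each of the others. I would expect them never to be equal, so how can this end up in an infinite loop?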

ZengHorace commented 4 years ago

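The features of different utterances from the same person should never be exactly equal, since there are 512 numbers in total. Is there something I am misunderstanding?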

ZengHorace commented 4 years ago

    if (hist_features[anchor_index] == hist_features[inds[jj]]).all():

What is the purpose of this line of code?

ZengHorace commented 4 years ago

I found the cause: the dataset was not clean. The saved .npy files contain nothing but zeros. Thank you for your help.
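A check along the following lines could catch this kind of problem before training starts. It is a minimal sketch, assuming the features are stored as plain numeric arrays in `.npy` files under a single directory; the function name `find_all_zero_npy` is hypothetical and not part of the repository.

```python
from pathlib import Path

import numpy as np

def find_all_zero_npy(root):
    """Return paths of .npy files under `root` whose arrays are empty or all zeros."""
    bad = []
    for path in sorted(Path(root).rglob("*.npy")):
        arr = np.load(path)  # assumes plain numeric arrays, not pickled objects
        if arr.size == 0 or not np.any(arr):
            bad.append(path)
    return bad
```

Calling `find_all_zero_npy("path/to/features")` (the path is a placeholder) returns the files to regenerate or exclude from training.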

Walleclipse commented 4 years ago

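Regarding the question about this line:

    if (hist_features[anchor_index] == hist_features[inds[jj]]).all():

It checks whether the already-selected anchor utterance and the next utterance are exactly identical. If they are, they are the same audio segment, so it is skipped.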
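To illustrate how this comparison behaves, here is a small standalone example. It assumes each feature is a 1-D NumPy array of 512 values, as mentioned earlier in the thread; all variable names are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

anchor = rng.random(512)
same_utterance = anchor.copy()        # identical features: treated as a duplicate
other_utterance = anchor.copy()
other_utterance[0] += 1.0             # differs in one element: a valid positive

is_dup_same = bool((anchor == same_utterance).all())    # True
is_dup_other = bool((anchor == other_utterance).all())  # False

# Two different files that both contain only zeros also compare as identical,
# so neither can serve as a positive for the other.
zeros_a = np.zeros(512)
zeros_b = np.zeros(512)
is_dup_zeros = bool((zeros_a == zeros_b).all())         # True

print(is_dup_same, is_dup_other, is_dup_zeros)
```

The last case matches the situation reported above: when every saved feature for a speaker is all zeros, every candidate is skipped as a duplicate, pinds stays empty, and the surrounding while loop never breaks.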