Model training hangs - Githubissues

Walleclipse / Deep_Speaker-speaker_recognition_system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)


Model training hangs #66

Closed ZengHorace closed 3 years ago

ZengHorace commented 4 years ago

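Hi, I have tried training the model several times. Each time, after a few hundred steps, it gets stuck at "beginning to select..........", that is, while selecting the best batch, and makes no further progress. What could be the cause? Have you run into this before? I am using my own dataset. Training does not behave this way from the start; it only happens after a few hundred steps.

    beginning to select..........
    select best batch time 0.0702s
    select_batch_time: 0.7750017642974854
    2020-04-28 15:55:15,745 [INFO] train.py/main | == Presenting step #348
    2020-04-28 15:55:16,176 [INFO] train.py/main | == Processed in 0.43s by the network, training loss = 0.7954046726226807.
    get batch time 5.48e-06s
    forward process time 0.684s
    beginning to select..........

It hangs at this point.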

ZengHorace commented 4 years ago

    while True:
        speaker = anh_speakers[ii]
        inds = anchs_index_dict[speaker]
        np.random.shuffle(inds)
        anchor_index = inds[0]
        pinds = []
        for jj in range(1,len(inds)):
            if (hist_features[anchor_index] == hist_features[inds[jj]]).all():
                continue
            pinds.append(inds[jj])

        if len(pinds) >= 1:
            break

My understanding is that it gets stuck in an infinite loop here. Why would this loop never terminate?

Walleclipse commented 4 years ago

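Hi. If you are using the sample data I provided, there is very little of it, so the selection step may not find enough pinds, and the loop then never terminates. The fix is to use the full LibriSpeech dataset. If you are already using the full data, then select batch itself may simply be running very slowly. See issue 37 for reference.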
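One way to keep the selection step from hanging silently is to bound the number of retries and fail with a clear message. The following is a minimal sketch, not the repository's code: `pick_positive_candidates` and `max_tries` are hypothetical names, and it assumes `hist_features` can be indexed by the integers in `inds`.

```python
import numpy as np

def pick_positive_candidates(inds, hist_features, max_tries=10, rng=None):
    """Pick an anchor and its positive candidates for one speaker.

    Hypothetical helper: mirrors the quoted loop, but gives up after
    `max_tries` shuffles instead of looping forever.
    """
    rng = np.random.default_rng() if rng is None else rng
    inds = np.asarray(inds)
    for _ in range(max_tries):
        shuffled = rng.permutation(inds)  # shuffled copy, input is left untouched
        anchor_index = int(shuffled[0])
        # Keep only utterances whose features differ from the anchor's.
        pinds = [
            int(j) for j in shuffled[1:]
            if not np.array_equal(hist_features[anchor_index], hist_features[int(j)])
        ]
        if pinds:
            return anchor_index, pinds
    raise ValueError(
        "no positive candidate differs from the anchor; "
        "check for too few utterances or duplicated/all-zero features"
    )
```

Raising an error turns a silent hang into a message that points at the data, which is usually where the problem lies.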

ZengHorace commented 4 years ago

        speaker = anh_speakers[ii]
        inds = anchs_index_dict[speaker]
        np.random.shuffle(inds)
        anchor_index = inds[0]
        pinds = []
        for jj in range(1,len(inds)):
            if (hist_features[anchor_index] == hist_features[inds[jj]]).all():
                continue 
            pinds.append(inds[jj])

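As I understand this block, it picks a speaker, gets several utterance features for that speaker, and then checks whether the first utterance's features are equal to each of the others. I don't think any of them would ever be equal, so how can it end up in an infinite loop?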

ZengHorace commented 4 years ago

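No two utterances from the same speaker should have exactly identical features; each one has 512 numbers in total. Am I misunderstanding something?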

ZengHorace commented 4 years ago

if (hist_features[anchor_index] == hist_features[inds[jj]]).all(): What is the purpose of this line of code?

ZengHorace commented 4 years ago

Found the cause: the dataset was not clean. The saved .npy files contained nothing but zeros. Thank you!
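A quick scan of the feature directory can catch this kind of problem before training starts. The sketch below is an assumption-laden example rather than part of the repository: `find_degenerate_npy` is a hypothetical helper, and it assumes the features are stored as plain numeric `.npy` arrays somewhere under a single root directory.

```python
import numpy as np
from pathlib import Path

def find_degenerate_npy(root):
    """Return (path, reason) pairs for .npy files that look unusable."""
    bad = []
    for path in sorted(Path(root).rglob("*.npy")):
        try:
            arr = np.load(path, allow_pickle=False)
        except (OSError, ValueError, EOFError) as exc:
            bad.append((path, f"unreadable: {exc}"))
            continue
        if not np.issubdtype(arr.dtype, np.number):
            bad.append((path, "non-numeric"))
        elif arr.size == 0:
            bad.append((path, "empty"))
        elif not np.any(arr):  # every element is exactly zero
            bad.append((path, "all zeros"))
    return bad
```

Calling `find_degenerate_npy("path/to/features")` (the path here is a placeholder) and removing or regenerating the reported files should prevent identical all-zero features from reaching batch selection.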

Walleclipse commented 4 years ago

if (hist_features[anchor_index] == hist_features[inds[jj]]).all(): What is the purpose of this line of code?

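It checks whether the already-selected anchor utterance and the next utterance are exactly identical. If they are, they are the same recording, so that utterance is skipped.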
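To illustrate how that comparison behaves, here is a small standalone sketch with made-up vectors (the values are illustrative only; 512 is the feature length mentioned earlier in the thread). It also shows why all-zero features make every utterance look like a duplicate of the anchor.

```python
import numpy as np

anchor = np.array([0.1, 0.2, 0.3])
same = anchor.copy()                 # an identical utterance
other = np.array([0.1, 0.2, 0.4])    # differs in one element

# Element-wise ==, then .all(): True only if every element matches.
is_same = bool((anchor == same).all())     # identical, so it would be skipped
is_other = bool((anchor == other).all())   # different, so it would be kept

# Two all-zero feature vectors always compare as identical,
# so no positive candidate is ever collected for such data.
zeros_equal = bool((np.zeros(512) == np.zeros(512)).all())

print(is_same, is_other, zeros_equal)
```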