Hi @astorfi, I have some questions about the input dataset.
According to the paper, the number of speakers in the development phase is 511, but how long is the input audio file per speaker?
Also, although there is a CMVN preprocessing function in input_feature.py, I'm not sure whether CMVN preprocessing is appropriate for the output of the speechpy.feature.lmfe function. Did you use CMVN preprocessing in the experiments of the paper?
Thank you for your work!!
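For context, a minimal sketch of the preprocessing being discussed, assuming a 16 kHz mono signal; `speechpy.feature.lmfe` and `speechpy.processing.cmvn` are actual speechpy calls, but the frame parameters here are illustrative assumptions, not necessarily the repo's settings:

```python
import numpy as np
import speechpy

# Assumptions: 16 kHz audio, 25 ms frames, 10 ms stride, 40 log-mel filters.
fs = 16000
signal = np.random.randn(int(0.81 * fs))  # stand-in for a real 0.81 s utterance

# Log-mel filterbank energies: one (num_frames, 40) matrix per utterance.
features = speechpy.feature.lmfe(signal, sampling_frequency=fs,
                                 frame_length=0.025, frame_stride=0.01,
                                 num_filters=40)
print(features.shape)  # roughly (80, 40); exact frame count depends on padding

# CMVN subtracts the per-coefficient mean over time (optionally normalizing
# variance), so it applies to any (frames, features) matrix, lmfe output included.
normalized = speechpy.processing.cmvn(features, variance_normalization=True)
```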
@ku2482 Thanks for your question.
Regarding your questions:
@astorfi Thanks for answering.
I think every 0.81-second audio file results in an (80, 40) feature, and you concatenate 20 of them to make a (20, 80, 40) feature for the development phase. Is that right? Also, I don't know how many (20, 80, 40) features per speaker you use in the paper. Do you use just one (20, 80, 40) feature per speaker, making the dataset shaped (511, 20, 80, 40)?
Anyway, I appreciate your work and kindness.
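As a shape check on that description, a small numpy sketch (the 511 speakers, 20 utterances, and (80, 40) per-utterance features come from the thread; the arrays here are placeholders):

```python
import numpy as np

num_speakers = 511
num_utterances = 20      # utterances stacked into one feature cube
frames, coeffs = 80, 40  # per-utterance lmfe output shape, per the thread

# One (20, 80, 40) cube: stack twenty per-utterance (80, 40) matrices.
utterance_features = [np.zeros((frames, coeffs)) for _ in range(num_utterances)]
cube = np.stack(utterance_features)        # -> (20, 80, 40)

# One cube per speaker would give the (511, 20, 80, 40) dataset asked about.
dataset = np.stack([cube] * num_speakers)  # -> (511, 20, 80, 40)
print(cube.shape, dataset.shape)
```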
@ku2482 Yes, that's quite correct.
For the second part, (20, 80, 40) features are fed to the network. "20" is the number of spoken utterances for the speaker. However, there is no restriction on the number of (20, 80, 40) features for any speaker. The rule of thumb is "the more, the better" for background model generation. You can use 20 spoken utterances chosen at random for data augmentation (although they all need to belong to the same speaker).
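A hedged sketch of that augmentation idea: draw several random 20-utterance subsets per speaker, each yielding its own (20, 80, 40) cube. `utterances_per_speaker` and `cubes_per_speaker` are illustrative values, not numbers from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
frames, coeffs = 80, 40
utterances_per_speaker = 100  # illustrative: total (80, 40) features one speaker has
cubes_per_speaker = 5         # illustrative: "no restriction" per the answer above

# Stand-in for one speaker's extracted lmfe features.
speaker_features = rng.standard_normal((utterances_per_speaker, frames, coeffs))

cubes = []
for _ in range(cubes_per_speaker):
    # Pick 20 utterances at random -- all from the same speaker.
    idx = rng.choice(utterances_per_speaker, size=20, replace=False)
    cubes.append(speaker_features[idx])  # each cube: (20, 80, 40)

cubes = np.stack(cubes)  # (5, 20, 80, 40) worth of training cubes for one speaker
print(cubes.shape)
```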
@astorfi Thank you so much!!
All of my questions are actually solved now, and I can understand your script. Your work is really great!!
I'll close this issue. Again, thank you!!