astorfi / 3D-convolutional-speaker-recognition

:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
Apache License 2.0

explain how to take a single wav file and extract features #13

Closed alanbekker closed 6 years ago

alanbekker commented 6 years ago

I've read your paper and it's really impressive.

I would like to ask you about the input preprocessing:

Assume I've got a wav file of about 0.8 sec: `fs, signal = wav.read(file_name)`. Then I use `mfec = speechpy.feature.mfe(signal, fs)`; the size of `mfec` is [79, 40], so I changed the input file to be 0.81 sec and then I received [80, 40]...

According to your paper I need [20, 80, 40] to create one training example, so I can create this either by duplicating my original [80, 40] 20 times (this is how you did it at the testing phase) or by concatenating 20 different utterances of 0.81 sec each. Is that correct?

Any clarifications would be appreciated!

Alan

astorfi commented 6 years ago

@alanbekker Thank you so much for your kind words. I believe your understanding is quite accurate. I just said 0.8 seconds in the paper, but as you noted it's really closer to 0.81 seconds. That's exactly how I do it: concatenating different utterances for training.
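The two options discussed above (stack 20 different utterances for training, or duplicate one utterance 20 times for testing) can be sketched with NumPy. This is only an illustration of the shape handling: the `scipy.io.wavfile` and `speechpy.feature.mfe` calls are shown as comments, and `fake_mfe_features` is a hypothetical stand-in for the real feature extraction.

```python
import numpy as np

# In the real pipeline, each utterance's features would come from something like:
#   fs, signal = scipy.io.wavfile.read(file_name)
#   mfec = speechpy.feature.mfe(signal, fs)   # -> roughly [80, 40] for ~0.81 sec
# Here we simulate one utterance's [80, 40] feature matrix with random values.
def fake_mfe_features(num_frames=80, num_filters=40):
    return np.random.randn(num_frames, num_filters).astype(np.float32)

# Option 1 (training): stack 20 *different* utterances into one [20, 80, 40] cube.
utterances = [fake_mfe_features() for _ in range(20)]
train_example = np.stack(utterances, axis=0)
print(train_example.shape)  # (20, 80, 40)

# Option 2 (testing): duplicate a *single* utterance 20 times.
single = fake_mfe_features()
test_example = np.tile(single[np.newaxis, :, :], (20, 1, 1))
print(test_example.shape)  # (20, 80, 40)
```

Either way the network sees one [20, 80, 40] input; the difference is only whether the 20 slices come from distinct utterances (training) or copies of one (testing).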