Closed jumfina closed 6 years ago
extract_feature() produces a matrix of dimensions 399 by 40. As I understand it, 40 is the 20 MFCC + 20 delta MFCC features for a single speaker file. Could you say what the 399 dimension contains?
399 should be your number of frames. The audio is split into short frames, and you get 20 MFCC + 20 delta MFCC features from each frame, so each of the 399 rows is the 40-dimensional feature vector for one frame.
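To make the frame count concrete, here is a minimal sketch of how a row count like 399 can arise. The settings are assumptions, not taken from this repo: a 16 kHz sample rate, a 25 ms window, a 10 ms step, and a padded final frame. The `num_frames` helper is hypothetical and only does the arithmetic; it does not compute any MFCCs.

```python
import math

def num_frames(num_samples, sample_rate, win_len=0.025, win_step=0.010):
    """Count the frames produced by sliding a window over a signal.

    Assumed (hypothetical) defaults: 25 ms window, 10 ms step, and the
    last partial frame is zero-padded rather than dropped.
    """
    frame_len = int(round(win_len * sample_rate))    # samples per frame
    frame_step = int(round(win_step * sample_rate))  # samples between frame starts
    if num_samples <= frame_len:
        return 1  # the whole signal fits in a single (padded) frame
    return 1 + math.ceil((num_samples - frame_len) / frame_step)

# A 4-second clip at 16 kHz, under the assumptions above.
rows = num_frames(4 * 16000, 16000)
cols = 20 + 20  # 20 MFCC + 20 delta MFCC per frame
print((rows, cols))
```

Under these assumptions a clip of about 4 seconds gives 399 rows, so the first dimension grows with the length of the audio while the second stays fixed at 40.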