The normalisation of Fbank?

Walleclipse / Deep_Speaker-speaker_recognition_system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)


The normalisation of Fbank? #48

Closed: LeoniusChen closed this 3 years ago

LeoniusChen commented 4 years ago

The Fbank features are normalised along the feature dimension rather than the frame dimension, which differs from the definition. This confuses me. Could you explain why?

Walleclipse commented 4 years ago

Sorry, I am not an expert in this field, so I do not know which is better.
I only know that, in speech processing, Fbank features are generally normalised along the feature dimension. You can refer to the following discussions: "How to normalize MFCCs"; "Effect of MFCC normalization on vector quantization based speaker identification"; "Feature Normalisation for Robust Speech Recognition".
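
To make the two options discussed above concrete, here is a minimal NumPy sketch of both. It is an illustration only, not the repository's code: the function names `normalize_per_frame` and `normalize_per_feature`, the `(frames, bins)` array layout, the `eps` guard, and the synthetic input are all assumptions. Which of the two the thread means by "feature dimension" is not settled by the comments above, so both are shown.

```python
import numpy as np

def normalize_per_frame(fbank, eps=1e-12):
    """Normalise each frame (row) with the mean/std taken over its filterbank bins."""
    mean = fbank.mean(axis=1, keepdims=True)
    std = fbank.std(axis=1, keepdims=True)
    return (fbank - mean) / np.maximum(std, eps)

def normalize_per_feature(fbank, eps=1e-12):
    """Normalise each filterbank bin (column) with the mean/std taken over all frames."""
    mean = fbank.mean(axis=0, keepdims=True)
    std = fbank.std(axis=0, keepdims=True)
    return (fbank - mean) / np.maximum(std, eps)

# Synthetic stand-in for real Fbank features: 100 frames x 64 bins (assumed layout).
rng = np.random.default_rng(0)
fbank = rng.normal(loc=5.0, scale=2.0, size=(100, 64))

per_frame = normalize_per_frame(fbank)      # every row has zero mean, unit variance
per_feature = normalize_per_feature(fbank)  # every column has zero mean, unit variance
```

The only difference is the axis over which the statistics are computed: `axis=1` removes the overall level of each frame, while `axis=0` removes a per-bin offset that is constant across the utterance, which is what cepstral mean and variance normalisation does.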