Open cqjjjzr opened 5 years ago
If each sample file has very different noise characteristics and high noise energy, the mean and variance computed from each file can depend on the noise signal rather than the speech signal. However, the purpose of VAD is to exploit the statistical characteristics of the speech signal, and the global mean and variance are likely to reflect the speech signal rather than the noise, since severe noise conditions are not frequent.
It is not a mistake, as we cannot compute the global mean and variance from the test dataset. However, if you normalize each sample file with its own local mean and variance, you can also use the local mean and variance of the test file if you want.
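To make the two options concrete, here is a minimal sketch in plain Python. It is not the code from this repository: the helper names `mean_std` and `normalize` and the toy feature values are hypothetical, chosen only to show the difference between pooling statistics over the training set and computing them per file.

```python
from statistics import fmean, pstdev


def mean_std(frames):
    """Per-dimension mean and standard deviation over a list of feature frames."""
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    # Fall back to 1.0 when a dimension is constant, to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def normalize(frames, means, stds):
    """Subtract the mean and divide by the standard deviation, per dimension."""
    return [[(x - m) / s for x, m, s in zip(frame, means, stds)]
            for frame in frames]


# Hypothetical toy data: each file is a list of 2-dimensional feature frames.
train_files = [
    [[1.0, 10.0], [3.0, 30.0]],
    [[5.0, 50.0], [7.0, 70.0]],
]
test_file = [[2.0, 20.0], [6.0, 60.0]]

# Global: statistics pooled over all training frames, then reused at test time.
all_train_frames = [frame for f in train_files for frame in f]
global_means, global_stds = mean_std(all_train_frames)
test_global = normalize(test_file, global_means, global_stds)

# Local: statistics computed from the very file being normalized.
local_means, local_stds = mean_std(test_file)
test_local = normalize(test_file, local_means, local_stds)
```

With global statistics the test file is scaled by numbers it never contributed to, so a file dominated by noise cannot shift them. With local statistics every file ends up with zero mean and unit variance, whatever it contains.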
Thanks for your reply!
One more question: when the program is used in a production environment, is there any difference between using the local mean and variance of each input file and using the global training mean and standard deviation? If so, which should I choose?
Hi Kim,
I apologize for bothering you so many times, but I have trouble understanding your normalization code. I found some code in acoustic_feat_ex.m and in every data_reader_XXX.py.
My questions are:
... acoustic_feat_ex.m? Why not calculate the factor for every single training file and apply normalization to it?
... data_reader_XXX.py ...s are also used during prediction)? Is this a mistake?
Thanks in advance!
Charlie Jiang