Closed: liuyoude closed this issue 2 years ago
Hi there,
It is `*2`, not `**2`. We just normalized the input with a smaller std.
If you don't use our AudioSet pretrained model, it is fine to use 0 mean and 1 std (i.e., `fbank = (fbank - self.norm_mean) / (self.norm_std)`). Otherwise please keep the normalization consistent with us.
-Yuan
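To make the difference concrete, here is a minimal sketch of what dividing by `std * 2` does compared with dividing by `std` alone. It is not the repository's code: the `normalize` helper, the `std_scale` parameter, and the toy values are all assumptions for illustration, and plain Python lists stand in for the real fbank tensor.

```python
import statistics


def normalize(values, mean, std, std_scale=2.0):
    """Shift by `mean`, then divide by `std * std_scale` (a product, not `std ** 2`)."""
    return [(v - mean) / (std * std_scale) for v in values]


# Hypothetical toy values standing in for log-mel filterbank features.
fbank = [-8.0, -6.0, -4.0, -2.0, 0.0]
mean = statistics.fmean(fbank)
std = statistics.pstdev(fbank)  # population standard deviation

# Dividing by std * 2 leaves the data with roughly half a unit of spread.
halved = normalize(fbank, mean, std)

# Dividing by std alone gives the usual 0 mean, 1 std normalization.
unit = normalize(fbank, mean, std, std_scale=1.0)

print(statistics.pstdev(halved))  # about 0.5
print(statistics.pstdev(unit))    # about 1.0
```

Either scaling is a valid normalization on its own. The point of the reply is that inputs fed to the pretrained model should be scaled the same way as during pretraining.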
Thank you!
Shouldn't the normalization at https://github.com/YuanGongND/ast/blob/102f0477099f83e04f6f2b30a498464b78bbaf46/src/dataloader.py#L191 divide by the standard deviation rather than the variance?