qiuqiangkong / audioset_tagging_cnn

MIT License

Why transpose(1, 3) before BatchNorm? #39

Open ado-ml opened 3 years ago

ado-ml commented 3 years ago

May I ask why transpose(1, 3) is applied before BN? Is it intended to do batch normalization for each frequency bin, and if so, what is the advantage of this? Thanks.

    x = x.transpose(1, 3)
    x = self.bn0(x)
    x = x.transpose(1, 3)

qiuqiangkong commented 3 years ago

That is equivalent to normalizing each frequency bin. For example, if there are 64 mel bins, then there are 64 mean values and 64 standard deviation values used for normalization.
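
For anyone who wants to check this, here is a minimal sketch, not the repository's actual model code. It assumes the input is laid out as (batch, 1, time, mel) and that bn0 is an nn.BatchNorm2d with one feature per mel bin; the tensor sizes and the names y and y_manual are made up for illustration. BatchNorm2d computes its statistics per entry of dimension 1, so swapping the mel axis into dimension 1 gives one mean and one variance per mel bin.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the repository)
batch_size, time_steps, mel_bins = 8, 100, 64

# Assumed input layout: (batch, channel=1, time, mel)
x = torch.randn(batch_size, 1, time_steps, mel_bins) * 3.0 + 5.0

# One (mean, variance) pair per mel bin
bn0 = nn.BatchNorm2d(mel_bins)
bn0.train()  # use batch statistics, as during training

y = x.transpose(1, 3)  # (batch, mel, time, 1): mel bins now sit on the channel axis
y = bn0(y)
y = y.transpose(1, 3)  # back to (batch, 1, time, mel)

# Manual per-bin normalization: statistics over every axis except mel
mean = x.mean(dim=(0, 1, 2), keepdim=True)
var = x.var(dim=(0, 1, 2), unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + bn0.eps)

print(bn0.running_mean.shape)                    # one running mean per mel bin
print(torch.allclose(y, y_manual, atol=1e-4))    # compares BN output with the manual version
```

Without the transpose, a BatchNorm2d over the single input channel would hold only one mean and one variance shared by all mel bins.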