I noticed that for the classification models of part 2, weights are initialized using `nn.init.kaiming_normal_` (He initialization). However, the biases (`p2ch14.model`, line 94) are initialized in a strange way, using `nn.init.normal_(m.bias, -bound, bound)`. Since the second and third arguments of `nn.init.normal_` are the mean and the standard deviation, this samples the biases from a Gaussian distribution with mean `-bound` and standard deviation `bound`, which I find hard to justify. I believe it is probably a leftover from a previous uniform initialization, where the same two arguments would have been the lower and upper limits.
Should lines 91-94 be replaced with `m.bias.data.fill_(.0)`?
The same also holds for `p2ch11.model` (lines 43-46), `p2ch12.model` (lines 43-46), and `p2ch13.model` (lines 41-44).
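To make the proposal concrete, here is a minimal sketch of what I have in mind. It is not the book's code: the helper name `init_weights` and its `bias_mode` argument are made up for illustration, and the `fan_in`-based bound is my assumption about what a uniform initialization would have looked like (as far as I know, it resembles what PyTorch does by default for `nn.Linear` and the conv layers).

```python
import math

import torch
from torch import nn


def init_weights(model: nn.Module, bias_mode: str = "zeros") -> None:
    """Hypothetical helper: He-normal weights plus a configurable bias init."""
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d, nn.Conv3d)):
            nn.init.kaiming_normal_(
                m.weight, a=0, mode="fan_out", nonlinearity="relu"
            )
            if m.bias is None:
                continue
            if bias_mode == "zeros":
                # The replacement suggested above: all biases start at 0.
                nn.init.zeros_(m.bias)
            elif bias_mode == "uniform":
                # Assumed alternative: uniform in [-bound, bound].
                # One output unit's weights have fan_in elements.
                fan_in = m.weight[0].numel()
                bound = 1 / math.sqrt(fan_in)
                nn.init.uniform_(m.bias, -bound, bound)
            else:
                raise ValueError(f"unknown bias_mode: {bias_mode!r}")


# Toy model, only used to exercise the helper (no forward pass is run).
model = nn.Sequential(nn.Conv3d(1, 4, kernel_size=3), nn.ReLU(), nn.Linear(4, 2))
init_weights(model, bias_mode="zeros")
```

With `bias_mode="zeros"` every bias starts at exactly zero. With `bias_mode="uniform"` the biases are symmetric around zero, unlike `nn.init.normal_(m.bias, -bound, bound)`, which shifts their mean to `-bound`.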