deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.
https://www.manning.com/books/deep-learning-with-pytorch

Strange bias initialization with He initialization #72

Open pietroventurini opened 3 years ago

pietroventurini commented 3 years ago

I noticed that the classification models in part 2 initialize their weights with nn.init.kaiming_normal_ (He initialization). The biases, however, are initialized in a strange way (p2ch14.model, line 94): nn.init.normal_(m.bias, -bound, bound). I find it hard to understand why they would be sampled from a Gaussian distribution with mean -bound and standard deviation bound. I believe it's probably a leftover from a previous uniform initialization, where the same two arguments would have been the lower and upper limits.
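To make the difference concrete, here is a small sketch comparing what the two calls do with the same pair of arguments. The value of bound is a hypothetical placeholder chosen for illustration (it is not taken from the book's code), and the tensor size is arbitrary. For nn.init.normal_ the positional arguments are (mean, std); for nn.init.uniform_ they are (a, b), the lower and upper limits.

```python
import torch
from torch import nn

torch.manual_seed(0)

bound = 0.1  # hypothetical value, for illustration only

# normal_(tensor, mean, std): -bound becomes the mean, bound the std.
normal_bias = torch.empty(100_000)
nn.init.normal_(normal_bias, -bound, bound)

# uniform_(tensor, a, b): -bound and bound are the interval limits.
uniform_bias = torch.empty(100_000)
nn.init.uniform_(uniform_bias, -bound, bound)

print("normal : mean", normal_bias.mean().item(), "std", normal_bias.std().item())
print("uniform: min ", uniform_bias.min().item(), "max", uniform_bias.max().item())

# Fraction of Gaussian samples that fall outside [-bound, bound].
outside = ((normal_bias < -bound) | (normal_bias > bound)).float().mean().item()
print("normal samples outside [-bound, bound]:", outside)
```

With the Gaussian call the biases are centered on -bound rather than on zero, and a sizeable share of them lands outside the interval that the uniform call would respect.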

Should lines 91-94 be replaced with m.bias.data.fill_(.0)?

The same also holds for

t-vi commented 3 years ago

I completely agree. Thank you for pointing this out! I'd probably use nn.init.zeros_ to keep with the nn.init theme.
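A minimal sketch of what the suggested change could look like, not the repository's actual code: the function name init_weights, the set of layer types, the mode="fan_out" choice, and the toy model are all assumptions made for illustration. Only nn.init.kaiming_normal_ for the weights and nn.init.zeros_ for the biases come from the discussion above.

```python
import torch
from torch import nn


def init_weights(model: nn.Module) -> None:
    """He-initialize weights and zero the biases (hypothetical helper)."""
    # Assumed set of layer types; adjust to the layers the model really uses.
    layer_types = (
        nn.Linear,
        nn.Conv2d,
        nn.Conv3d,
        nn.ConvTranspose2d,
        nn.ConvTranspose3d,
    )
    for m in model.modules():
        if isinstance(m, layer_types):
            # mode and nonlinearity are assumptions, not confirmed by the thread.
            nn.init.kaiming_normal_(
                m.weight, a=0, mode="fan_out", nonlinearity="relu"
            )
            if m.bias is not None:
                # Replaces nn.init.normal_(m.bias, -bound, bound).
                nn.init.zeros_(m.bias)


# Toy model for demonstration only.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 6 * 6, 2),
)
init_weights(model)

out = model(torch.zeros(1, 1, 8, 8))
print(out.shape)
```

Compared with m.bias.data.fill_(.0), nn.init.zeros_ has the same effect on the values but stays within the nn.init family and avoids touching .data directly.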