Closed by D-X-Y 2 years ago
Dear Greg,
Awesome project! May I ask why the linear output layer in ResNet is initialized to 0 instead of Gaussian(mean=0, var=1) as mentioned in the paper?
Thanks a lot for your time and help.
Thanks @D-X-Y!
We discuss this in the paper in Section D.2.
Let us know if you have any questions after reading it.
Thanks a lot for your explanation!