Closed AdityaSoni19031997 closed 6 years ago
@AdityaSoni19031997 As you can see, both of your PRs have bugs, so I cannot merge them. Just because training has started does not guarantee that the code is bug-free. Take your time until it starts to converge. Around 20% of the way through the 1st epoch, it should have a decent accuracy. And be patient while writing code. ;)
I had multiple commits to fix the NUM_Filter one.
Scale can be anything; if someone is happy with scale as 1, that works, and it can be modified.
Done with the modification.
Did you make it converge? Run the training, wait until at least 20-25% of the 1st epoch, and paste the screenshot here. It should converge.
As for the stddev confusion, I don't know. But what I have read and what the code in the above link shows are the same. Maybe that's a slightly twisted model [your reference]; I don't have any idea about that.
The accuracy is increasing and the loss is going down, so that's a sufficient indication of your code running like a charm...
@zishansami102
Because I had seen that Keras does it that way.
Keras is different from your earlier code. In Keras: stddev = np.sqrt(scale), where scale /= max(1., fan_in)
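For reference, here is a minimal NumPy sketch of the fan_in rule quoted above. The function name `variance_scaling_stddev` and the example filter shape are hypothetical, and this only mirrors the two quoted lines, not the full Keras initializer:

```python
import numpy as np

def variance_scaling_stddev(scale, fan_in):
    # Mirror the quoted rule: scale /= max(1., fan_in); stddev = np.sqrt(scale)
    scale = scale / max(1.0, fan_in)
    return np.sqrt(scale)

# Hypothetical conv filter bank: 8 filters, 1 input channel, 5x5 kernels
num_filters, channels, k = 8, 1, 5
fan_in = channels * k * k  # number of inputs feeding each output unit
stddev = variance_scaling_stddev(1.0, fan_in)

rng = np.random.default_rng(0)
filters = rng.normal(0.0, stddev, size=(num_filters, channels, k, k))
```

With scale left at 1 and a 5x5 single-channel kernel, fan_in is 25, so the standard deviation works out to sqrt(1/25).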
Should we change the theta also?
@zishansami102 Added lecun_normal, which uses a Gaussian distribution to initialise weights.
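A rough sketch of what a LeCun-normal style initialiser can look like in NumPy, assuming a zero-mean, untruncated Gaussian with stddev = sqrt(1 / fan_in). The helper below and the layer sizes are hypothetical illustrations, not the code from this PR:

```python
import numpy as np

def lecun_normal(shape, fan_in, rng=None):
    # Hypothetical helper: zero-mean Gaussian with stddev = sqrt(1 / fan_in)
    rng = np.random.default_rng() if rng is None else rng
    stddev = np.sqrt(1.0 / max(1.0, fan_in))
    return rng.normal(loc=0.0, scale=stddev, size=shape)

# Example: a dense layer with 800 inputs and 10 outputs (sizes are assumptions)
weights = lecun_normal((800, 10), fan_in=800, rng=np.random.default_rng(0))
```

The same scaling would apply to theta if it is initialised the same way; only fan_in changes with the layer's input size.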