Open: amaneureka opened this issue 8 years ago
Possible solution: exponentially decrease the learning rate in proportion to the network's confidence.
Why does it get confused?
The solution itself explains it: it is not a good idea to keep moving saturated weights, even with a small momentum, once they are already saturated. Hence the network should decrease its learning rate as it becomes more confident in its decisions.
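A minimal sketch of what that could look like, assuming the network exposes a scalar confidence in [0, 1] (e.g. the max softmax probability of its chosen move); the function name, base rate, and decay constant are all illustrative, not taken from the repo:

```python
import math

def adjusted_learning_rate(base_lr, confidence, decay=5.0):
    """Exponentially shrink the learning rate as confidence grows.

    confidence is assumed to lie in [0, 1] (e.g. the max softmax
    probability of the chosen move); decay controls how fast the
    rate falls off. Both are illustrative assumptions.
    """
    return base_lr * math.exp(-decay * confidence)

# An uncertain network (confidence 0.30) keeps a larger step than a
# confident one (confidence 0.95), so saturated weights barely move.
print(adjusted_learning_rate(0.01, 0.30))  # ~2.2e-3
print(adjusted_learning_rate(0.01, 0.95))  # ~8.7e-5
```

Applied per update, something like this keeps early training fast while taking only tiny steps once the policy has become confident.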
In the long run, the network sometimes gets confused and plays poorly from the next game onward.