It's because the batch normalisation statistics are collected by passing the entire training dataset through the network, instead of calculating a running average over mini-batches.
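For illustration, here is a minimal NumPy sketch of the running-average alternative. The function name `update_running_stats` and the exact exponential-moving-average form are assumptions for the sketch, not Parmesan's actual implementation:

```python
import numpy as np

def update_running_stats(running_mean, running_var, batch, alpha=0.5):
    """One mini-batch update of running BN statistics via an
    exponential moving average controlled by alpha (assumed form)."""
    batch_mean = batch.mean(axis=0)
    batch_var = batch.var(axis=0)
    running_mean = (1.0 - alpha) * running_mean + alpha * batch_mean
    running_var = (1.0 - alpha) * running_var + alpha * batch_var
    return running_mean, running_var

# Stream mini-batches instead of one pass over the full training set.
rng = np.random.RandomState(0)
mean, var = np.zeros(8), np.ones(8)
for _ in range(100):                      # 100 mini-batches of 32 samples
    batch = rng.randn(32, 8)
    mean, var = update_running_stats(mean, var, batch, alpha=0.5)
```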
Thanks for the complete answer. I've changed the alpha value to 0.5 so that I don't have to pass the entire training set:
```python
import lasagne

class NormalizeLayer(lasagne.layers.Layer):
    # alpha set to the float 0.5 (not the string '0.5'), so the collected
    # statistics are kept as a running average over mini-batches
    def __init__(self, incoming, axes=None, epsilon=1e-10, alpha=0.5,
                 return_stats=False, stat_indices=None,
                 **kwargs):
```
I'm using Parmesan for a CNN and my training data is very large. To evaluate the results, it is not possible to pass the entire training dataset to the `f_collect` function, so I tried it with mini-batches of size 100. I got a very bad validation accuracy of 0.45 after 40 epochs, while the training accuracy is around 0.78. Is there any way to collect the statistics other than passing the entire training dataset?
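For reference, this is roughly the mini-batch collect loop I mean. It is only a sketch: I'm assuming `f_collect` takes a batch of inputs the way it does in `mnist_ladder.py`, and a no-op stand-in is used here so the snippet runs on its own:

```python
import numpy as np

# Stand-in for the Theano function compiled in mnist_ladder.py; the real
# f_collect updates the BN statistics (its exact signature is assumed).
def f_collect(x_batch):
    pass

x_train = np.random.randn(10000, 784).astype('float32')  # placeholder data
batch_size = 100
for i in range(x_train.shape[0] // batch_size):
    x_batch = x_train[i * batch_size:(i + 1) * batch_size]
    f_collect(x_batch)  # with a float alpha, folds the batch into the running stats
```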
I would also like to know why you mentioned in the `mnist_ladder.py` example that it doesn't work for larger datasets. What should I change if I want to use it for a large dataset? Should I pass mini-batches through the network as `sym_x`?