casperkaae / parmesan

Variational and semi-supervised neural network toppings for Lasagne

Large dataset #45

Closed rslprpr closed 8 years ago

rslprpr commented 8 years ago

I would like to know why, in the "mnist_ladder.py" example, you mention that it doesn't work for larger datasets. What should I change if I want to use it for a large dataset? Should I pass mini-batches through the network as "sym_x"?

casperkaae commented 8 years ago

It's because the batch normalisation statistics are collected by passing the entire training dataset through the network, instead of calculating a running average over mini-batches.
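
For illustration, here is a minimal NumPy sketch of that running-average idea; the fake batches, the stand-in "activations", and the EMA coefficient below are placeholders, not Parmesan code:

```python
import numpy as np

# Sketch: collect batch-norm statistics as an exponential moving
# average over mini-batches rather than in one full-dataset pass.
rng = np.random.RandomState(0)
batches = [rng.randn(100, 32) for _ in range(50)]  # fake mini-batches

alpha = 0.1                        # EMA coefficient: smaller = smoother
running_mean, running_var = None, None

for h in batches:                  # h: activations entering the BN layer
    mean, var = h.mean(axis=0), h.var(axis=0)
    if running_mean is None:       # initialise from the first batch
        running_mean, running_var = mean, var
    else:
        running_mean = (1 - alpha) * running_mean + alpha * mean
        running_var = (1 - alpha) * running_var + alpha * var

# running_mean / running_var would then be used at test time in place
# of statistics computed from a single full-dataset pass.
```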

rslprpr commented 8 years ago

Thanks for the complete answer. I've changed the alpha value to 0.5 so that the entire training set doesn't have to be passed:

```python
import lasagne

class NormalizeLayer(lasagne.layers.Layer):

    # alpha is the float 0.5, not the string '0.5' (a string would break
    # the arithmetic in the moving-average update)
    def __init__(self, incoming, axes=None, epsilon=1e-10, alpha=0.5,
                 return_stats=False, stat_indices=None,
                 **kwargs):
        ...
```
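
For reference, a minimal construction sketch with a float alpha, assuming the `parmesan.layers` import path used by the repo's examples:

```python
import lasagne
from parmesan.layers import NormalizeLayer  # import path used by the examples

l_in = lasagne.layers.InputLayer((None, 64))
# A float alpha selects the moving-average update; the string '0.5'
# would fail as soon as the layer tries arithmetic on it.
l_norm = NormalizeLayer(l_in, alpha=0.5)
```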
rslprpr commented 7 years ago

I'm using Parmesan for a CNN and my training data is very large. To evaluate the results it is not possible to pass the entire training dataset to the "f_collect" function, so I tried mini-batches of size 100. I got a very bad validation accuracy of 0.45 after 40 epochs, while the training accuracy is around 0.78. Is there a way to collect the statistics other than passing the entire training dataset?
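
A minimal sketch of that mini-batch collection, assuming `f_collect` is the statistics-collection Theano function compiled in `mnist_ladder.py` and takes one mini-batch of inputs per call; `x_train` is a placeholder for the training array:

```python
import numpy as np

# Sketch only: with a float alpha in NormalizeLayer, each f_collect call
# updates the stored mean/var as a moving average instead of
# overwriting them, so the statistics accumulate across batches.
batch_size = 100
for i in range(0, x_train.shape[0], batch_size):
    f_collect(x_train[i:i + batch_size].astype(np.float32))
```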