jamesmf / mnistCRNN

Simple TimeDistributed() wrapper demo in Keras; sums images of MNIST digits

Query regarding the training part #7

Closed Sroy20 closed 7 years ago

Sroy20 commented 7 years ago

Hi,

First, thanks for this awesome piece of code. Very good and helpful work!

I need your help understanding one part of the code. You call model.fit inside the training for loop and run it for a single epoch on each iteration (sketched below):

```python
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=1, verbose=1)
```
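
For context, the surrounding pattern looks roughly like this (a paraphrase, not the repo's exact code; `build_training_set` is a placeholder name for whatever generates the summed-digit data):

```python
# model, batch_size, num_epochs, num_examples assumed defined elsewhere,
# as in the repo's script
for epoch in range(num_epochs):
    # a fresh, randomly generated dataset on every pass
    X_train, y_train = build_training_set(num_examples)
    # nb_epoch is the Keras 1.x argument name; Keras 2 renamed it to epochs
    model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=1, verbose=1)
```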

So the idea is that the model sees different data at each epoch - isn't that unconventional? In a typical setup, a neural network sees the same dataset at every epoch, right?

Thanks, Subhrajit Roy

jamesmf commented 7 years ago

Glad you've found it useful.

In general, with a finite dataset one epoch covers all the data. In this instance, though, the number of potential examples we can generate is enormous (`(num_examples_in_MNIST + 1)^num_digits_per_example`), far too many to iterate over in a single epoch. With the 60,000 MNIST training images and, say, 8 digits per example, that's on the order of 10^38 combinations. So we sample some number of those possible combinations to generate a fresh dataset for each 'epoch.'
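
A minimal sketch of such a sampler, assuming fixed-length sequences and plain NumPy indexing (the repo's actual generator may differ; presumably the +1 in the count above allows a blank slot so sequences can be shorter than the maximum; `build_training_set` matches the placeholder name in the sketch in the question):

```python
import numpy as np
from keras.datasets import mnist

(X_mnist, y_mnist), _ = mnist.load_data()

def build_training_set(num_examples, seq_len=4):
    """Sample random sequences of MNIST digits; the target for each
    sequence is the sum of its digit labels. Every call returns a
    different random draw from the huge space of combinations."""
    idx = np.random.randint(0, len(X_mnist), size=(num_examples, seq_len))
    X = X_mnist[idx].astype("float32") / 255.0  # (num_examples, seq_len, 28, 28)
    y = y_mnist[idx].sum(axis=1)                # one scalar sum per sequence
    return X, y
```

Calling this once per outer-loop iteration is what gives each 'epoch' its own random sample.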

The benefit is that you can train for longer because it's harder to overfit. The downside is that if you care about monitoring training loss, you aren't comparing apples-to-apples across epochs, since each epoch's loss is computed on a different sample.

Sroy20 commented 7 years ago

Thanks for the quick response. Yes, I get your point. Very smart!