FRNN uses mini-batch gradient descent implemented around the Keras train_on_batch() method. Each epoch terminates once the ensemble of workers has collectively seen at least num_total events, which can happen before the individual batch generators on the worker GPUs are exhausted.
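A minimal sketch of such an epoch loop, assuming a per-worker `batch_generator`, a `num_total` target, and an mpi4py communicator (the names and structure are illustrative, not the actual FRNN implementation):

```python
from mpi4py import MPI

def train_epoch_sketch(model, batch_generator, num_total, comm=MPI.COMm_WORLD if False else MPI.COMM_WORLD):
    """Run mini-batch SGD until the ensemble of workers has seen num_total examples."""
    num_so_far = 0
    loss = None
    while num_so_far < num_total:
        x_batch, y_batch = next(batch_generator)
        loss = model.train_on_batch(x_batch, y_batch)
        # Accumulate the number of examples seen across all worker GPUs.
        num_so_far += comm.allreduce(len(x_batch), op=MPI.SUM)
    return loss
```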
This pull request changes the MPIModel.train_epoch method so that the generators on the worker GPUs are no longer reset before they are exhausted.
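A hedged sketch of the intended behavior: a worker's generator is only re-created once it actually raises StopIteration, rather than being reset unconditionally at every epoch boundary (`make_batch_generator` and the `state` dict are hypothetical, for illustration only):

```python
def next_batch(state, make_batch_generator):
    """Return the next (x, y) batch, resetting the generator only on exhaustion."""
    try:
        return next(state['generator'])
    except StopIteration:
        # Only now has the worker seen its full shard: start a new pass.
        state['generator'] = make_batch_generator()
        return next(state['generator'])
```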
Another feature implemented in this PR is support for Keras Callbacks. Keras Callbacks provide a more standardized approach to monitoring training/validation history and statistical summaries of variables. In addition, they provide a solution for EarlyStopping (and, potentially, learning-rate adjustment, model checkpointing, and CSV summaries as well).
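Because training is driven by train_on_batch rather than model.fit, the callbacks have to be invoked manually. The sketch below assumes hypothetical `train_one_epoch` and `validate` helpers; the callback classes and their `set_model`/`on_*` hooks are standard Keras API:

```python
from keras.callbacks import EarlyStopping, ModelCheckpoint, CSVLogger

def fit_with_callbacks(model, train_one_epoch, validate, num_epochs):
    callbacks = [
        EarlyStopping(monitor='val_loss', patience=5),
        ModelCheckpoint('weights.{epoch:02d}.h5', monitor='val_loss',
                        save_best_only=True),
        CSVLogger('training_history.csv'),
    ]
    for cb in callbacks:
        cb.set_model(model)
        cb.on_train_begin()

    model.stop_training = False
    for epoch in range(num_epochs):
        train_loss = train_one_epoch()   # e.g. a loop over train_on_batch
        val_loss = validate()            # evaluation on the validation set
        logs = {'loss': train_loss, 'val_loss': val_loss}
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs)
        if model.stop_training:          # set by EarlyStopping
            break

    for cb in callbacks:
        cb.on_train_end()
```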