[Closed] alan-breck closed this issue 4 years ago
v1 doesn't have minibatch support. When you provide a set of samples, it just trains on each sample independently. v2 has a real minibatch implementation.
When calling update separately for each item of the minibatch, the results match.
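The difference described above can be illustrated outside of Smile. The sketch below (plain NumPy, a toy linear model with squared error; all names are mine, not from the library) shows that updating on each sample independently (the v1 behavior) and taking one step with the averaged minibatch gradient (the v2 behavior) produce different weights in general, while looping a single-sample update over the batch reproduces the sequential result exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # a toy "minibatch" of 4 samples
y = rng.normal(size=4)
lr = 0.1

def grad(w, x, t):
    # gradient of 0.5 * (w @ x - t)**2 with respect to w
    return (w @ x - t) * x

# v1-style: given a set of samples, update on each one independently,
# so later gradients are computed at already-updated weights
w_seq = np.zeros(3)
for x, t in zip(X, y):
    w_seq = w_seq - lr * grad(w_seq, x, t)

# v2-style: one real minibatch step using the gradient averaged
# over all samples at the current weights
w_batch = np.zeros(3)
w_batch = w_batch - lr * np.mean(
    [grad(w_batch, x, t) for x, t in zip(X, y)], axis=0
)

# The two trajectories differ in general
print(np.allclose(w_seq, w_batch))  # False

# ...but a "minibatch of one" applied sample by sample reproduces
# the sequential (v1) behavior exactly
w_loop = np.zeros(3)
for x, t in zip(X, y):
    w_loop = w_loop - lr * np.mean([grad(w_loop, x, t)], axis=0)
print(np.allclose(w_seq, w_loop))  # True
```

This is why calling the v2 update separately for each item of the minibatch matches v1: a batch of size one degenerates to the per-sample rule.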
Thanks.
When migrating a toy neural network from version 1 (smile.classification.NeuralNetwork) to version 2 (smile.classification.MLP), I would expect equivalent results, but for some reason they differ. Randomness of minibatches aside, the training and evaluation data are exactly the same in both cases, so the proper way to build or train the network probably differs from my usage.

Version 1 code
Version 2 code
Results comparison
Version 1 always correctly classifies a small validation set of nine items. Version 2 always misses at least one of them. Even with more iterations, bigger or smaller minibatches, wider or narrower layers, and deeper networks, it stubbornly fails to match the accuracy of version 1.
If the difference is not obvious from the information provided, any suggestion about where to investigate (debug, tweak parameters, trace...) would be more than welcome.