[Closed] alan-breck closed this issue 4 years ago
v1 doesn't have minibatch support. When you provide a set of samples, it just trains on each sample independently. v2 has a real minibatch implementation.
When calling update separately for each item of the minibatch, the results match.
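The difference described above can be illustrated outside of Smile. The sketch below (plain NumPy, a toy linear model with squared error; all names are mine, not from the library) shows that updating on each sample independently (the v1 behavior) and taking one step with the averaged minibatch gradient (the v2 behavior) produce different weights in general, while looping a single-sample update over the batch reproduces the sequential result exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # a toy "minibatch" of 4 samples
y = rng.normal(size=4)
lr = 0.1

def grad(w, x, t):
    # gradient of 0.5 * (w @ x - t)**2 with respect to w
    return (w @ x - t) * x

# v1-style: given a set of samples, update on each one independently,
# so later gradients are computed at already-updated weights
w_seq = np.zeros(3)
for x, t in zip(X, y):
    w_seq = w_seq - lr * grad(w_seq, x, t)

# v2-style: one real minibatch step using the gradient averaged
# over all samples at the current weights
w_batch = np.zeros(3)
w_batch = w_batch - lr * np.mean(
    [grad(w_batch, x, t) for x, t in zip(X, y)], axis=0
)

# The two trajectories differ in general
print(np.allclose(w_seq, w_batch))  # False

# ...but a "minibatch of one" applied sample by sample reproduces
# the sequential (v1) behavior exactly
w_loop = np.zeros(3)
for x, t in zip(X, y):
    w_loop = w_loop - lr * np.mean([grad(w_loop, x, t)], axis=0)
print(np.allclose(w_seq, w_loop))  # True
```

This is why calling the v2 update separately for each item of the minibatch matches v1: a batch of size one degenerates to the per-sample rule.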
Thanks.
When migrating a toy neural network from version 1 (smile.classification.NeuralNetwork) to version 2 (smile.classification.MLP), I would expect equivalent results, but for some reason they differ. Randomness of minibatches aside, the training and evaluation data are exactly the same in both cases, so the proper way to build or train the network probably differs from my usage.

Version 1 code
Version 2 code
Results comparison
Version 1 always correctly classifies a small validation set of nine items. Version 2 always misses at least one of them. Even with more iterations, bigger or smaller minibatches, wider or narrower layers, and deeper networks, it stubbornly fails to match the accuracy of version 1.
If the difference is not obvious from the information provided, any suggestion about where to investigate (debug, tweak parameters, trace...) would be more than welcome.