jdermody / brightwire

Bright Wire is an open source machine learning library for .NET with GPU support (via CUDA)
https://github.com/jdermody/brightwire/wiki
MIT License

IrisClassification: Perfect using .Net5, 96.6% using .Net4.6 ? #42

Closed · PatriceDargenton closed 3 years ago

PatriceDargenton commented 3 years ago

Hello, I am testing IrisClassification, and the result is perfect (100%) with this latest .Net5 version, while with the previous .Net4.6 version, using exactly the same settings, I can't do better than 96.67%, which is a typical score for this well-known test. Any ideas?

```csharp

        // fire up some linear algebra on the CPU
        using (var lap = BrightWireProvider.CreateLinearAlgebra(
            stochastic:true)) { // Patrice : false -> true

            // create a neural network graph factory
            var graph = new GraphFactory(lap);

            // the default data table -> vector conversion uses one hot encoding of the classification labels, so create a corresponding cost function
            var errorMetric = graph.ErrorMetric.OneHotEncoding;

            // create the property set (use rmsprop gradient descent optimisation)
            graph.CurrentPropertySet
                .Use(graph.RmsProp())
                .Use(graph.GaussianWeightInitialisation(
                    true, 0.1f, GaussianVarianceCalibration.SquareRoot2N)) // Patrice
                ;

            // create the training and test data sources
            var trainingData = graph.CreateDataSource(split.Training);
            var testData = trainingData.CloneWith(split.Test);

            // create a 4x8x3 neural network with sigmoid activations after each layer
            const int HIDDEN_LAYER_SIZE = 8,
                BATCH_SIZE = 16; // Patrice : 8 -> 16
            const float LEARNING_RATE = 0.1f; // Patrice : 0.01 -> 0.1
            var engine = graph.CreateTrainingEngine(trainingData, LEARNING_RATE, BATCH_SIZE, 
                TrainingErrorCalculation.TrainingData);
            graph.Connect(engine)
                .AddFeedForward(HIDDEN_LAYER_SIZE)
                .Add(graph.SigmoidActivation())
                .AddDropOut(dropOutPercentage: 0.5f)
                .AddFeedForward(engine.DataSource.OutputSize)
                .Add(graph.SigmoidActivation())
                .AddBackpropagation(errorMetric)
            ;

            // train the network
            Console.WriteLine("Training a 4x8x3 neural network...");
            engine.Train(500, testData, errorMetric, null, 50);

```

jdermody commented 3 years ago

Hi, thanks for raising this. While I agree that 100% is usually suspicious, I think in this case, because the Iris data set is so small (150 samples in total), the final result depends on how the test and training sets are (randomly) assigned.
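To see how thin the margin is, here is a minimal, self-contained sketch in plain C# (no Bright Wire API involved). The accuracy figures in this thread all move in steps of 1/30 ≈ 3.33 percentage points, which suggests a 30-sample test set, so 96.67% versus 100% is the difference of a single misclassified sample, and each random seed selects a different 30-sample subset of the 150 flowers:

```csharp
using System;
using System.Linq;

class SplitSensitivity
{
    static void Main()
    {
        const int totalSamples = 150, testSize = 30;

        // one misclassification on a 30-sample test set costs 1/30 of the accuracy
        var oneErrorAccuracy = 100.0 * (testSize - 1) / testSize;
        Console.WriteLine($"One error out of {testSize} test samples = {oneErrorAccuracy:F2}%");

        // different seeds pick entirely different 30-sample test sets
        foreach (var seed in new[] { 0, 1, 2 })
        {
            var rng = new Random(seed);
            var testIndices = Enumerable.Range(0, totalSamples)
                .OrderBy(_ => rng.Next())   // shuffle the 150 sample indices
                .Take(testSize)             // keep the first 30 as the test set
                .OrderBy(i => i)
                .ToArray();
            Console.WriteLine($"Seed {seed}: test set starts with {string.Join(", ", testIndices.Take(5))}, ...");
        }
    }
}
```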

This can be demonstrated with a fixed RandomSeed when creating the data context:

RandomSeed = 0

```csharp
using var context = new BrightDataContext(RandomSeed);
...
```

```
Naive bayes accuracy: 96.67%
Decision tree accuracy: 96.67%
Random forest accuracy: 96.67%
K nearest neighbours accuracy: 100.00%
Multinomial logistic regression accuracy: 90.00%
Training neural network...
Initial score: 30.00%
Epoch 50 - time: 0.00s; score: 73.33% [0.4333]!!
Epoch 100 - time: 0.00s; score: 83.33% [0.1000]!!
Epoch 150 - time: 0.00s; score: 100.00% [0.1667]!!
Epoch 200 - time: 0.00s; score: 100.00% [0.0000]!!
```

RandomSeed = 1

```
Naive bayes accuracy: 93.33%
Decision tree accuracy: 90.00%
Random forest accuracy: 93.33%
K nearest neighbours accuracy: 93.33%
Multinomial logistic regression accuracy: 90.00%
Training neural network...
Initial score: 33.33%
Epoch 50 - time: 0.00s; score: 66.67% [0.3333]!!
Epoch 100 - time: 0.00s; score: 96.67% [0.3000]!!
Epoch 150 - time: 0.00s; score: 93.33% [-0.0333]
Epoch 200 - time: 0.00s; score: 96.67% [0.0000]!!
```
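As a rough sketch of how to check this further, the same comparison can be repeated over several seeds to see how much the reported accuracies move. This assumes the example above is wrapped in a hypothetical RunIrisExample(BrightDataContext) helper; that method name is not part of Bright Wire:

```csharp
// Hypothetical sketch: RunIrisExample is assumed to wrap the sample code above;
// only the BrightDataContext seed changes between runs.
for (var seed = 0; seed < 5; seed++)
{
    using var context = new BrightDataContext(seed);
    RunIrisExample(context); // prints the accuracy table for this seed
}
```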

PatriceDargenton commented 3 years ago

Thank You!