jeffheaton / encog-java-core

http://www.heatonresearch.com/encog

Recurrent freeform networks are broken (NullPointerException) #160

Closed ekerazha closed 10 years ago

ekerazha commented 10 years ago

As I wrote here http://www.heatonresearch.com/comment/6404#comment-6404, recurrent freeform networks are broken.

There are many ways to reproduce the problem. Here I took the "ElmanXOR" example and converted the Elman network to a freeform Elman network (applying minimal changes).

public class ElmanXOR {

    // *** USE THE FreeformNetwork.createElman() METHOD ***
    /*static BasicNetwork createElmanNetwork() {
        // construct an Elman type network
        ElmanPattern pattern = new ElmanPattern();
        pattern.setActivationFunction(new ActivationSigmoid());
        pattern.setInputNeurons(1);
        pattern.addHiddenLayer(6);
        pattern.setOutputNeurons(1);
        return (BasicNetwork)pattern.generate();
    }*/

    static BasicNetwork createFeedforwardNetwork() {
        // construct a feedforward type network
        FeedForwardPattern pattern = new FeedForwardPattern();
        pattern.setActivationFunction(new ActivationSigmoid());
        pattern.setInputNeurons(1);
        pattern.addHiddenLayer(6);
        pattern.setOutputNeurons(1);
        return (BasicNetwork)pattern.generate();
    }

    public static void main(final String args[]) {

        final TemporalXOR temp = new TemporalXOR();
        final MLDataSet trainingSet = temp.generate(120);

        //final BasicNetwork elmanNetwork = ElmanXOR.createElmanNetwork();
        // *** USE THE FreeformNetwork.createElman() METHOD ***
        final FreeformNetwork elmanNetwork = FreeformNetwork.createElman(1, 6, 1, new ActivationSigmoid());
        final BasicNetwork feedforwardNetwork = ElmanXOR
                .createFeedforwardNetwork();

        //final double elmanError = ElmanXOR.trainNetwork("Elman", elmanNetwork,
        //      trainingSet);
        // *** USE THE EncogUtility.trainToError() METHOD ***
        EncogUtility.trainToError(elmanNetwork, trainingSet, 0.000001);
        final double feedforwardError = ElmanXOR.trainNetwork("Feedforward",
                feedforwardNetwork, trainingSet);       

        //System.out.println("Best error rate with Elman Network: " + elmanError);
        System.out.println("Best error rate with Feedforward Network: "
                + feedforwardError);
        System.out
                .println("Elman should be able to get into the 10% range,\nfeedforward should not go below 25%.\nThe recurrent Elman net can learn better in this case.");
        System.out
                .println("If your results are not as good, try rerunning, or perhaps training longer.");

        Encog.getInstance().shutdown();
    }

    public static double trainNetwork(final String what,
            final BasicNetwork network, final MLDataSet trainingSet) {
        // train the neural network
        CalculateScore score = new TrainingSetScore(trainingSet);
        final MLTrain trainAlt = new NeuralSimulatedAnnealing(
                network, score, 10, 2, 100);

        final MLTrain trainMain = new Backpropagation(network, trainingSet, 0.000001, 0.0);

        final StopTrainingStrategy stop = new StopTrainingStrategy();
        trainMain.addStrategy(new Greedy());
        trainMain.addStrategy(new HybridStrategy(trainAlt));
        trainMain.addStrategy(stop);

        int epoch = 0;
        while (!stop.shouldStop()) {
            trainMain.iteration();
            System.out.println("Training " + what + ", Epoch #" + epoch
                    + " Error:" + trainMain.getError());
            epoch++;
        }
        return trainMain.getError();
    }
}

What I always get is a NullPointerException when I try to train the network.

Thanks.

jeffheaton commented 10 years ago

Thanks, I will take a look.

jeffheaton commented 10 years ago

I checked in code that fixes the NPE. I don't believe it is training correctly, though, and am still looking at that.

ekerazha commented 10 years ago

@jeffheaton

The original ElmanXOR example uses a hybrid strategy with the NeuralSimulatedAnnealing training class, while for freeform networks we only have backpropagation and resilient propagation. Is this correct?

Are you always testing the Elman topology? Ideally the training should also work for more complex topologies with multiple context layers.

jeffheaton commented 10 years ago

Just checked in another fix. I also added an ElmanFreeform example. It works just like the flat-network example: it uses the hybrid strategy with simulated annealing. It works with Elman just fine now and trains to the error levels I would expect.

I am still going to keep this issue open, as there are a few more things I would like to do. Namely, I want to figure out which training methods work with freeform. Backpropagation and resilient propagation will work. I suspect anything that is not derivative-based will work too: annealing, genetic, PSO, and Nelder-Mead might well work. But I doubt very much that SCG or Levenberg Marquardt would; they would require a special version to be written for freeform. Maybe. But I want to try each, and if it is a minor issue preventing a trainer from working, I'll fix it.
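For reference, driving one of the two trainers that already work looks something like this. This is only a sketch: it assumes the FreeformResilientPropagation class from org.encog.neural.freeform.training with a (network, dataset) constructor; check the Javadoc for the exact signatures.

```java
import org.encog.ml.data.MLDataSet;
import org.encog.neural.freeform.FreeformNetwork;
import org.encog.neural.freeform.training.FreeformResilientPropagation;

public class FreeformTraining {

    // Train a freeform network with resilient propagation until the
    // target error is reached. Sketch only: no epoch cap or stop strategy.
    static double trainRprop(FreeformNetwork network, MLDataSet data,
            double targetError) {
        FreeformResilientPropagation train =
                new FreeformResilientPropagation(network, data);
        int epoch = 0;
        do {
            train.iteration();
            System.out.println("Epoch #" + epoch + " Error: " + train.getError());
            epoch++;
        } while (train.getError() > targetError);
        train.finishTraining();
        return train.getError();
    }
}
```

The same loop should work with FreeformBackPropagation, which additionally takes a learning rate and momentum.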

Secondly, now that I actually understand how the freeform nets work, I am going to add a wiki page on them that describes how to put together custom structures with them.
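Until that wiki page exists, here is my reading of how a custom structure would be assembled, modeled on how FreeformNetwork.createElman() appears to be built internally. The method names (createInputLayer, createLayer, connectLayers, createContext) are assumptions to be checked against the current source; this sketch wires a Jordan-style network, where the context comes from the output layer instead of the hidden layer.

```java
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.neural.freeform.FreeformLayer;
import org.encog.neural.freeform.FreeformNetwork;

public class CustomFreeform {

    // Hand-built Jordan-style network: like Elman, but the context
    // connection feeds the previous OUTPUT back into the hidden layer.
    static FreeformNetwork createJordan() {
        FreeformNetwork network = new FreeformNetwork();
        FreeformLayer input = network.createInputLayer(1);
        FreeformLayer hidden = network.createLayer(6);
        FreeformLayer output = network.createOutputLayer(1);
        network.connectLayers(input, hidden, new ActivationSigmoid(), 1.0, false);
        network.connectLayers(hidden, output, new ActivationSigmoid(), 1.0, false);
        // Context link: previous output becomes extra hidden-layer input.
        network.createContext(output, hidden);
        network.reset(); // randomize the weights
        return network;
    }
}
```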

ekerazha commented 10 years ago

Thank you for your effort :)

jeffheaton commented 10 years ago

The NPE is fixed, and it seems to train SRNs now. I added an example for that, and also wrote up some docs for freeform: http://www.heatonresearch.com/wiki/Freeform_Network

ekerazha commented 10 years ago

@jeffheaton what about the isRecurrent parameter?

There are two possibilities: 1) it is useful, but the current implementation doesn't look at it; 2) it is totally useless and should be removed.

jeffheaton commented 10 years ago

I sent an email to the original contributor, but he has not responded. My thoughts are that it is "useful, but currently unused". I am going to leave it in for now. There are really two types of recurrent networks. The first is recurrent with a context neuron; this is what the freeform networks currently support. But there are also completely recurrent networks, such as NEAT networks, where you simply run the network a fixed number of cycles to actually calculate the recurrent links. When I ultimately expand freeform to support NEAT-style recurrence, I might well need such a flag.
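To illustrate that second kind, here is a tiny self-contained sketch (no Encog types, all names hypothetical) of cycle-based activation: every neuron recomputes from the previous cycle's outputs, so recurrent links need no context neurons at all. You just run a fixed number of cycles and read the result.

```java
public class CyclicNet {

    // Fully recurrent net: weights[i][j] is the link from neuron j to neuron i.
    // Each cycle, every neuron recomputes from the PREVIOUS cycle's outputs,
    // so recurrent links are handled without any context neurons.
    static double[] run(double[][] weights, double[] input, int cycles) {
        int n = weights.length;
        double[] out = new double[n];
        for (int c = 0; c < cycles; c++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                // External input is clamped onto the first neurons every cycle.
                double sum = (i < input.length) ? input[i] : 0.0;
                for (int j = 0; j < n; j++) {
                    sum += weights[i][j] * out[j];
                }
                next[i] = Math.tanh(sum);
            }
            out = next;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two neurons feeding each other; extra cycles let the signal settle.
        double[][] w = { { 0.0, 0.5 }, { 0.5, 0.0 } };
        double[] out = run(w, new double[] { 1.0 }, 5);
        System.out.println(out[0] + " " + out[1]);
    }
}
```

This is exactly where an isRecurrent-style flag would matter: a trainer needs to know whether a link is resolved within one pass or across cycles.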