Closed: MarkRebuck closed this issue 7 years ago
If anyone is looking at this...
Adding (new ConsistentRandomizer(-1,1,944267208)).randomize(network); after network.reset(); in the stock XORHelloWorld.java reproduces the failure to train consistently for me. Adding a network.dumpWeights() call shows the hidden-layer weights diverging during training.
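A minimal sketch of that reproduction, assuming the stock XORHelloWorld topology (2 inputs, 3 hidden, 1 output, sigmoid activations); the seed 944267208 is the one quoted above, and the class name is illustrative:

```java
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.mathutil.randomize.ConsistentRandomizer;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;

public class XorBadSeedRepro {
    public static void main(String[] args) {
        // Stock XORHelloWorld topology: 2 inputs, 3 hidden, 1 output.
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 2));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.getStructure().finalizeStructure();
        network.reset();

        // Replace the random initial weights with the known-bad seed quoted
        // above, so the failure to train reproduces on every run.
        new ConsistentRandomizer(-1, 1, 944267208).randomize(network);

        // Inspect the starting weights; dumping again during training shows
        // the hidden-layer weights diverging.
        System.out.println(network.dumpWeights());
    }
}
```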
Thank you for the information. I was able to reproduce it with your data. I also tried your version of the sigmoid, but it still fails to converge from that starting point.
Hi Geoff, we are seeing this issue too. Any progress on solving it? I'm more than happy to investigate; do you have any idea where a good starting point would be? Cheers
This is really just an issue of a bad set of initial weights.
I am testing the feasibility of replacing a home-grown Neural Network library with encog. As a first step of my feasibility study, I replaced my existing XOR unit test with an encog-based version. My unit test is almost a direct cut/paste from the XORHelloWorld sample.
However, every so often the unit test fails. Included below is a slightly modified version of XORHelloWorld that tries to train XOR 1000 times, stopping if it cannot train after 1,000,000 iteration() calls. On every JVM I tried, I get a repeatable 4-5 failures per 1000 runs. While I understand that not every network will train every problem every time... we're talking about XOR here, the world's simplest neural network problem. Clearly something is amiss :-).
When the network fails to train, it fails quite badly. In other words, it doesn't "get close but not all the way to tolerance". It "goes off the deep end and gives wonky results".
On inspection, I saw that the sample tries to train all the way to 0/1, which is a difficult thing to do with Sigmoid activation/output functions, since sigmoid only approaches 0 and 1 asymptotically. Changing the sample targets from the defaults to: public static double XOR_IDEAL[][] = { { 0.01 }, { 0.99 }, { 0.01 }, { 0.99 } }; ...trained quickly 10 million out of 10 million times. Problem solved, right? Well...
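To make the saturation concrete: sigmoid(x) = 1/(1+e^-x) never actually reaches 0 or 1, and its derivative s*(1-s) shrinks toward zero as the output approaches either target, so gradient-based weight updates stall. A standalone sketch (plain Java, no encog dependency; the sample x values are arbitrary):

```java
public class SigmoidSaturation {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    public static void main(String[] args) {
        // As |x| grows, sigmoid saturates: in double precision the output
        // eventually rounds to exactly 0.0 or 1.0, and the derivative
        // s * (1 - s) underflows to zero, starving training of gradient.
        for (double x : new double[] { 0.0, 5.0, 10.0, 20.0, 40.0 }) {
            double s = sigmoid(x);
            System.out.printf("x=%5.1f  sigmoid=%.17f  derivative=%.3e%n",
                    x, s, s * (1.0 - s));
        }
    }
}
```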
While this could be considered an issue with the sample training to unreasonable values, I believe it goes deeper than that. Looking at ActivationSigmoid, BoundMath, and BoundNumbers, it appears that encog handles sigmoid and its derivative quite poorly at the extremes. The home-grown network I'm replacing had a similar issue many years ago, which was fixed by doing the following (feel free to implement this as a patch if you wish):
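Roughly, a minimal sketch of that kind of bounds check, assuming the fix clamps the sigmoid output away from exactly 0.0 and 1.0 (the epsilon value and names below are illustrative, not the actual patch):

```java
// Hypothetical sketch: clamp the activation into [EPS, 1 - EPS] so the
// derivative s * (1 - s) is bounded away from zero. EPS and the class and
// method names are assumptions, not the original patch.
public class BoundedSigmoid {
    private static final double EPS = 1.0e-12;

    // Sigmoid activation, clamped away from exactly 0.0 and 1.0.
    public static double activate(double x) {
        double s = 1.0 / (1.0 + Math.exp(-x));
        return Math.max(EPS, Math.min(1.0 - EPS, s));
    }

    // Derivative in terms of the (clamped) activation value; with the clamp
    // in place, any downstream division by the derivative can never hit zero.
    public static double derivative(double s) {
        return s * (1.0 - s);
    }
}
```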
Without the proper bounds checks, I believe encog is merrily wandering off towards a divide-by-zero error while calculating the derivative of sigmoid() near its limits.
Here is the modified version of XORHelloWorld which shows the sporadic failure:
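A minimal sketch of that harness, assuming the stock 2-3-1 topology, the sample's 0.01 error threshold, and resilient propagation as in XORHelloWorld (the class name and logging are illustrative):

```java
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

public class XorSporadicFailure {
    public static double XOR_INPUT[][] = { {0,0}, {1,0}, {0,1}, {1,1} };
    public static double XOR_IDEAL[][] = { {0}, {1}, {1}, {0} };

    public static void main(String[] args) {
        final MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);
        int failures = 0;

        // Attempt to train XOR 1000 times from fresh random weights.
        for (int run = 0; run < 1000; run++) {
            BasicNetwork network = new BasicNetwork();
            network.addLayer(new BasicLayer(null, true, 2));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
            network.getStructure().finalizeStructure();
            network.reset();  // new random initial weights each attempt

            final ResilientPropagation train =
                    new ResilientPropagation(network, trainingSet);
            int iterations = 0;
            do {
                train.iteration();
                iterations++;
            } while (train.getError() > 0.01 && iterations < 1000000);
            train.finishTraining();

            // Give up after 1,000,000 iterations and count it as a failure.
            if (train.getError() > 0.01) {
                failures++;
                System.out.println("Run " + run + " failed to train, error="
                        + train.getError());
            }
        }
        System.out.println(failures + " failures out of 1000 runs.");
    }
}
```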