jeffheaton / encog-dotnet-core

http://www.heatonresearch.com/encog

Wrong standard deviation in example SunSpotTimeseries #95

Open TisVeugen opened 8 years ago

TisVeugen commented 8 years ago

I’m working through the Encog 3.3: Quick Start Guide. Chapter 2.3.5 shows the results of the SunSpotTimeseries example. In the “normalization stats” box, the sd of SSN is far too high: 1,830.873430. The value is computed from the wrong column: the mean of SSN is subtracted from the year values, and those differences are squared. The cause is that the “Year” and “Month” columns are not defined in the model. When the sd is computed, each line of the input file is read, and the first value on that line (the year) is used for the first defined column, SSN. I resolved this mismatch in SunSpotTimeseries.java (package org.encog.examples.guide.timeseries) by defining 4 columns instead of 2, e.g.:

ColumnDefinition columnYear = data.defineSourceColumn("Year",0,ColumnType.continuous);
ColumnDefinition columnMonth = data.defineSourceColumn("Month",1,ColumnType.continuous);
ColumnDefinition columnSSN = data.defineSourceColumn("SSN",2,ColumnType.continuous);
ColumnDefinition columnDEV = data.defineSourceColumn("DEV",3,ColumnType.continuous);
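The mismatch described above can be reproduced in isolation. The following is only a minimal sketch, not Encog's actual code: the class `SdMismatchSketch`, its methods, and the three data rows are made up for illustration, and a population standard deviation (dividing by n) is assumed. It computes the mean from the SSN column but takes the deviations from the year column, which yields a value near the year minus the SSN mean, the same pattern as the 1,830.87 reported in the guide.

```java
public class SdMismatchSketch {
    // Mean of a single column over all rows.
    public static double mean(double[][] rows, int col) {
        double sum = 0;
        for (double[] r : rows) {
            sum += r[col];
        }
        return sum / rows.length;
    }

    // Population standard deviation of the values in valueCol,
    // measured around a mean that is supplied by the caller.
    public static double sdAround(double[][] rows, int valueCol, double mean) {
        double sumSq = 0;
        for (double[] r : rows) {
            double d = r[valueCol] - mean;
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / rows.length);
    }

    public static void main(String[] args) {
        // Hypothetical rows in the file layout: Year, Month, SSN, DEV.
        double[][] rows = {
            {1900, 1, 10.0, 1.0},
            {1900, 2, 20.0, 1.0},
            {1900, 3, 30.0, 1.0},
        };
        double meanSsn = mean(rows, 2);

        // Mismatch: deviations taken from column 0 (the year).
        double wrong = sdAround(rows, 0, meanSsn);
        // Intended: deviations taken from column 2 (SSN).
        double right = sdAround(rows, 2, meanSsn);

        System.out.println("sd with year column: " + wrong);
        System.out.println("sd with SSN column:  " + right);
    }
}
```

With these made-up rows the mismatched sd is 1880 (every year is 1900 and the SSN mean is 20), while the sd taken from the SSN column is about 8.16.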

Could this fix, or an alternative one, be applied? The C# file probably needs it as well. Thanks and best regards, Tis Veugen

jeroldhaas commented 8 years ago

Please submit a PR with the proposed changes for the C# solution.

Also, please submit a similar PR for the Java code at encog-java-core.

Please note: creating issues instead of submitting pull requests adds labor for the project's contributors (and increases the probability that the problem doesn't get fixed), when you could instead be a contributor to the Encog project yourself. :smile: