jeffheaton / encog-java-core

http://www.heatonresearch.com/encog

Provide a random data set #140

Open jeffheaton opened 11 years ago

jeffheaton commented 11 years ago

Encog needs a random data set: a wrapper that encloses a regular data set and returns randomly selected elements from it. You would be able to set the number of elements that the random data set exposes. For example, if you set it to 1,000 elements and it encloses a data set of 10,000 elements, each iteration would select 1,000 random elements from the 10,000. This was inspired by this forum post:

http://www.heatonresearch.com/comment/reply/3128#comment-form
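To make the proposal concrete, here is a minimal sketch of the idea in plain Java. It is not Encog code: `RandomSubsetView` is a hypothetical name, it wraps a `java.util.List` rather than an Encog data set, and it assumes sampling with replacement, which the description above does not specify.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

/**
 * Hypothetical wrapper (not part of Encog): exposes a fixed-size random
 * view of a larger enclosed list. Every call to iterator() starts a new
 * pass that draws sampleSize elements at random, with replacement.
 */
final class RandomSubsetView<T> implements Iterable<T> {
    private final List<T> enclosed;
    private final int sampleSize;
    private final Random rng;

    RandomSubsetView(List<T> enclosed, int sampleSize, Random rng) {
        if (enclosed.isEmpty()) {
            throw new IllegalArgumentException("enclosed data set must not be empty");
        }
        if (sampleSize < 0) {
            throw new IllegalArgumentException("sampleSize must not be negative");
        }
        this.enclosed = enclosed;
        this.sampleSize = sampleSize;
        this.rng = rng;
    }

    /** Number of elements produced per pass, independent of the enclosed size. */
    int size() {
        return sampleSize;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int served = 0;

            @Override
            public boolean hasNext() {
                return served < sampleSize;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                served++;
                // Uniform draw from the enclosed data; repeats are possible.
                return enclosed.get(rng.nextInt(enclosed.size()));
            }
        };
    }
}
```

With a 10,000-element list and `sampleSize` set to 1,000, each `for (T item : view)` loop would see 1,000 randomly chosen elements, and the next loop would see a different draw.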

nitbix commented 11 years ago

Jeff, if I understand correctly, some of that is already in the EnsembleDataSetFactory class family (org.encog.ensemble.data.factories), so it may be an easy adaptation from there. I already have another one ready for random selection without resampling, which I will contribute back along with dropout. I hope that helps.

In reply to Jeff Heaton's comment of 8 June 2013, 01:18.

Alan
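For readers unfamiliar with the distinction, "random selection without resampling" is read here as sampling without replacement, which is an interpretation and not something the comment confirms. A minimal sketch under that assumption, using a partial Fisher-Yates shuffle over indices (`WithoutReplacementSampler` is a hypothetical name and does not reflect the EnsembleDataSetFactory code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not Encog API): draws k distinct elements from a list. */
final class WithoutReplacementSampler {
    private WithoutReplacementSampler() {
    }

    static <T> List<T> sample(List<T> source, int k, Random rng) {
        int n = source.size();
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("k must be in [0, " + n + "]");
        }
        // Work on an index array so the source list is left untouched.
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        List<T> out = new ArrayList<>(k);
        // Partial Fisher-Yates: only the first k positions need to be fixed.
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(n - i); // uniform over the unfixed tail [i, n)
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
            out.add(source.get(idx[i]));
        }
        return out;
    }
}
```

Because each index is fixed at most once, no element can appear twice in a single sample.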

PetrToman commented 11 years ago

Stratified sampling could also be implemented to help with classification of skewed classes.
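A rough sketch of what stratified sampling could look like, in plain Java with no Encog types: it samples the same fraction from every class so the class proportions of the full data set carry over to the sample. The names `StratifiedSampler`, `classOf` and `fraction` are hypothetical, and keeping at least one element of every class is a design choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

/** Hypothetical helper (not Encog API): per-class proportional sampling. */
final class StratifiedSampler {
    private StratifiedSampler() {
    }

    static <T, C> List<T> sample(List<T> source, Function<T, C> classOf,
                                 double fraction, Random rng) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        // Group the elements by class label, preserving first-seen class order.
        Map<C, List<T>> byClass = new LinkedHashMap<>();
        for (T item : source) {
            byClass.computeIfAbsent(classOf.apply(item), c -> new ArrayList<>()).add(item);
        }
        List<T> out = new ArrayList<>();
        for (List<T> group : byClass.values()) {
            List<T> shuffled = new ArrayList<>(group);
            Collections.shuffle(shuffled, rng);
            int take = (int) Math.round(fraction * shuffled.size());
            if (fraction > 0.0 && take == 0) {
                take = 1; // assumption: never drop a rare class entirely
            }
            out.addAll(shuffled.subList(0, take));
        }
        // Mix the classes so the result is not ordered by label.
        Collections.shuffle(out, rng);
        return out;
    }
}
```

For a data set with 900 negative and 100 positive examples, a fraction of 0.1 would yield 90 negatives and 10 positives, whereas plain random sampling could by chance return very few positives.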