UnPassive / NeuralNetwork

Multi-layer feedforward and Radial basis function neural networks
MIT License

Partitioning Data Set for k-fold analysis #9

Open sanchewy opened 6 years ago

sanchewy commented 6 years ago

Unless we want to do the k-fold split by hand, it might be a good idea to cut the input data (once it is loaded into the array) into 5 parts. That way we can run 5-fold cross-validation automatically: train on each 4/5 training set until convergence, then test on the remaining 1/5 test set. I guess this could also be done by hand, but then we would need a "test" option for loading test data. Otherwise the machine is just going to treat it like another training set and train on it, rather than keeping its state and outputting predictions.
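As a rough sketch of the automatic loop described above: the Learner interface, its method names, and the toy MeanLearner are all hypothetical stand-ins, not this repository's API. The sketch also assumes each row is a double[] whose last entry is the target value. The point is only to show training and testing kept as separate calls, so held-out rows are scored without changing the model's state.

```java
import java.util.ArrayList;
import java.util.List;

final class CrossValidation {
    /** Hypothetical stand-in for the network; not the repository's actual API. */
    interface Learner {
        void reset();                            // forget state from the previous fold
        void train(List<double[]> rows);         // fit until convergence
        double testError(List<double[]> rows);   // score only, no state updates
    }

    /** Average held-out error over k contiguous folds. */
    static double kFoldError(List<double[]> rows, int k, Learner learner) {
        int n = rows.size();
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= rows.size()");
        }
        double total = 0.0;
        for (int fold = 0; fold < k; fold++) {
            int start = fold * n / k;            // first held-out row
            int end = (fold + 1) * n / k;        // one past the last held-out row
            List<double[]> test = rows.subList(start, end);
            List<double[]> train = new ArrayList<>(rows.subList(0, start));
            train.addAll(rows.subList(end, n));
            learner.reset();
            learner.train(train);
            total += learner.testError(test);
        }
        return total / k;
    }

    /** Toy learner: predicts the mean training target, scores by mean squared error. */
    static final class MeanLearner implements Learner {
        private double mean;

        public void reset() { mean = 0.0; }

        public void train(List<double[]> rows) {
            double sum = 0.0;
            for (double[] r : rows) sum += r[r.length - 1];
            mean = sum / rows.size();
        }

        public double testError(List<double[]> rows) {
            double sq = 0.0;
            for (double[] r : rows) {
                double d = r[r.length - 1] - mean;
                sq += d * d;
            }
            return sq / rows.size();
        }
    }

    public static void main(String[] args) {
        List<double[]> rows = new ArrayList<>();
        for (int i = 1; i <= 10; i++) rows.add(new double[] {i, i});
        System.out.println(kFoldError(rows, 5, new MeanLearner()));
    }
}
```

Calling reset() before each fold is a design choice in this sketch: without it, weights learned on one fold would leak into the next and the per-fold errors would not be independent.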

sanchewy commented 6 years ago

Because the data is in an array of ArrayLists (i.e. ArrayList<Double>[] inputVector;), you could just cut the inputVector array into 5 parts by dividing inputVector.length by 5. Then, for each of the 5 sections in turn, train on the other four until convergence and calculate the error on the held-out section. After that you just average the 5 different test set error calculations and you have the average error across all of the trials for 5-fold cross validation (and you don't have to do it by hand).
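A minimal sketch of the cut itself, assuming each element of the ArrayList<Double>[] array is one example. The class and method names (FoldSplitter, foldRange, testFold, trainFolds) are made up for illustration. When the length is not a multiple of 5, this version gives the first few folds one extra row so that no example is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;

final class FoldSplitter {
    /** Returns {start, end} so that fold i covers rows [start, end) of n rows cut into k parts. */
    static int[] foldRange(int n, int k, int i) {
        int base = n / k;
        int extra = n % k;                        // the first `extra` folds get one more row
        int start = i * base + Math.min(i, extra);
        int end = start + base + (i < extra ? 1 : 0);
        return new int[] {start, end};
    }

    /** The held-out rows of fold i. */
    static ArrayList<Double>[] testFold(ArrayList<Double>[] data, int k, int i) {
        int[] r = foldRange(data.length, k, i);
        return Arrays.copyOfRange(data, r[0], r[1]);
    }

    /** Every row except those in fold i, in their original order. */
    static ArrayList<Double>[] trainFolds(ArrayList<Double>[] data, int k, int i) {
        int[] r = foldRange(data.length, k, i);
        int held = r[1] - r[0];
        // Copy the leading rows, then overwrite from r[0] onward with the rows after the fold.
        ArrayList<Double>[] out = Arrays.copyOf(data, data.length - held);
        System.arraycopy(data, r[1], out, r[0], data.length - r[1]);
        return out;
    }

    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        ArrayList<Double>[] data = new ArrayList[7];
        for (int i = 0; i < data.length; i++) {
            data[i] = new ArrayList<>();
            data[i].add((double) i);
        }
        for (int fold = 0; fold < 5; fold++) {
            System.out.println("fold " + fold
                    + " test=" + Arrays.toString(testFold(data, 5, fold))
                    + " train=" + Arrays.toString(trainFolds(data, 5, fold)));
        }
    }
}
```

One caveat with a contiguous cut: if the loaded data is ordered (for example, sorted by class), each fold will be unrepresentative, so shuffling the array once before splitting is usually worth doing.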