I was working with CrossValidationKFold and found a problem while testing the FoldedDataSet structure.
My scenario:
I have a dataset with 50 examples. I created a FoldedDataSet with 5 folds, so each fold has 10 examples. When I iterate over the folds, everything looks fine except for the last fold. This is the code I am running:
final MLDataPair pair = BasicMLDataPair.createPair(data.getInputSize(), data.getIdealSize());
for (int i = 0; i < data.getRecordCount(); i++) {
    data.getRecord(i, pair);
    network.compute(pair.getInputArray(), actual);
    ...
}
Here, data is a FoldedDataSet.
When I call getRecordCount() on the last fold, it returns 0, but the expected answer is 10.
Looking at the code, I found that when the division is exact, the variable lastFoldSize is zero. In setCurrentFold(), when the requested fold is the last one, lastFoldSize is assigned to currentFoldSize, which is the value used to return the record count.
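To make the arithmetic concrete, here is a minimal standalone sketch. It is not the actual FoldedDataSet source: the class and method names are hypothetical, and it assumes that lastFoldSize is computed as the remainder left over after all folds of equal size, which would explain the zero when 50 is divided by 5. The second method shows one possible adjustment, where the last fold takes every record not covered by the earlier folds.

```java
// Hypothetical sketch of the fold-size arithmetic; not the library's real code.
final class FoldSizeSketch {

    // Assumed current behaviour: the last fold only receives the remainder,
    // which is 0 whenever recordCount divides evenly by numFolds.
    static int remainderLastFoldSize(int recordCount, int numFolds) {
        int foldSize = recordCount / numFolds;
        return recordCount - foldSize * numFolds;
    }

    // One possible fix: the last fold receives all records that the
    // first (numFolds - 1) folds do not cover.
    static int adjustedLastFoldSize(int recordCount, int numFolds) {
        int foldSize = recordCount / numFolds;
        return recordCount - foldSize * (numFolds - 1);
    }

    public static void main(String[] args) {
        System.out.println(remainderLastFoldSize(50, 5)); // prints 0
        System.out.println(adjustedLastFoldSize(50, 5));  // prints 10
        System.out.println(adjustedLastFoldSize(52, 5));  // prints 12
    }
}
```

With the adjusted calculation, an exact division gives the last fold the same size as the others, and an inexact one gives it the normal fold size plus the leftover records.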