Closed xiaochi-liu closed 1 year ago
I'm pretty sure this one is right as published.
One repeat of 10-fold (or 3-fold, etc.) CV generates one assessment set prediction for each training set sample:
Notice that each sample from the training set gets a prediction one time, when it is in the assessment set ("estimate performance using...").
If we repeated that 5 times, we would get five assessment set predictions for each observation in the training set.
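As a rough tally, with $V$ folds, $R$ repeats, and $n$ training set samples (symbols introduced here only for illustration, not taken from the text under discussion):

$$
\text{assessment sets} = V \times R, \qquad \text{predictions per sample} = R, \qquad \text{total predictions} = n \times R
$$

For five repeats of 10-fold CV, that gives $10 \times 5 = 50$ assessment sets but still only 5 predictions for each training set sample.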
Thank you very much for your kind guidance, Julia!
Is there a different understanding of "the assessment set"? Based on this figure:
My understanding is like this:
When we do 3-fold cross-validation, we get 3 assessment sets. Taken together, these 3 assessment sets cover the whole training set, so each sample gets a prediction. Thus, if we do 5 repeats of 3-fold cross-validation, we will have $5 \times 3 = 15$ assessment sets.
This is exactly right:
> When we do 3-fold cross-validation, we get 3 assessment sets. These 3 assessment sets bind together so that each sample from the training set gets a prediction. Thus, if we do 5 repeats 3-fold cross-validation, we will have $5 \times 3 = 15$ assessments.
With 3-fold CV, there are 3 assessment sets and each training set observation is in one of these, so each training set observation gets one prediction, when it is in the assessment set. There are no predictions when something is in the analysis set.
If we repeat that 5 times, there will be 15 assessment sets. Each training set observation will be in 5 assessment sets (one at each repeat), so there will be 5 predictions made for each training set observation.
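The counting above can be checked with a small simulation. This is only a sketch in plain Python, not the rsample implementation: the helper `repeated_kfold`, the sample size of 30, and the seed are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def repeated_kfold(n_samples, n_folds, n_repeats, seed=0):
    """Return the assessment sets (lists of row indices), one per fold per repeat.

    Hypothetical helper for illustration; it does not mirror any library API.
    """
    rng = random.Random(seed)
    assessment_sets = []
    for _ in range(n_repeats):
        # Each repeat reshuffles the training set before splitting it.
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices into n_folds disjoint assessment sets.
        for fold in range(n_folds):
            assessment_sets.append(indices[fold::n_folds])
    return assessment_sets


# 5 repeats of 3-fold CV on an assumed training set of 30 samples.
assessment_sets = repeated_kfold(n_samples=30, n_folds=3, n_repeats=5)

# Count how many assessment sets (and so how many predictions) each sample lands in.
predictions_per_sample = Counter(i for s in assessment_sets for i in s)

print(len(assessment_sets))                  # 15
print(set(predictions_per_sample.values()))  # {5}
```

The number of assessment sets grows with folds times repeats, while the number of predictions per training set sample depends only on the number of repeats.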
Got it. Now I totally understand. Thank you very much, Julia!
Not sure whether I understand correctly here. We are using five repeats of 10-fold cross-validation, so shouldn't we have $5 \times 10 = 50$ assessment sets?