topepo / FES

Code and Resources for "Feature Engineering and Selection: A Practical Approach for Predictive Models" by Kuhn and Johnson
https://bookdown.org/max/FES
GNU General Public License v2.0

Section 3.4.1 variance reduction not Sqrt(R) #105

Open TimothyMasters opened 2 years ago

TimothyMasters commented 2 years ago

Section 3.4.1 discusses R repeats of V-fold cross-validation and incorrectly states that the variance reduction will be by a factor of Sqrt(R). This would be true only if the measures across repeats were independent, which they are not. In fact, the variance reduction depends on the value of V. As an extreme example, for leave-one-out cross-validation there can be no variance reduction at all, because there is only one possible partition and every repeat reproduces the same folds. Even at the other extreme, with V=2, variance reduction will not reach Sqrt(R). To help understand the issues involved, consider two facts: there are a finite (though very large) number of possible partitions for cross-validation, and the training set is itself a random sample from the population.
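
The dependence argument above can be made concrete with a standard identity for the mean of correlated estimates. This is only a sketch under an idealized assumption that is not stated in the book or in this thread: the R repeat estimates $\hat\theta_1, \dots, \hat\theta_R$ share a common variance $\sigma^2$ and a common pairwise correlation $\rho$ (both symbols are introduced here for illustration).

$$
\operatorname{Var}\!\left(\frac{1}{R}\sum_{r=1}^{R}\hat\theta_r\right)
= \frac{\sigma^2}{R} + \frac{R-1}{R}\,\rho\,\sigma^2
= \sigma^2\left[\rho + \frac{1-\rho}{R}\right]
$$

With $\rho = 0$ this reduces to $\sigma^2 / R$, so the standard error shrinks by Sqrt(R). With $\rho = 1$ (the leave-one-out case, where every repeat is identical) there is no reduction. For any $\rho > 0$ the variance cannot fall below $\rho\,\sigma^2$, no matter how large R becomes.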

topepo commented 2 years ago

We do note that it is an approximation.

I don't think that it is misleading since there are many instances where the resampled statistics can be treated as independent even though the resamples contain some of the same data. The bootstrap is a good example and there are multiple proofs that show that those resamples converge to the empirical distribution of the resampled statistic.

So, you are absolutely correct that the variance reduction is not equal to Sqrt(R). I'm arguing that the text accurately indicates that the benefit of adding replicates decreases with R.
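
Both positions in this thread can be checked numerically with a small simulation. The sketch below is not from the book; it assumes a toy model in which each repeat's estimate is a shared component (standing in for the fixed training set) plus independent noise (standing in for the random partition), so that all repeats have standard deviation `sigma` and pairwise correlation `rho`. The function names and parameters are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics


def theoretical_sd(sigma, rho, r):
    """SD of the mean of r estimates with common SD sigma and
    common pairwise correlation rho (assumed toy model)."""
    return sigma * math.sqrt(rho + (1.0 - rho) / r)


def simulated_sd(sigma, rho, r, n_sims=20000, seed=1):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_sims):
        # Component shared by every repeat (same training set).
        shared = rng.gauss(0.0, 1.0)
        draws = [
            sigma * (math.sqrt(rho) * shared
                     + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
            for _ in range(r)
        ]
        means.append(sum(draws) / r)
    return statistics.pstdev(means)


if __name__ == "__main__":
    sigma, rho = 1.0, 0.5  # illustrative values, not estimates from real data
    for r in (1, 5, 10, 50):
        independent = sigma / math.sqrt(r)  # the Sqrt(R) approximation
        print(r, round(independent, 3),
              round(theoretical_sd(sigma, rho, r), 3),
              round(simulated_sd(sigma, rho, r), 3))
```

Under these assumptions the standard deviation still falls as R grows, with diminishing returns, but it levels off near `sigma * sqrt(rho)` instead of continuing toward zero as the Sqrt(R) rule would suggest. How large `rho` is for a given V and data set is an empirical question that this toy model does not answer.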