topepo / FES

Code and Resources for "Feature Engineering and Selection: A Practical Approach for Predictive Models" by Kuhn and Johnson
https://bookdown.org/max/FES
GNU General Public License v2.0

Nested resampling #89

Open maximilianpfau opened 4 years ago

maximilianpfau commented 4 years ago

Print version (I was not able to find a date on the book's first page), Chapter 3.4, page 48

Footnote 24:

In fact, many people use the terms “training” and “testing” to describe the splits of the data produced during resampling. We avoid that here because 1) resampling is only ever conducted on the training set samples and 2) the terminology can be confusing since the same term is being used for different versions of the original data.

This appears to be slightly oversimplified, given that nested resampling may be used to generate outer test-set folds as well as inner analysis and assessment folds.
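
To illustrate the structure I have in mind, here is a minimal sketch in plain Python (not the book's R code). The helper names `k_fold_indices` and `nested_splits` are hypothetical, the fold counts are arbitrary, and rows are assigned to folds without shuffling purely to keep the example deterministic; in practice the rows would be shuffled first.

```python
def k_fold_indices(indices, k):
    """Split a list of row indices into k (kept, held_out) pairs."""
    # Assumption: simple interleaved assignment, no shuffling or stratification.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        kept = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((kept, held_out))
    return splits


def nested_splits(n_rows, outer_k=5, inner_k=3):
    """Return one (outer_train, outer_test, inner_splits) tuple per outer fold.

    Each inner split is an (analysis, assessment) pair drawn only from
    outer_train, so outer_test is never seen during tuning or selection.
    """
    result = []
    for outer_train, outer_test in k_fold_indices(list(range(n_rows)), outer_k):
        inner = k_fold_indices(outer_train, inner_k)
        result.append((outer_train, outer_test, inner))
    return result


splits = nested_splits(n_rows=20, outer_k=5, inner_k=3)
outer_train, outer_test, inner = splits[0]
print("outer train/test sizes:", len(outer_train), len(outer_test))
for analysis, assessment in inner:
    print("inner analysis/assessment sizes:", len(analysis), len(assessment))
```

Here the outer held-out fold plays the role of a test set, while resampling still happens on what remains, which is why the footnote's first point reads as too strong to me.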

Would it maybe make sense to rephrase this footnote and even add a small section on nested resampling?

On a similar note:

Is there a specific reason to use the terms "analysis and assessment sets" for the (inner) resampling used in parameter tuning and feature selection, instead of "training and validation sets" as in neural-network terminology?

Thank you for the great book.