topepo / caret

caret (Classification And Regression Training) R package that contains misc functions for training and plotting classification and regression models
http://topepo.github.io/caret/index.html

Allow summary functions for RFE variability estimation #1370

Open DavorJ opened 2 months ago

DavorJ commented 2 months ago

Currently (v6.0-94), RFE supports only the internal MeanSD function for computing the variability of the (repeated) CV metrics. This is perfectly fine as long as one assumes that the (repeated) CV samples come from a normal distribution. If that assumption is not made, however, reporting only the SD is very limiting.

A simple addition would be to let the user specify a function, e.g. "summarySD", in caret::rfeControl(functions), which could point to MeanSD by default. A very quick and dirty implementation can be found here. This would allow much more flexibility for custom implementations of the selectSize RFE function, since one could pass along extra information and not only the SD.
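To make the idea concrete, here is a rough sketch of what a user-supplied summary function could look like. It is written in base R only and does not touch caret. The names meanSdLike and quantileSummary are hypothetical, and so is the assumption that the function receives a data frame with one row per resample and one numeric column per metric. The "summarySD" slot itself is the proposal of this issue, not an existing caret option.

```r
# Hypothetical sketch, not part of caret.
# Assumption: `x` is a data frame with one row per resample and one
# numeric column per metric (e.g. RMSE, Rsquared).

# Mean and SD per metric, with SD entries named "<metric>SD".
meanSdLike <- function(x) {
  x <- as.data.frame(x)
  means <- vapply(x, mean, numeric(1), na.rm = TRUE)
  sds <- vapply(x, stats::sd, numeric(1), na.rm = TRUE)
  names(sds) <- paste0(names(x), "SD")
  c(means, sds)
}

# Same as above, plus empirical quantiles per metric, named
# "<metric>Q<percent>", for cases where normality is not assumed.
quantileSummary <- function(x, probs = c(0.05, 0.5, 0.95)) {
  x <- as.data.frame(x)
  qs <- lapply(names(x), function(metric) {
    q <- stats::quantile(x[[metric]], probs = probs,
                         na.rm = TRUE, names = FALSE)
    names(q) <- paste0(metric, "Q", probs * 100)
    q
  })
  c(meanSdLike(x), unlist(qs))
}

# Made-up resampling results with one outlying fold.
resamples <- data.frame(
  RMSE     = c(1.0, 1.2, 0.9, 1.1, 3.0),
  Rsquared = c(0.80, 0.75, 0.85, 0.78, 0.40)
)

result <- quantileSummary(resamples)
print(result)
```

With a skewed set of folds like the one above, the quantiles show the outlier directly, whereas the SD alone only hints at it. A custom selectSize could then, for example, pick the subset size based on a lower or upper quantile instead of mean plus or minus SD.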