Zero features model in RFE

Is there a reason why a zero-feature model (i.e. an intercept-only or mean-only model) is not allowed in caret's RFE? I am referring to the sizes argument of caret::rfe().

Why is this important? Because the intercept-only model might actually be the best possible model. A performance profile from a corrected version of caret shows the zero-feature model as the best model, with each additional feature simply overfitting.
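To illustrate how an intercept-only model can be scored on RMSE alongside larger models, here is a minimal base-R sketch. It does not use caret and is not RFE proper (there is no feature ranking or elimination); it only compares nested model sizes on simulated pure-noise data. The names cv_rmse, profile and the simulation settings are assumptions made for this example.

```r
# Illustration only: simulated data in which y is unrelated to every predictor.
set.seed(1)
n <- 200   # observations
p <- 20    # candidate features, all pure noise
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rnorm(n)

n_folds <- 5
folds <- sample(rep(seq_len(n_folds), length.out = n))

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Cross-validated RMSE of a linear model on the first k columns of X.
# k = 0 is the intercept-only model: predict the training mean of y.
cv_rmse <- function(k) {
  fold_rmse <- vapply(seq_len(n_folds), function(f) {
    train <- folds != f
    if (k == 0) {
      pred <- rep(mean(y[train]), sum(!train))
    } else {
      cols <- seq_len(k)
      fit <- lm.fit(cbind(1, X[train, cols, drop = FALSE]), y[train])
      pred <- drop(cbind(1, X[!train, cols, drop = FALSE]) %*% fit$coefficients)
    }
    rmse(y[!train], pred)
  }, numeric(1))
  mean(fold_rmse)
}

# RMSE profile over model sizes, including size 0.
sizes <- 0:p
profile <- vapply(sizes, cv_rmse, numeric(1))
names(profile) <- sizes
print(round(profile, 3))
```

On data like this, where no feature carries signal, one would expect the size-0 entry to be at or near the minimum of the profile, which is the case that a sizes vector starting at 1 cannot express.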

Currently, specifying sizes = 0:200 in caret is no different from specifying sizes = 1:200. Caret always keeps at least one feature, so a size of 0 is treated the same as a size of 1. See here for the adjustments to the caret code required to make zero-feature models possible in RFE.

Hence I would either allow zero-feature models, as in the adjustment above, or check whether 0 %in% sizes and throw an error telling the user that zero features are not supported.
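The second option could look roughly like the sketch below. check_rfe_sizes is a hypothetical helper written for this issue, not an existing caret function, and the error message wording is only a suggestion.

```r
# Hypothetical guard (not part of caret): fail fast when sizes contains 0,
# instead of silently treating size 0 like size 1.
check_rfe_sizes <- function(sizes) {
  if (0 %in% sizes) {
    stop("'sizes' contains 0, but zero-feature (intercept-only) models ",
         "are not supported; use sizes >= 1.", call. = FALSE)
  }
  invisible(sizes)
}

# Usage: valid sizes pass through unchanged, 0 raises an error.
check_rfe_sizes(1:200)
msg <- tryCatch(check_rfe_sizes(0:200), error = function(e) conditionMessage(e))
print(msg)
```

A user would call such a check on sizes before passing it to caret::rfe(); inside caret, the same test could sit at the top of rfe().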

PS: I am aware that there are metrics that might suggest that the zero-feature model is the best one, but RMSE, which may be an ideal metric for performance analysis in some cases, is not one of them. So I don't think that "use another performance metric" is a valid argument for disallowing zero-feature models.

topepo / caret

Zero features model in RFE #1369