topepo / caret

caret (Classification And Regression Training) R package that contains misc functions for training and plotting classification and regression models
http://topepo.github.io/caret/index.html

Error / Bug report: RFE with lmFuncs - "there should be the same number of samples in x and y" #1289

Open 992005 opened 2 years ago

992005 commented 2 years ago

Hello,

I've been trying to run rfe with lmFuncs in caret, but I keep receiving the following error message:

Error in rfe.default(training[, -1], training[, 1], sizes = c(2, 3), rfeControl = control) : there should be the same number of samples in x and y

I've looked through Stack Overflow, GitHub, and Google Groups, but none of the information I found there resolves the error. My target variable is continuous, and the other variables are either numeric or continuous. I suspect there's a bug somewhere.

I'm fairly new to R, and this is the first bug report I've made, so please let me know if there are any instructions I failed to follow.

Thanks, Sarah

Minimal dataset:

CEM <- structure(list(EBI_SUM = c(243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 243, 355, 355, 355, 355, 355, 355, 355, 355, 355), WCS_19 = c(27, 78, 10, 75, 80, 22, 56, 85, 90, 85, 60, 81, 100, 100, 19, 50, 27, 78, 10, 75, 80, 22, 56, 85, 90)), row.names = c(NA, -25L), class = c("tbl_df", "tbl", "data.frame"))
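
A note on this dataset, since it may bear on the error above: CEM carries the classes tbl_df/tbl, so it is a tibble rather than a plain data frame. With tibble loaded, selecting a single column with `[, 1]` keeps a one-column tibble instead of dropping to a vector, and `length()` of that object is the number of columns (1), not the number of rows. The sketch below only illustrates that behaviour on a stand-in built from the first four rows of CEM; the name `toy` is made up for the example, and it assumes the tibble package is installed. It is not a confirmed diagnosis of the caret error.

```{R}
library(tibble)  # assumption: tibble is installed

# Stand-in for CEM: its first four rows, same column names
toy <- tibble(EBI_SUM = c(243, 243, 243, 243),
              WCS_19  = c(27, 78, 10, 75))

y_tbl  <- toy[, 1]                 # single-column subset of a tibble
y_vec  <- toy[[1]]                 # double brackets extract the column itself
y_base <- as.data.frame(toy)[, 1]  # a base data frame drops to a vector here

is.data.frame(y_tbl)  # the tibble subset is still a data frame
length(y_tbl)         # counts columns, not rows
length(y_vec)         # counts observations
length(y_base)        # counts observations
```

If this is what trips the sample-count comparison, passing the outcome as a vector (for example `training[[1]]` or `training$EBI_SUM`) or converting with `as.data.frame()` first might get past that particular message. Separately, CEM has only two columns, so `training[, -1]` holds a single predictor while the call further down asks for `sizes = c(2, 3)`; that may need adjusting as well.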


Minimal, runnable code:
```{R}
library(caret)
set.seed(2022)

# 80/20 split on the outcome column
inTrain <- createDataPartition(CEM$EBI_SUM, p = .8, list = FALSE)
training <- CEM[inTrain, ]
testing <- CEM[-inTrain, ]

# Recursive feature elimination with linear-model helpers, 5x repeated CV
control <- rfeControl(functions = lmFuncs, method = "repeatedcv", repeats = 5,
                      verbose = FALSE, returnResamp = "all")

# This call raises the error quoted above
rfe_lm_profile <- rfe(training[, -1], training[, 1], sizes = c(2, 3),
                      rfeControl = control)
```

Session Info:


R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)