I am trying to do feature selection with caret's glmStepAIC method, using the following code:
library(caret)  # provides trainControl() and train()
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
step_train <- train(grouping ~ ., data = train_data,
                    method = "glmStepAIC", direction = "forward", family = "binomial",
                    trControl = control)
My question is how feature selection and cross-validation interact here. Is stepwise selection run separately within each fold, with the features in step_train$finalModel being those that received the most votes across folds? If so, does each row of step_train$resample report the performance of a fold that may have been fitted with a different feature set? This matters because I need to know whether I should run a second cross-validation to evaluate the performance of the selected features, as below.
# Names of the selected terms, dropping the intercept
features <- names(step_train$finalModel$coefficients)[-1]
step_eval <- train(grouping ~ ., data = train_data[, c(features, "grouping")],
                   method = "glm", family = "binomial",
                   trControl = control)
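
To make the "selection within each fold" scenario concrete, here is a rough sketch of what I imagine it would look like if done by hand. It calls MASS::stepAIC directly instead of going through caret, and it uses simulated data (the data frame `toy` and the predictors `x1` to `x4` are hypothetical stand-ins for train_data), so it is only my assumption of the procedure and not necessarily what train() does internally.

```r
library(MASS)  # stepAIC

set.seed(1)

# Hypothetical toy data standing in for train_data
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$grouping <- factor(rbinom(n, 1, plogis(1.5 * toy$x1 - toy$x2)))

# Randomly assign each row to one of k folds
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = n))

# Run forward selection on the training part of each fold,
# starting from the intercept-only model
selected <- vector("list", k)
for (i in seq_len(k)) {
  train_part <- toy[fold_id != i, ]
  null_fit <- glm(grouping ~ 1, data = train_part, family = binomial)
  step_fit <- stepAIC(null_fit,
                      scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4),
                      direction = "forward", trace = FALSE)
  # Keep the names of the selected terms, dropping the intercept
  selected[[i]] <- names(coef(step_fit))[-1]
}

# One character vector of selected features per fold
print(selected)
```

If the selected sets differ from fold to fold in a check like this, then the per-fold metrics would each correspond to a different feature set, which is exactly the situation I am asking about.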
