Closed (chim3y closed 5 years ago)
This is a combinatorial limitation of the randomForest package: evaluating all possible split combinations for a categorical feature with many levels is highly inefficient, which is why it refuses predictors with more than 53 categories.
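To give a feel for the numbers, here is a minimal sketch in base R that counts how many ways k unordered levels can be divided into two non-empty groups. The helper name `n_binary_splits` is made up for illustration, and the formula is the general combinatorial count, not a statement about what randomForest enumerates internally.

```r
# Hypothetical helper: number of distinct ways to split k unordered
# levels into two non-empty groups (left vs. right child of a node).
n_binary_splits <- function(k) 2^(k - 1) - 1

levels_to_check <- c(3, 10, 32, 53)
data.frame(
  levels = levels_to_check,
  splits = n_binary_splits(levels_to_check)
)
```

The count roughly doubles with every extra level, so a feature with dozens of levels is already far beyond what can be searched exhaustively.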
So you should either try the `ranger` package (it may work, as `ranger` is somewhat faster) or change the data, e.g., convert the categorical features to dummy features with `mlrtask = createDummyFeatures(mlrtask)`.
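As a rough illustration of what dummy encoding does to the data, here is a small base R sketch. It uses `model.matrix` as a stand-in and is not the implementation of `createDummyFeatures`; the toy data frame and its column names are invented for the example.

```r
# Toy data standing in for a task's feature table (assumed, not from the issue).
toy <- data.frame(
  colour = factor(c("red", "green", "blue", "green")),
  size = c(1.5, 2.0, 0.7, 1.1)
)

# One 0/1 indicator column per level; "- 1" drops the intercept so that
# no level is absorbed into a baseline.
dummies <- model.matrix(~ colour - 1, data = toy)

# Recombine the indicator columns with the untouched numeric feature.
encoded <- cbind(as.data.frame(dummies), size = toy$size)
encoded
```

After encoding, every feature is numeric, so the per-factor category limit no longer applies. The trade-off is that a factor with many levels becomes equally many columns, which can make training slower.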
Hello Sir/Madam, For the past 2 days, I have been running the following program to compute the training time for 3 learners: random forest, logistic regression and gradient boosting. However, for data.id=4135, it produces the following error and goes into an infinite loop. Could you please point out why I am getting this error and how I can solve it? Thank you in advance for your time.
Error: Warning in train(learner, task, subset = train.i, weights = weights[train.i]) : Could not train learner classif.randomForest: Error in randomForest.default(m, y, ...) : Can not handle categorical predictors with more than 53 categories.
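To see which columns trigger this message, one option is to count the levels of every categorical column before training. The sketch below uses only base R; the helper `count_levels` and the toy data frame are hypothetical. With mlr, the underlying data frame of a task can be obtained with `getTaskData(task)`.

```r
# Hypothetical helper: number of distinct values in each factor or
# character column of a data frame.
count_levels <- function(df) {
  is_cat <- vapply(df, function(col) is.factor(col) || is.character(col), logical(1))
  vapply(df[is_cat], function(col) length(unique(col)), integer(1))
}

# Toy data standing in for the real dataset (assumed, not data.id=4135).
toy <- data.frame(
  id_like = factor(sprintf("c%02d", 1:60)),   # 60 levels, above the limit
  group = factor(rep(c("a", "b", "c"), 20)),  # 3 levels, fine
  x = seq_len(60) / 10                        # numeric, ignored
)

n_levels <- count_levels(toy)
# Columns that exceed the 53-category limit named in the error message.
names(n_levels)[n_levels > 53]
```

Columns flagged this way are the candidates for dummy encoding, for dropping (if they are effectively identifiers), or for a learner that handles high-cardinality factors.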
Program:
Task used
library(OpenML)  # provides listOMLTasks()
tasks = listOMLTasks(limit = NULL)
# Keep only binary classification tasks without missing values
classifTasks.infos = subset(tasks,
  task.type == "Supervised Classification" &      # classification
  number.of.classes == 2 &                        # binary classification
  number.of.instances.with.missing.values == 0)   # no missing values
save(classifTasks.infos, file = "Data/OpenML/classifTasks.infos.RData")