Closed LluisRamon closed 8 years ago
Just passing weights won't work, since that isn't the resampled version of the values (and the dimensions don't match the data). I've checked in a change that allows it:
> library(caret)
>
> set.seed(1)
> dat <- twoClassSim(100)
>
> set.seed(2)
> with_weights <- train(Class ~ ., data = dat, method = modelInfo, weights = (1:100)/100)
> set.seed(2)
> no_weights <- train(Class ~ ., data = dat, method = modelInfo)
>
> with_weights
Random Forest
100 samples
15 predictor
2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 100, 100, 100, 100, 100, 100, ...
Resampling results across tuning parameters:
mtry Accuracy Kappa
2 0.6906904 0.1830998
8 0.7085037 0.2774630
15 0.7004598 0.2775892
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 8.
> no_weights
Random Forest
100 samples
15 predictor
2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 100, 100, 100, 100, 100, 100, ...
Resampling results across tuning parameters:
mtry Accuracy Kappa
2 0.6870326 0.1888561
8 0.7105888 0.2957754
15 0.7121800 0.3166938
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 15.
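The point about resampled weights can be illustrated in base R. This is only a sketch: the data frame, the fold assignment and the variable names are invented for illustration and are not taken from caret's internals. It shows that a full-length weight vector no longer lines up with a resampled training set unless it is subset with the same row indices.

```r
set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n), y = rnorm(n))  # hypothetical data
wts <- (1:n) / n                               # one weight per original row

# Hold out one of ten folds, as a resampling scheme might do
fold <- sample(rep(1:10, length.out = n))
train_idx <- which(fold != 1)
train_dat <- dat[train_idx, ]

# The unmodified weights no longer match the training rows
length(wts) == nrow(train_dat)

# Subsetting the weights with the same indices keeps them aligned
train_wts <- wts[train_idx]
length(train_wts) == nrow(train_dat)
```

The same alignment issue arises with the bootstrap: the lengths happen to agree there, but the weights belong to different rows unless they are indexed like the data.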
Hi Max,
Now I get why passing weights directly didn't work. Thanks for the explanation.
I have seen in commit ed14146f714c8339005770080e2772a2e05ae952 that the ranger method now accepts class weights, so I'm closing the issue.
Thank you very much.
Hi, sorry, I didn't follow the previous explanation. I am very new to R. Could you tell me how to fix my code below? The weights range from 1 to 4000 and are not normalised. It is survey data. I am getting this error: "Error in rangerCpp(treetype, dependent.variable.name, data.final, variable.names, : Not compatible with requested type: [type=character; target=double]." Thanks.
hyper_grid <- expand.grid(
  mtry = seq(10, 310, by = 50),
  node_size = seq(3, 9, by = 2),
  OOB_RMSE = 0
)
for (i in 1:nrow(hyper_grid)) {
  model <- ranger(
    formula = CS4_pvt ~ . - WT,
    case.weights = "WT",
    data = traindata1,
    num.trees = 1491,
    mtry = hyper_grid$mtry[i],
    min.node.size = hyper_grid$node_size[i],
    importance = "impurity",
    seed = 123456
  )
  hyper_grid$OOB_RMSE[i] <- sqrt(model$prediction.error)
}
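The error message says ranger received a character value where it expected doubles. In the loop above, case.weights is the string "WT" rather than the weights themselves, which is most likely the cause. Below is a minimal sketch of one possible fix. It assumes simulated data, because the original survey data is not available: the data frame `survey` and the columns `x1`, `x2` and `y` are made up, and only `WT` mirrors the name used in the question. The weights are passed as a numeric vector, and the weight column is dropped from the modelling data so it cannot be used as a predictor.

```r
library(ranger)

# Hypothetical stand-in for the survey data
set.seed(1)
n <- 200
survey <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  WT = runif(n, min = 1, max = 4000)  # unnormalised survey weights
)
survey$y <- 2 * survey$x1 - survey$x2 + rnorm(n)

# Outcome and predictors only; the weight column is left out
model_data <- survey[, setdiff(names(survey), "WT")]

fit <- ranger(
  formula = y ~ .,
  data = model_data,
  case.weights = survey$WT,  # numeric vector, one weight per row
  num.trees = 100,
  seed = 123456
)

# Out-of-bag RMSE, as in the loop from the question
oob_rmse <- sqrt(fit$prediction.error)
```

ranger documents case.weights as sampling weights for the training observations, so rows with larger weights are drawn more often when each tree's sample is built. Whether that matches the intended use of survey weights is a separate question.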
Hi, could you please explain how you solved the issue with using case weights? Sorry, I didn't understand the explanation. Could you please help me resolve the error below? WT2 is in decimals.
random_forest_govt2 <- ranger(CS4_govt ~ CS22 + CS23 + TA10A + Nchild_adult + Income_person
I get this error: Error in rangerCpp(treetype, dependent.variable.name, data.final, variable.names, : Not compatible with requested type: [type=character; target=double].
Hi Max,
I have seen that package ranger now accepts weights using the parameter case.weights. It is included in CRAN version 0.4. If I use case.weights inside the train dots, it gives me an error. I created a custom function like the one below and it seems to work fine. If there are no case.weights, ranger expects NULL, as in train, which is why I pass wts directly to ranger.
Not sure if this is a feature request for weights in ranger or a bug when I use them inside dots.
If you need a reproducible example of the error or a pull request to method ranger, I'll be happy to provide them.
Thank you very much,
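For readers trying to picture the approach described in this post, here is a hedged sketch of a fit function that forwards wts to ranger's case.weights. It is not the function from the original post, which is not reproduced in this thread. The name `fit_with_weights` and the simulated data are hypothetical, and the argument list follows the shape caret uses for the fit element of a custom model. Inside caret it would presumably replace the fit element of a model list such as the one returned by getModelInfo("ranger", regex = FALSE)[[1]]. Here it is called directly so the example depends only on ranger.

```r
library(ranger)

# Hypothetical fit function: `wts` is NULL when no weights are given,
# which is also ranger's default for case.weights.
fit_with_weights <- function(x, y, wts, param, lev, last, classProbs, ...) {
  df <- as.data.frame(x)
  df$.outcome <- y
  ranger(
    dependent.variable.name = ".outcome",
    data = df,
    mtry = min(param$mtry, ncol(x)),
    probability = classProbs,
    case.weights = wts,
    ...
  )
}

# Direct calls on simulated two-class data
set.seed(1)
x <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
y <- factor(ifelse(x$a + rnorm(100) > 0, "Class1", "Class2"))

weighted <- fit_with_weights(
  x, y, wts = (1:100) / 100, param = data.frame(mtry = 2),
  lev = levels(y), last = FALSE, classProbs = FALSE, num.trees = 50
)
unweighted <- fit_with_weights(
  x, y, wts = NULL, param = data.frame(mtry = 2),
  lev = levels(y), last = FALSE, classProbs = FALSE, num.trees = 50
)
```

Passing wts straight through keeps the weighted and unweighted cases on one code path, since a NULL value simply leaves ranger's default sampling in place.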