I ran this code 100 times and never obtained an error of 0.00885 with this weighting method; the error is almost always above 2%. Is there a mistake in my setup or installation, or is this another error in the calculations?
Please take a look at this example (page 46 of the PDF):
cancer_unweighted = tsk("breast_cancer")
summary(cancer_unweighted$data()$class)

# add column where weight is 2 if class "malignant", and 1 otherwise
df = cancer_unweighted$data()
df$weights = ifelse(df$class == "malignant", 2, 1)

# create new task and role
cancer_weighted = as_task_classif(df, target = "class")
cancer_weighted$set_col_roles("weights", roles = "weight")

# compare weighted and unweighted predictions
split = partition(cancer_unweighted)
lrn_rf = lrn("classif.ranger")
lrn_rf$train(cancer_unweighted, split$train)$
  predict(cancer_unweighted, split$test)$score()

lrn_rf$train(cancer_weighted, split$train)$
  predict(cancer_weighted, split$test)$score()
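
For reference, the repeated runs can be scripted roughly as follows. This is only a sketch of one way to do it, not the book's procedure: `run_once` and `errors` are names made up for illustration, it assumes mlr3verse (with ranger) is installed, and it uses the same `"weight"` column role as the example above, which may need adjusting if your mlr3 version names that role differently.

```r
library(mlr3verse)

# Hypothetical helper: draws one random split and returns the
# classification error without and with the class weights.
run_once = function() {
  task_unweighted = tsk("breast_cancer")

  # same weighting as in the example: 2 for "malignant", 1 otherwise
  df = task_unweighted$data()
  df$weights = ifelse(df$class == "malignant", 2, 1)
  task_weighted = as_task_classif(df, target = "class")
  task_weighted$set_col_roles("weights", roles = "weight")

  # one shared train/test split for both tasks
  split = partition(task_unweighted)
  lrn_rf = lrn("classif.ranger")

  err_unweighted = lrn_rf$train(task_unweighted, split$train)$
    predict(task_unweighted, split$test)$score()
  err_weighted = lrn_rf$train(task_weighted, split$train)$
    predict(task_weighted, split$test)$score()

  c(unweighted = unname(err_unweighted), weighted = unname(err_weighted))
}

# 100 repetitions; one row per run, one column per setting
errors = t(replicate(100, run_once()))
summary(errors)
```

The summary shows the spread of both errors across splits, which makes it easier to judge whether a single reported value such as 0.00885 is plausible for one particular random split.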