tidymodels / workflows

Modeling Workflows
https://workflows.tidymodels.org/

Custom probability thresholds for binary classification? #64

Open yangxg opened 4 years ago

yangxg commented 4 years ago

Dear Authors:

I am aware that there is a plan to add custom probability thresholds for classification to the workflows package, but it does not seem to have been released yet. I also noticed that the probably package is very helpful for finding an appropriate threshold value (the link), but I don't know how to integrate it into a workflow.

As this feature is important to one of my projects, I am wondering whether there is an alternative approach to achieve it within the tidymodels workflow.

In any case, I am looking forward to a one-line solution in the future! Thanks again for the great work!

Xiaoguang

Teett commented 3 years ago

Hi!!

I was wondering if there is any update on this issue? This is an important feature for classification problems that require a higher specificity (e.g. mortality prediction and health risks), and at the moment I feel like the only way to do it in tidymodels is something of this sort:

collect_predictions(final_model) %>%
  mutate(corrected_class = factor(
    if_else(.pred_alive > 0.75, "alive", "dead"), # manual threshold
    levels = levels(vital_state)                  # keep the same levels as the truth column
  )) %>%
  conf_mat(truth = vital_state, estimate = corrected_class)

I think this workaround is correct for a confusion matrix, but we still need to be able to calculate the new metrics, and I can't find a simple way to do it using yardstick.
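
For reference, one possible way to get metrics from the thresholded class is to pass the new column to a yardstick `metric_set()`. This is only a minimal sketch: the `preds` tibble is made-up data standing in for `collect_predictions(final_model)`, and it assumes `vital_state` is a factor with levels `alive` and `dead`, with `alive` as the event level.

```r
library(dplyr)
library(yardstick)

# Hypothetical stand-in for collect_predictions(final_model)
preds <- tibble(
  vital_state = factor(
    c("alive", "alive", "alive", "dead", "dead", "dead"),
    levels = c("alive", "dead")
  ),
  .pred_alive = c(0.95, 0.80, 0.60, 0.70, 0.30, 0.10)
)

threshold <- 0.75 # manual threshold, as in the workaround above

# Hard class predictions at the custom threshold, with the same
# factor levels as the truth column (yardstick requires this)
thresholded <- preds %>%
  mutate(corrected_class = factor(
    if_else(.pred_alive > threshold, "alive", "dead"),
    levels = levels(vital_state)
  ))

# Bundle several class metrics and compute them in one call
class_metrics <- metric_set(accuracy, sensitivity, specificity)

results <- class_metrics(
  thresholded,
  truth = vital_state,
  estimate = corrected_class
)

print(results)
```

The same `metric_set()` can be reused for several thresholds, which makes it easier to compare them side by side.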

Thank you!!

StevenWallaert commented 1 year ago

Hi, would there be any update on this?

I am aware of {probably} and how to look for an optimal threshold after training. However, I would like to tune the threshold within a workflow and treat it like any other hyperparameter.
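
For context, the after-training search mentioned here can be sketched with `probably::threshold_perf()`. This is not in-workflow tuning, only the post hoc approach. The `preds` tibble and the candidate thresholds are made-up illustrations, and choosing by the J index is just one possible criterion.

```r
library(dplyr)
library(probably)

# Hypothetical held-out predictions (e.g. from collect_predictions())
preds <- tibble(
  vital_state = factor(
    c("alive", "alive", "alive", "dead", "dead", "dead"),
    levels = c("alive", "dead")
  ),
  .pred_alive = c(0.90, 0.80, 0.70, 0.60, 0.30, 0.10)
)

# Candidate thresholds to evaluate (illustrative values)
candidate_thresholds <- c(0.50, 0.65, 0.85)

# One row per threshold and metric
perf <- threshold_perf(
  preds,
  truth = vital_state,
  estimate = .pred_alive,
  thresholds = candidate_thresholds
)

# Pick the threshold with the highest J index
best <- perf %>%
  filter(.metric == "j_index") %>%
  slice_max(.estimate, n = 1, with_ties = FALSE)

print(best$.threshold)
```

Because the threshold is chosen on predictions that already exist, it should be evaluated on data not used to pick it, for example out-of-sample resampling predictions.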

Many thanks!