Closed by justinxzhao 3 weeks ago
An example that works on the current code is here: https://github.com/ludwig-ai/experiments/blob/main/automl/heuristics/santander_customer_satisfaction/eval_util.py with an example invocation here: https://github.com/ludwig-ai/experiments/blob/main/automl/heuristics/santander_customer_satisfaction/train_tabnet_imbalance_ros.py
Largely a duplicate of #2158
Ludwig uses a default threshold of 0.5 to compute accuracy for binary classification problems. However, for imbalanced datasets in particular, 0.5 is often not the best threshold to use.
The ROC AUC summarizes the performance of a binary classifier averaged across all possible decision thresholds, and the underlying ROC (or precision-recall) curve is commonly used to pick a decision threshold that strikes a better balance between precision and recall.
One such algorithmic outline, proposed by @geoffreyangus and @w4nderlust:
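The referenced outline isn't reproduced here, but as a generic illustration, a threshold search over held-out predictions might look like the sketch below. The function name `find_optimal_threshold` and the choice of F1 as the selection metric are assumptions for this example, not Ludwig's actual API:

```python
import numpy as np

def find_optimal_threshold(probs, labels, thresholds=None):
    """Pick the decision threshold that maximizes F1 on held-out data.

    probs: predicted positive-class probabilities; labels: 0/1 ground truth.
    Illustrative sketch only, not Ludwig's actual implementation.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    if thresholds is None:
        # Candidate cutoffs: every distinct predicted probability.
        thresholds = np.unique(probs)
    best_t, best_f1 = 0.5, -1.0
    for t in thresholds:
        preds = (probs >= t).astype(int)
        tp = np.sum((preds == 1) & (labels == 1))
        fp = np.sum((preds == 1) & (labels == 0))
        fn = np.sum((preds == 0) & (labels == 1))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Imbalanced toy data where the default 0.5 would predict all negatives:
probs = [0.05, 0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.48]
labels = [0, 0, 0, 0, 0, 1, 1, 1]
t, f1 = find_optimal_threshold(probs, labels)  # t = 0.4, f1 = 1.0
```

In practice one could equally maximize Youden's J (true positive rate minus false positive rate) from an ROC curve, e.g. via scikit-learn's `roc_curve`; the key point is that the cutoff is selected on held-out data rather than fixed at 0.5.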
By default, the optimal threshold should be calculated at the end of the training phase.
It would also be useful to expose this as a standalone API.