gsganden / model_inspector

A uniform interface to a curated set of methods for inspecting machine learning models
https://gsganden.github.io/model_inspector/
Apache License 2.0

Mimic `sklearn._binary_clf_curve` for choosing metrics thresholds #52

Open gsganden opened 1 year ago

gsganden commented 1 year ago

`calculate_metrics_by_thresh` uses all informative thresholds by default, but lets you pass in your own set of thresholds, e.g. if you want to use fewer so that it runs faster. IIUC scikit-learn's `precision_recall_curve` by contrast uses `sklearn._binary_clf_curve` to select an informative subset of `np.linspace(0, 1, 101)`. I think that's a better approach because it caps the number of thresholds at roughly 100, which should pretty much always be more than enough, without requiring the user to do anything. We would probably want to copy `sklearn._binary_clf_curve` rather than import it, to avoid depending on a private function from another library.
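To make the proposal concrete, here is a rough sketch of what "an informative subset of `np.linspace(0, 1, 101)`" could look like. It is not scikit-learn's code and not the current `model_inspector` implementation; the function name `select_informative_thresholds` and its parameters are hypothetical. It assumes a prediction is positive when `score >= threshold`, and it treats a grid threshold as informative only if it changes how many scores are predicted positive compared with the previous grid point.

```python
import numpy as np


def select_informative_thresholds(y_score, candidates=None):
    """Hypothetical helper: drop candidate thresholds that change no predictions.

    Assumes the decision rule `score >= threshold` means "predict positive".
    """
    if candidates is None:
        # Default grid taken from the discussion above: 101 evenly spaced points.
        candidates = np.linspace(0, 1, 101)
    candidates = np.sort(np.asarray(candidates, dtype=float))
    scores = np.sort(np.asarray(y_score, dtype=float))

    # For each candidate, count how many scores are >= that threshold.
    n_positive = len(scores) - np.searchsorted(scores, candidates, side="left")

    # Keep the first candidate, plus every candidate whose count differs
    # from the count at the candidate just before it.
    keep = np.concatenate(([True], np.diff(n_positive) != 0))
    return candidates[keep]


y_score = [0.105, 0.405, 0.355, 0.805]
thresholds = select_informative_thresholds(y_score)
print(len(thresholds))  # 5
```

Because the result is always a subset of the candidate grid, the number of thresholds can never exceed 101 no matter how many rows are scored, which is the property argued for above.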

Moosquibe commented 1 year ago

I agree, I can't imagine a scenario where more than 100 thresholds would add any extra information!