ip200 / venn-abers

Python implementation of binary and multi-class Venn-ABERS calibration
MIT License

Pre-fitted calibration for multiclass #15

Closed (tuvelofstrom closed this issue 6 months ago)

tuvelofstrom commented 6 months ago

We would like to calibrate a pre-fitted multiclass classifier (e.g. a RandomForestClassifier). Right now this seems impossible with this implementation, as it only seems to support IVAP and CVAP for multiclass.

Ideally, we would do the same as in simple_classification.ipynb, but for multiclass as well:

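# Pre-fitted Venn-ABERS calibration
# (clf, X_train, y_train and X_test are defined earlier, as in the notebook)
from sklearn.model_selection import train_test_split
from venn_abers import VennAbersCalibrator

# Hold out the last 20% of the training data as the calibration set
X_train_proper, X_cal, y_train_proper, y_cal = train_test_split(
    X_train, y_train, test_size=0.2, shuffle=False
)

# Fit on the proper training set only, then score the calibration and test sets
clf.fit(X_train_proper, y_train_proper)
p_cal = clf.predict_proba(X_cal)
p_test = clf.predict_proba(X_test)

va = VennAbersCalibrator()
va_prefit_prob = va.predict_proba(p_cal=p_cal, y_cal=y_cal, p_test=p_test)

ip200 commented 6 months ago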

Hi tuvelofstrom, thank you for raising this issue. I am currently working on several enhancements to the package which, once finalised, will include this functionality as well as the one raised by karllandheer: being able to calibrate multi-class probabilities generated by classifiers that are not compatible with scikit-learn's OneVsOneClassifier. I should have these ready soon and will keep you posted. Thanks

ip200 commented 6 months ago

Hi, I have created a new branch with the added functionality, which can be found here:

https://github.com/ip200/venn-abers/tree/multiclass_general

Please feel free to try the code above when you have a sec. It allows pre-fitted multiclass probabilities to be calibrated using the Venn-ABERS method, without the need to supply an underlying scikit-learn classifier.

Please also note the following:

Once you have had a chance to review, I can easily merge this into the main branch. Thanks

ip200 commented 6 months ago

Hi Tuwe, please see venn-abers version 1.4.2 on PyPI (pip install venn-abers==1.4.2). An example of the new method can be found in cell 20 of https://github.com/ip200/venn-abers/blob/main/examples/multiclass_classification.ipynb. Hope this helps!
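
For readers who want to see what calibrating pre-fitted multiclass probabilities can involve, here is a minimal pure-Python sketch. It is not the package's code. All function names below are hypothetical, class labels are assumed to be integers 0..k-1, and the one-vs-one decomposition with a pairwise coupling step is an assumption about one reasonable way to extend binary inductive Venn-ABERS to several classes. It only needs the probability matrices, not the underlying classifier.

```python
from itertools import combinations


def isotonic_value_at(scores, labels, x):
    """Fit isotonic regression (pool adjacent violators); return the fit at x.

    x must be one of the values in scores.
    """
    # Aggregate tied scores into one weighted point each.
    totals = {}
    for s, y in zip(scores, labels):
        entry = totals.setdefault(s, [0.0, 0])
        entry[0] += y
        entry[1] += 1

    # Each block holds [label_sum, count, largest_score_in_block].
    blocks = []
    for s in sorted(totals):
        blocks.append([totals[s][0], totals[s][1], s])
        # Merge backwards while the block means decrease.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count, max_score = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
            blocks[-1][2] = max_score

    for label_sum, count, max_score in blocks:
        if x <= max_score:
            return label_sum / count
    raise ValueError("x must be one of the fitted scores")


def venn_abers_binary(cal_scores, cal_labels, test_score):
    """Inductive Venn-ABERS for one test score, merged into one probability."""
    scores = list(cal_scores) + [test_score]
    # Fit once with the test point labelled 0 and once labelled 1.
    p0 = isotonic_value_at(scores, list(cal_labels) + [0], test_score)
    p1 = isotonic_value_at(scores, list(cal_labels) + [1], test_score)
    return p1 / (1.0 - p0 + p1)


def pair_score(a, b):
    """Probability of the first class, renormalised over the pair."""
    return a / (a + b) if a + b > 0 else 0.5


def calibrate_multiclass(p_cal, y_cal, p_test_row):
    """One-vs-one Venn-ABERS on pre-computed probabilities (hypothetical helper)."""
    k = len(p_test_row)
    pairwise = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        # Keep only calibration rows whose true label is i or j.
        scores, labels = [], []
        for row, y in zip(p_cal, y_cal):
            if y == i or y == j:
                scores.append(pair_score(row[i], row[j]))
                labels.append(1 if y == i else 0)
        test_score = pair_score(p_test_row[i], p_test_row[j])
        p_ij = venn_abers_binary(scores, labels, test_score)
        pairwise[i][j] = p_ij
        pairwise[j][i] = 1.0 - p_ij
    # Couple the pairwise estimates into per-class scores, then normalise.
    raw = [
        1.0 / (sum(1.0 / pairwise[i][j] for j in range(k) if j != i) - (k - 2))
        for i in range(k)
    ]
    total = sum(raw)
    return [r / total for r in raw]


# Toy calibration set: three classes, probabilities from some pre-fitted model.
p_cal = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1], [0.1, 0.8, 0.1], [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7], [0.2, 0.1, 0.7], [0.1, 0.3, 0.6],
]
y_cal = [0, 0, 0, 1, 1, 1, 2, 2, 2]

probs = calibrate_multiclass(p_cal, y_cal, [0.75, 0.15, 0.10])
print(probs)
```

The sketch refits the isotonic regression for every test point, which is fine for illustration but slow on real data. For actual use, the released package and the notebook linked above are the place to look.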