ip200 / venn-abers

Python implementation of binary and multi-class Venn-ABERS calibration
MIT License

Using multiclass VA without scikit-learn estimator? #5

Closed: karllandheer closed this issue 4 months ago

karllandheer commented 11 months ago

Hello, I would like to use multi-class VA without a scikit-learn estimator. In short, I have two sets of softmax outputs from a neural net, together with the ground-truth labels (N classes): one set for calibration and one for testing. I have previously used the VennAbers class for binary classification with similar data (just 2 classes, of course). However, VennAbersMultiClass() works differently: it requires a scikit-learn estimator, not just calibration and test data. Is there a way to use your VA implementation to convert softmax scores to calibrated multi-class probabilities? Or am I missing something?

Thanks again for all your help!

ip200 commented 11 months ago

Hi, thank you for this request. At the moment, the multi-class Venn-ABERS calculation relies on one-vs-one classification. This is achieved by passing the underlying scikit-learn classifier into the OneVsOneClassifier wrapper, which converts it into an equivalent set of binary classifiers, one per pair of classes. While this functionality is readily available in scikit-learn, as far as I know there is no equivalent in TensorFlow or PyTorch. I will investigate alternative options that are more general and would let you use standard softmax outputs to get multi-class Venn-ABERS probabilities without relying on scikit-learn's OneVsOneClassifier. I'll keep you posted, thanks.
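To make the one-vs-one idea concrete, here is a minimal pure-Python sketch of how softmax outputs might be reduced to pairwise binary problems and how pairwise probabilities might be recombined. It is an illustration under stated assumptions, not the package's code: the function names pairwise_binary_problem and combine_pairwise are hypothetical, the pairwise score p[j] / (p[i] + p[j]) is one plausible choice, and the "sum and normalise" combination rule is a simple stand-in for whatever the library actually does.

```python
from itertools import combinations


def pairwise_binary_problem(probs, labels, i, j):
    """Keep only samples labelled i or j (hypothetical helper).

    Returns (scores, targets), where each score is the softmax mass of
    class j renormalised over classes i and j, and target is 1 for class j.
    """
    scores, targets = [], []
    for p, y in zip(probs, labels):
        if y in (i, j):
            denom = p[i] + p[j]
            scores.append(p[j] / denom if denom > 0 else 0.5)
            targets.append(1 if y == j else 0)
    return scores, targets


def combine_pairwise(pair_probs, n_classes):
    """Turn pairwise estimates into one multi-class distribution.

    pair_probs[(i, j)] is an estimate of P(class j | class is i or j).
    Assumption: each class collects its pairwise probabilities and the
    totals are normalised to sum to 1.
    """
    totals = [0.0] * n_classes
    for (i, j), p in pair_probs.items():
        totals[j] += p
        totals[i] += 1.0 - p
    norm = sum(totals)
    return [t / norm for t in totals]


# Toy calibration data: softmax rows and ground-truth labels for 3 classes.
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]]
cal_labels = [0, 1, 2, 1]

# One binary calibration problem per pair of classes.
problems = {
    (i, j): pairwise_binary_problem(cal_probs, cal_labels, i, j)
    for i, j in combinations(range(3), 2)
}

# Made-up pairwise estimates for a single test sample.
multiclass = combine_pairwise({(0, 1): 0.8, (0, 2): 0.3, (1, 2): 0.4}, 3)
```

Each entry of problems could then be handed to any binary calibrator, such as binary Venn-ABERS, before the pairwise outputs are recombined.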

ip200 commented 4 months ago

Hi, I am copying you in on my response to the same issue raised by tuvelofstrom. I have created a new branch with the added functionality, which can be found here:

https://github.com/ip200/venn-abers/tree/multiclass_general

Please feel free to try the code above when you have a sec. It allows pre-fitted multiclass probabilities to be calibrated using the Venn-ABERS method, without the need to supply an underlying scikit-learn classifier.

Please also note the following:

Once you have had a chance to review, I can easily merge this into the main branch. Thanks

ip200 commented 4 months ago

Hi, please see venn-abers version 1.4.2 on PyPI: pip install venn-abers==1.4.2 . An example of the new method can be found in https://github.com/ip200/venn-abers/blob/main/examples/multiclass_classification.ipynb (cell 20). Hope this helps!
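For readers who want to see what calibrating pre-computed scores involves without opening the notebook, here is a naive pure-Python sketch of the binary inductive Venn-ABERS step. It is an illustration only, not the package's implementation or API: the function names isotonic_fit and venn_abers_binary are hypothetical, and the code refits isotonic regression from scratch for every test score rather than using a faster precomputed algorithm. The merged probability p1 / (1 - p0 + p1) is the commonly used log-loss merge of the two bounds.

```python
def isotonic_fit(scores, labels):
    """Non-decreasing isotonic regression via pool-adjacent-violators.

    Returns the sorted unique scores and the fitted value at each one.
    """
    # Pool tied scores first so each unique score gets a single value.
    groups = {}
    for s, y in zip(scores, labels):
        total, count = groups.get(s, (0.0, 0))
        groups[s] = (total + y, count + 1)
    xs = sorted(groups)

    blocks = []  # each block: [label sum, sample count, unique scores covered]
    for x in xs:
        total, count = groups[x]
        blocks.append([total, count, 1])
        # Merge backwards while the block means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total2, count2, width2 = blocks.pop()
            blocks[-1][0] += total2
            blocks[-1][1] += count2
            blocks[-1][2] += width2

    fitted = []
    for total, count, width in blocks:
        fitted.extend([total / count] * width)
    return xs, fitted


def venn_abers_binary(cal_scores, cal_labels, test_score):
    """Return (p0, p1, merged) for one test score (hypothetical helper).

    p0 and p1 are the isotonic fits at the test score when the test
    point is added to the calibration set with label 0 and 1 in turn.
    """
    bounds = []
    for assumed_label in (0, 1):
        xs, fitted = isotonic_fit(
            list(cal_scores) + [test_score],
            list(cal_labels) + [assumed_label],
        )
        bounds.append(fitted[xs.index(test_score)])
    p0, p1 = bounds
    return p0, p1, p1 / (1.0 - p0 + p1)


# Toy calibration scores (e.g. softmax output for the positive class).
cal_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
cal_labels = [0, 0, 0, 1, 1, 1]
p0, p1, merged = venn_abers_binary(cal_scores, cal_labels, 0.85)
```

The width of the interval [p0, p1] gives a rough sense of how uncertain the calibrated probability is for that test point.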