ip200 / venn-abers

Python implementation of binary and multi-class Venn-ABERS calibration
MIT License

Guarantees of Venn Abers? #14

Closed karllandheer closed 7 months ago

karllandheer commented 10 months ago

Hello, I have been using your nice package. I used it to calibrate a large public dataset, and although the calibration has improved, it's far from perfect:

image

image

I'm aware of some of the statistical guarantees of other conformal predictor methods. Does this implementation of Venn-Abers come with these guarantees? Or is my testing data perhaps from a different distribution from my calibration data (very possible)?

ip200 commented 10 months ago

Hi

Thank you for your message, and my apologies for the delay in replying. Venn-Abers predictors and their inductive counterparts do come with validity guarantees (see e.g. Proposition 1, Section 2 in https://proceedings.neurips.cc/paper/2015/file/a9a1d5317a33ae8cef33961c34144f84-Paper.pdf), provided that the training, calibration and test data are all i.i.d. If they are not, then unfortunately those guarantees do not hold.

One simple way to test this for your data would be to join and randomly permute the whole dataset, and then split it into train/calibration/test sets manually, if this is possible. Under this scenario, IVAPs (inductive Venn-Abers predictors) and most CVAPs (cross Venn-Abers predictors) should achieve near-ideal calibration.
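
The pooling and re-splitting step could look roughly like the sketch below. It uses only NumPy; the helper name `pooled_random_split`, the 60/20/20 fractions, and the synthetic data are assumptions made for illustration and are not part of the venn-abers package.

```python
import numpy as np


def pooled_random_split(X_parts, y_parts, fractions=(0.6, 0.2, 0.2), seed=0):
    """Pool the original splits, permute once, and cut new train/cal/test sets.

    Hypothetical helper for illustration only (not part of venn-abers).
    """
    X = np.concatenate(X_parts)
    y = np.concatenate(y_parts)
    n = len(y)

    # A single random permutation of the pooled data, so every example
    # has the same chance of landing in any of the three new splits.
    perm = np.random.default_rng(seed).permutation(n)

    n_train = int(fractions[0] * n)
    n_cal = int(fractions[1] * n)
    train_idx = perm[:n_train]
    cal_idx = perm[n_train:n_train + n_cal]
    test_idx = perm[n_train + n_cal:]  # remainder goes to the test split

    return (
        (X[train_idx], y[train_idx]),
        (X[cal_idx], y[cal_idx]),
        (X[test_idx], y[test_idx]),
    )


# Synthetic stand-ins for the original splits: the second block is
# deliberately shifted to mimic a test set from a different distribution.
rng = np.random.default_rng(1)
X_a = rng.normal(0.0, 1.0, size=(300, 2))
y_a = (X_a[:, 0] > 0).astype(int)
X_b = rng.normal(2.0, 1.0, size=(100, 2))
y_b = (X_b[:, 0] > 2).astype(int)

train, cal, test = pooled_random_split([X_a, X_b], [y_a, y_b])
```

After re-splitting, the underlying model and the calibrator would be refitted on the new train and calibration sets and the calibration plot redrawn on the new test set. If calibration is near ideal on the shuffled splits but not on the original ones, that points to a distribution shift between the original splits rather than to a problem with the calibrator.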

Regards,

Ivan
