karllandheer closed this issue 7 months ago
Hi
Thank you for your message, and my apologies for the delay in replying. Venn-Abers predictors and their inductive counterparts (IVAPs) do come with validity guarantees (see e.g. Proposition 1, Section 2 in https://proceedings.neurips.cc/paper/2015/file/a9a1d5317a33ae8cef33961c34144f84-Paper.pdf), provided that the training, calibration and test data are all i.i.d. If they are not, then unfortunately those guarantees do not hold.
One simple way to test this on your data would be to pool the whole dataset, randomly permute it, and then split it into train/calibration/test sets manually, if that is possible. After this reshuffling all three sets come from the same distribution, so IVAPs and most CVAPs should achieve near-ideal calibration. If calibration is good on the reshuffled splits but poor on your original ones, a distribution shift between your calibration and test data is the likely cause.
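The pool, permute and split check could be sketched roughly as follows. This is only an illustration using NumPy: the helper name `pool_shuffle_split`, the 60/20/20 fractions and the synthetic arrays are assumptions made for the example, not part of the venn-abers package.

```python
import numpy as np


def pool_shuffle_split(X_parts, y_parts, fractions=(0.6, 0.2, 0.2), seed=0):
    """Pool several (X, y) pieces, permute the rows, and cut train/cal/test splits.

    Hypothetical helper for illustration only; the split fractions are arbitrary.
    """
    # Join all pieces (e.g. the original train, calibration and test sets).
    X = np.concatenate(X_parts, axis=0)
    y = np.concatenate(y_parts, axis=0)

    # Apply one random permutation to features and labels together.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    X, y = X[order], y[order]

    # Cut the permuted data into three contiguous blocks.
    n_train = int(fractions[0] * len(y))
    n_cal = int(fractions[1] * len(y))
    train = (X[:n_train], y[:n_train])
    cal = (X[n_train:n_train + n_cal], y[n_train:n_train + n_cal])
    test = (X[n_train + n_cal:], y[n_train + n_cal:])
    return train, cal, test


# Synthetic stand-ins for two data sources with different feature distributions.
demo_rng = np.random.default_rng(1)
X_a = demo_rng.normal(0.0, 1.0, size=(600, 3))
X_b = demo_rng.normal(0.5, 1.0, size=(400, 3))
y_a = demo_rng.integers(0, 2, size=600)
y_b = demo_rng.integers(0, 2, size=400)

train, cal, test = pool_shuffle_split([X_a, X_b], [y_a, y_b])
print(len(train[1]), len(cal[1]), len(test[1]))
```

The underlying model would then be refit on `train`, the calibrator fitted on `cal`, and the reliability plot redrawn on `test` for comparison with the original one.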
Regards,
Ivan
On Thu, 18 Jan 2024 at 20:28, karllandheer wrote:
Hello, I have been using your nice package. I used it to calibrate a large public dataset, and although the calibration has improved, it's far from perfect:
image.png (view on web) https://github.com/ip200/venn-abers/assets/76697225/2da4c188-6c22-416b-9de4-554747652dc3
image.png (view on web) https://github.com/ip200/venn-abers/assets/76697225/bacc4ccb-abac-48a4-8b61-b2cb3f60469a
I'm aware of some of the statistical guarantees of other conformal predictor methods. Does this implementation of Venn-Abers come with these guarantees? Or is my testing data perhaps from a different distribution than my calibration data (very possible)?