henrikbostrom / crepes

Python package for conformal prediction
BSD 3-Clause "New" or "Revised" License

Question and documentation #13

Closed: lcrmorin closed this issue 1 year ago

lcrmorin commented 1 year ago

I have some relatively beginner questions after trying the intro code:

henrikbostrom commented 1 year ago

Hi,

Thanks for your questions and comment! Here are my answers in order:

Best regards, Henrik

JuleanA commented 1 year ago

Can you clarify predict_p and predict_set?

If predict_p outputs [0.46552707, 0.04407598], as in your example, shouldn't the predict_set output be [0, 1] (in your example it is [1, 0])? I assume predict_set is predicting the class labels, and thus the second class, with a p-value of 0.04407598, falls outside the 95% confidence level?

Thank you.

henrikbostrom commented 1 year ago

predict_set provides the labels that cannot be rejected at the chosen confidence level; 1 indicates the presence of the corresponding label in the prediction set, i.e. it has not been rejected.
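A minimal sketch of how the two outputs relate, assuming the WrapClassifier interface of recent crepes versions (the data, split names such as X_prop/X_cal, and the confidence parameter are illustrative, not taken from the thread):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from crepes import WrapClassifier  # assumed wrapper interface

X, y = make_classification(n_samples=1000, n_classes=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Hold out a calibration set from the training data
X_prop, X_cal, y_prop, y_cal = train_test_split(X_train, y_train,
                                                test_size=0.25, random_state=0)

clf = WrapClassifier(RandomForestClassifier(random_state=0))
clf.fit(X_prop, y_prop)          # fit on the proper training set
clf.calibrate(X_cal, y_cal)      # calibrate on the held-out calibration set

p_values = clf.predict_p(X_test)                      # one p-value per class
pred_sets = clf.predict_set(X_test, confidence=0.95)  # 1 = label not rejected

# A label stays in the set when its p-value exceeds 1 - confidence, so
# p-values like [0.466, 0.044] give the set [1, 0] at 95% confidence:
# class 0 cannot be rejected (0.466 > 0.05), class 1 can (0.044 <= 0.05).
print(p_values[:3])
print(pred_sets[:3])
```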

Best regards, Henrik

lcrmorin commented 1 year ago

@henrikbostrom thanks for your answer. Where I come from, 'calibrating a model in probabilities' means changing the output of the model so that it matches historical probabilities (think probability calibration curves, usually handled with things like isotonic regression). Maybe this is a cultural thing... as I understand it, it can be translated into p-value calibration.

It seemed to me that conformal prediction would allow something like this. At least the Venn-Abers approach seems to offer something similar, depending on the metric used (see discussion). I was wondering 1) whether there is a similar approach here to build an optimal prediction that accounts for the calculated p-values, and 2) whether this would depend on the metric used, as in the Venn-Abers approach.

Regarding my second question, I was trying to evaluate the performance gain of the calibration process, and tried to compare a random forest's performance with and without the wrapper. Since the predict_proba method is not modified in the sense I expected, the experiment is somewhat moot (the results depend on the seed and on the vanilla RF having access to the whole training set).
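For reference, the kind of probability calibration described above (remapping scores so they match observed frequencies) can be sketched with scikit-learn's isotonic calibration; this is separate from the conformal p-values produced by crepes, and the dataset here is purely illustrative:

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Remap the raw model scores so they match observed frequencies
calibrated_rf = CalibratedClassifierCV(
    RandomForestClassifier(random_state=0), method="isotonic", cv=5)
calibrated_rf.fit(X_train, y_train)

prob_pos = calibrated_rf.predict_proba(X_test)[:, 1]
frac_pos, mean_pred = calibration_curve(y_test, prob_pos, n_bins=10)
print(list(zip(mean_pred.round(2), frac_pos.round(2))))  # reliability curve
```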

henrikbostrom commented 1 year ago

Thanks for the clarification!

Venn-Abers predictors would indeed be a natural choice for obtaining calibrated class probabilities; these are not (currently) implemented in the package.

When evaluating the output of predict_proba (which is not affected by the calibrate method), one would indeed expect better performance from fitting on the full (rather than only the proper) training set.
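A hedged sketch of what a comparison along these lines might look like, again assuming the WrapClassifier interface and using illustrative data and names: since calibrate does not alter predict_proba, any gap between the two models mainly reflects the proper-training/full-training difference (and the random seed).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from crepes import WrapClassifier  # assumed wrapper interface

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
X_prop, X_cal, y_prop, y_cal = train_test_split(X_train, y_train,
                                                test_size=0.25, random_state=0)

wrapped = WrapClassifier(RandomForestClassifier(random_state=0))
wrapped.fit(X_prop, y_prop)       # sees only the proper training set
wrapped.calibrate(X_cal, y_cal)   # does not change predict_proba

vanilla = RandomForestClassifier(random_state=0).fit(X_train, y_train)

print("wrapped:", log_loss(y_test, wrapped.predict_proba(X_test)))
print("vanilla:", log_loss(y_test, vanilla.predict_proba(X_test)))
```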

Best regards, Henrik

henrikbostrom commented 1 year ago

I will move this thread to "Discussions" (the proposed documentation change has been fixed).