scikit-learn-contrib / MAPIE

A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
https://mapie.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Classification coverage (cumulated_score) is not 1 - alpha #236

Closed christophM closed 1 year ago

christophM commented 1 year ago

Description

The classification coverage for method="cumulated_score" doesn't match $1 - \alpha$ for options include_last_label=False and include_last_label=True.

To Reproduce

Steps to reproduce the behavior: go to this Colab notebook or see the code here:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from mapie.metrics import (
    classification_coverage_score, classification_mean_width_score
)
from mapie.classification import MapieClassifier

# Synthetic 10-class problem, split into train / calibration / test
X, y = make_classification(n_samples=10000, n_classes=10, n_informative=5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=2000, random_state=1)
X_calib, X_new, y_calib, y_new = train_test_split(X_test, y_test, test_size=1000, random_state=42)

model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)

alpha = 0.5
mapie_score = MapieClassifier(estimator=model, cv="prefit", method="cumulated_score")
mapie_score.fit(X_calib, y_calib)
y_pred_score, y_ps_score = mapie_score.predict(X_new, alpha=alpha, include_last_label='randomized')
classification_coverage_score(y_new, y_ps_score[:, :, 0])

For the 'randomized' option, the method produces roughly the expected coverage: 0.526

But for include_last_label=True:

y_pred_score, y_ps_score = mapie_score.predict(X_new, alpha=alpha, include_last_label=True)
classification_coverage_score(y_new, y_ps_score[:, :, 0])

The coverage is much higher: 0.752

And also for include_last_label=False:

y_pred_score, y_ps_score = mapie_score.predict(X_new, alpha=alpha, include_last_label=False)
classification_coverage_score(y_new, y_ps_score[:, :, 0])

The resulting coverage is 0.653

Expected behavior

I'd expect the coverage for include_last_label='randomized' to be bounded between the coverages for include_last_label=False and include_last_label=True. Instead, both the False and True options yield a coverage well above the target.

So I'd expect the coverages to be ordered False < randomized < True, but the results were 0.653 (False), 0.526 (randomized), and 0.752 (True).
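For intuition about why the deterministic options can land away from 1 - alpha: the cumulated-score prediction set grows one whole label at a time, so deterministically keeping or dropping the label that crosses the calibrated threshold makes the coverage jump in discrete steps, while the 'randomized' option keeps that boundary label with exactly the probability needed to hit the threshold on average. Here is a toy sketch of that construction for a single sample (`aps_set` is a hypothetical helper written for illustration, not MAPIE's internal code):

```python
import numpy as np

rng = np.random.default_rng(0)

def aps_set(probs, q_hat, include_last_label, rng):
    """Toy cumulated-score (APS-style) prediction set for one sample.

    probs: 1-D array of class probabilities.
    q_hat: calibrated cumulative-score threshold.
    include_last_label: True, False, or 'randomized'.
    """
    order = np.argsort(probs)[::-1]        # labels sorted by decreasing probability
    cumsum = np.cumsum(probs[order])
    # index of the first label whose cumulative score reaches q_hat
    cut = int(np.searchsorted(cumsum, q_hat))
    cut = min(cut, len(probs) - 1)
    if include_last_label is True:
        k = cut + 1                        # keep the boundary label: conservative
    elif include_last_label is False:
        k = cut                            # drop the boundary label: anti-conservative
    else:                                  # 'randomized': keep it with just the right probability
        prev = cumsum[cut - 1] if cut > 0 else 0.0
        p_keep = (q_hat - prev) / probs[order[cut]]
        k = cut + 1 if rng.random() < p_keep else cut
    return set(order[:k].tolist())

probs = np.array([0.5, 0.3, 0.2])
q_hat = 0.6
s_true = aps_set(probs, q_hat, True, rng)    # {0, 1}: boundary label kept
s_false = aps_set(probs, q_hat, False, rng)  # {0}: boundary label dropped
s_rand = aps_set(probs, q_hat, 'randomized', rng)  # one of the two, at random
```

Per sample, the randomized set always sits between the False and True sets; the surprising part in the experiment above is how the calibrated threshold interacts with these discrete jumps in aggregate.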

Additional context

vincentblot28 commented 1 year ago

Hello @christophM,

I think this behavior is actually normal, let me explain:

Please tell me if this is clear enough for you.

Vincent

christophM commented 1 year ago

Hi Vincent

It makes sense to me now.

I really appreciate that you responded so quickly and took the time to explain it. I learned something new today.

Would it make sense to add a sentence to the documentation of MapieClassifier.predict? Something like: "Options False and True can result in coverages larger than $1-\alpha$; see [1]."

[1] Angelopoulos, Anastasios, et al. "Uncertainty Sets for Image Classifiers Using Conformal Prediction." arXiv preprint arXiv:2009.14193 (2020).

Again, thanks a lot, I really appreciate not only your answer but all of the team's effort to build and maintain MAPIE.

Best, Christoph