deel-ai / puncc

👋 Puncc is a python library for predictive uncertainty quantification using conformal prediction.
https://deel-ai.github.io/puncc/

TypeError: _quantile_dispatcher() got an unexpected keyword argument 'method' #51

Open akshat-suwalka-dream11 opened 4 months ago

akshat-suwalka-dream11 commented 4 months ago

Module

None

Contact Details

No response

Current Behavior

I have two quantile CatBoost regressor models, but crq.predict(X_test, alpha=0.05) throws an error: TypeError: _quantile_dispatcher() got an unexpected keyword argument 'method'

# Wrap models in predictor
predictor = DualPredictor(models=[reg_low, reg_high])

# CP method initialization
crq = CQR(predictor)

# The call to fit trains the model and computes the nonconformity
# scores on the calibration set
crq.fit(X_fit=X_train, y_fit=y_train, X_calib=X_valid, y_calib=y_valid)

# The predict method infers prediction intervals with respect to
# the significance level alpha = 5%
y_pred, y_pred_lower, y_pred_upper = crq.predict(X_test, alpha=0.05)

# Compute marginal coverage and average width of the prediction intervals
coverage = regression_mean_coverage(y_test, y_pred_lower, y_pred_upper)
width = regression_sharpness(y_pred_lower=y_pred_lower, y_pred_upper=y_pred_upper)
print(f"Marginal coverage: {np.round(coverage, 2)}")
print(f"Average width: {np.round(width, 2)}")
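For context, and not confirmed anywhere in this thread: the method keyword of numpy.quantile only exists since NumPy 1.22 (before that the argument was named interpolation), and passing method= to np.quantile on an older NumPy raises exactly this dispatcher TypeError. A minimal sketch, assuming the error originates from such a call inside the library:

import numpy as np

scores = np.array([0.1, 0.35, 0.4, 0.8])
try:
    # Works on NumPy >= 1.22; the keyword does not exist on older releases
    q = np.quantile(scores, 0.9, method="higher")
    print("quantile:", q)
except TypeError as err:
    # On NumPy < 1.22 this prints:
    # _quantile_dispatcher() got an unexpected keyword argument 'method'
    print("TypeError:", err)
    print("NumPy version:", np.__version__)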

Expected Behavior

It should run without raising an error.

Version

v0.9

Environment

- OS:
- Python version:
- Packages used version:

Relevant log output

No response

To Reproduce

param = {'loss_function': 'Quantile:alpha=0.05', 'learning_rate': 0.4607417710785185,
         'l2_leaf_reg': 0.03572230525884548, 'depth': 4, 'boosting_type': 'Plain',
         'bootstrap_type': 'MVS', 'min_data_in_leaf': 8}
reg_low = CatBoostRegressor(task_type="GPU", devices='-1', **param)

param = {'loss_function': 'Quantile:alpha=0.95', 'learning_rate': 0.002097382718709981,
         'l2_leaf_reg': 0.07411180923916862, 'depth': 1, 'boosting_type': 'Plain',
         'bootstrap_type': 'Bayesian', 'min_data_in_leaf': 5,
         'bagging_temperature': 9.119533192831474}
reg_high = CatBoostRegressor(task_type="GPU", devices='-1', **param)

# Wrap models in predictor
predictor = DualPredictor(models=[reg_low, reg_high])

# CP method initialization
crq = CQR(predictor)

# The call to fit trains the model and computes the nonconformity
# scores on the calibration set
crq.fit(X_fit=X_train, y_fit=y_train, X_calib=X_valid, y_calib=y_valid)

# The predict method infers prediction intervals with respect to
# the significance level alpha = 5%
y_pred, y_pred_lower, y_pred_upper = crq.predict(X_test, alpha=0.05)

# Compute marginal coverage and average width of the prediction intervals
coverage = regression_mean_coverage(y_test, y_pred_lower, y_pred_upper)
width = regression_sharpness(y_pred_lower=y_pred_lower, y_pred_upper=y_pred_upper)
print(f"Marginal coverage: {np.round(coverage, 2)}")
print(f"Average width: {np.round(width, 2)}")

M-Mouhcine commented 4 months ago

Hi @akshat-suwalka-dream11,

Thanks for the post.

I'm not able to reproduce the error. I've tried with synthetic data and your code works fine (see below). Can you give me some pointers to reproduce the problem?

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from catboost import CatBoostRegressor
import numpy as np
from deel.puncc.api.prediction import DualPredictor
from deel.puncc.regression import CQR
from deel.puncc.metrics import regression_mean_coverage, regression_sharpness

# Generate a random regression problem
X, y = make_regression(
    n_samples=1000, n_features=4, n_informative=2, random_state=0, shuffle=False
)

# Split data into train and test
X, X_test, y, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Split train data into fit and calibration
X_fit, X_calib, y_fit, y_calib = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# reg_low catboost model
param = {
    "loss_function": "Quantile:alpha=0.05",
    "learning_rate": 0.4607417710785185,
    "l2_leaf_reg": 0.03572230525884548,
    "depth": 4,
    "boosting_type": "Plain",
    "bootstrap_type": "MVS",
    "min_data_in_leaf": 8,
}
reg_low = CatBoostRegressor(**param)

# reg_high catboost parameters
param = {
    "loss_function": "Quantile:alpha=0.95",
    "learning_rate": 0.002097382718709981,
    "l2_leaf_reg": 0.07411180923916862,
    "depth": 1,
    "boosting_type": "Plain",
    "bootstrap_type": "Bayesian",
    "min_data_in_leaf": 5,
    "bagging_temperature": 9.119533192831474,
}
reg_high = CatBoostRegressor(**param)

# Dual predictor definition
predictor = DualPredictor(models=[reg_low, reg_high])

# Initialization of CQR conformalizer
crq = CQR(predictor)

# Fitting/calibration
crq.fit(X_fit=X_fit, y_fit=y_fit, X_calib=X_calib, y_calib=y_calib)

# Conformal prediction for alpha = 5%
y_pred, y_pred_lower, y_pred_upper = crq.predict(X_test, alpha=0.05)

# Results
coverage = regression_mean_coverage(y_test, y_pred_lower, y_pred_upper)
width = regression_sharpness(y_pred_lower=y_pred_lower,
                             y_pred_upper=y_pred_upper)
print(f"Marginal coverage: {np.round(coverage, 2)}")
print(f"Average width: {np.round(width, 2)}")