deel-ai / puncc

👋 Puncc is a Python library for predictive uncertainty quantification using conformal prediction.
https://deel-ai.github.io/puncc/

All MAD predictions should be positive. #50

Closed JulianHidalgo closed 7 months ago

JulianHidalgo commented 8 months ago

Hi!

Thank you for creating Puncc. I'm trying to use LocallyAdaptiveCP as described here: https://deel-ai.github.io/puncc/regression.html#deel.puncc.regression.LocallyAdaptiveCP

import xgboost as xgb

from deel.puncc.api.prediction import MeanVarPredictor
from deel.puncc.regression import LocallyAdaptiveCP

# X_train/y_train and X_test/y_test are my own data splits
mu_model = xgb.XGBRegressor()
sigma_model = xgb.XGBRegressor()
# Wrap models in a mean/variance predictor
mean_var_predictor = MeanVarPredictor(models=[mu_model, sigma_model])
cp = LocallyAdaptiveCP(mean_var_predictor)
cp.fit(X_fit=X_train, y_fit=y_train, X_calib=X_test, y_calib=y_test)

But I get an error: All MAD predictions should be positive. Any idea what I'm missing? I think the error comes from https://github.com/deel-ai/puncc/blob/6e0a8f8b97f903484748028fb47c205ae4ef46eb/deel/puncc/api/nonconformity_scores.py#L248

mean_absolute_deviation = absolute_difference(y_pred, y_true)
if np.any(sigma_pred < 0):
    raise RuntimeError("All MAD predictions should be positive.")
return mean_absolute_deviation / (sigma_pred + EPSILON)

But I don't know how to avoid it. Any pointers would be greatly appreciated!

M-Mouhcine commented 8 months ago

Hi @JulianHidalgo !

Thanks for opening this issue. I could indeed reproduce the error when using xgboost models with LocallyAdaptiveCP.

Actually, sigma_model is trained to predict the absolute residual $|y-\mu(X)|$, where $\mu$ is the trained mu_model and $X$ and $y$ are respectively a feature vector and its associated target. The output of sigma_model should be positive, otherwise it breaks the conformal prediction algorithm. In your case, however, some of those predictions are negative, which is not allowed.
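
Here is a minimal, self-contained sketch (synthetic data; the names are illustrative and this is not puncc's internal code) of how sigma_model is fitted on absolute residuals, and why a boosted-tree regressor can still output negative values:

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X[:, 0] + 0.1 * rng.normal(size=500)

# Fit the mean model, then fit sigma_model on the absolute residuals.
mu_model = xgb.XGBRegressor().fit(X, y)
abs_residuals = np.abs(y - mu_model.predict(X))
sigma_model = xgb.XGBRegressor().fit(X, abs_residuals)

# The training targets are all >= 0, but boosted trees do not constrain
# their output, so predictions on unseen points can dip below zero.
sigma_pred = sigma_model.predict(rng.normal(size=(1000, 4)))
print("negative sigma predictions:", np.sum(sigma_pred < 0))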

I've noticed that this behavior happens when the number of estimators n_estimators of the xgboost model is high (by default, it is 100). I've tried lower values, for example 5 or 10, and it works fine:

mu_model = xgb.XGBRegressor()
sigma_model = xgb.XGBRegressor(n_estimators=5)
# Wrap models in a mean/variance predictor
mean_var_predictor = MeanVarPredictor(
    models=[mu_model, sigma_model]
)
cp = LocallyAdaptiveCP(mean_var_predictor)
cp.fit(X_fit=X_train, y_fit=y_train, X_calib=X_test, y_calib=y_test)

Can you see if that works for you?

PS: we will look into a suitable solution to "correct" models that predict negative values. We could simply take the absolute value of the sigma_model predictions, but we will explore more options and pick the least problematic one.
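
For reference, a user-side workaround along those lines could be a small wrapper that clamps sigma_model predictions to their absolute value. This is only a sketch, not a puncc API, and I haven't verified that MeanVarPredictor accepts such a duck-typed model:

import numpy as np

class AbsSigmaWrapper:
    """Hypothetical sketch: exposes the sklearn-style fit/predict
    interface but maps predictions through the absolute value."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y, **kwargs):
        self.model.fit(X, y, **kwargs)
        return self

    def predict(self, X, **kwargs):
        return np.abs(self.model.predict(X, **kwargs))

# e.g.: sigma_model = AbsSigmaWrapper(xgb.XGBRegressor())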

JulianHidalgo commented 8 months ago

Thanks for checking this out! Reducing the number of estimators helps, but it also decreases the accuracy of the model, and it's not reliable: the same number of estimators works fine with one dataset and fails with another. I noticed LightGBM also generates negative values sometimes, though less often than XGBoost. At least now I know it's not something in the way I'm using the library, or my datasets in particular. I'll be watching the issue, thank you again!

jdalch commented 8 months ago

Hello @JulianHidalgo, thanks again for using PUNCC and for raising this issue! After discussing it with the team, we have decided to take the following steps to fix it:

  1. Increase the value of the threshold EPSILON in the scaled_ad nonconformity score.
  2. Add the threshold EPSILON to the scaled_interval prediction set.
  3. Modify the scaled_ad nonconformity score: compute residuals only for calibration points such that sigma + EPSILON > 0, and return a warning that some calibration data is not used (see the sketch after this list).
  4. Modify the scaled_interval prediction set: return an infinitely sized prediction set if sigma + EPSILON <= 0, and return a warning.
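
To make step 3 concrete, here is a rough sketch of the intended masking logic. The function name, the EPSILON value, and the signature are illustrative; the actual implementation in puncc may differ:

import warnings
import numpy as np

EPSILON = 1e-12  # illustrative value, not puncc's actual constant

def scaled_ad_sketch(y_pred, sigma_pred, y_true):
    # Keep only calibration points whose scaling factor is strictly
    # positive, and warn that the rest are dropped.
    mask = (sigma_pred + EPSILON) > 0
    if not np.all(mask):
        warnings.warn(
            f"{np.sum(~mask)} calibration points ignored because "
            "sigma + EPSILON <= 0."
        )
    mad = np.abs(y_pred[mask] - y_true[mask])
    return mad / (sigma_pred[mask] + EPSILON)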

We hope this fixes the issue for you. We expect negative values of sigma to be rare, so our procedure should not have a big impact on the size of the prediction sets. Of course, the probabilistic guarantees given by conformal prediction will remain true after this modification.

JulianHidalgo commented 8 months ago

Hey @jdalch! Thank you so much to you and the team for designing a solution 😊.

M-Mouhcine commented 7 months ago

Hey @JulianHidalgo,

@jdalch has implemented his solution to address the problem. Could you please test it and let us know if it works?