superlinear-ai / conformal-tights

👖 Conformal Tights adds conformal prediction of coherent quantiles and intervals to any scikit-learn regressor or Darts forecaster
MIT License

TypeError: unsupported operand type(s) for |: 'NoneType' and 'NoneType' #15

Closed phoitack closed 6 months ago

phoitack commented 6 months ago

Hi there. I am trying to run the regression example, but when I try to import the library as shown below

from conformal_tights import ConformalCoherentQuantileRegressor

I get the error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-2-ee841138eade>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from conformal_tights import ConformalCoherentQuantileRegressor
      2 from sklearn.datasets import fetch_openml
      3 from sklearn.model_selection import train_test_split
      4 from xgboost import XGBRegressor

2 frames
[/usr/local/lib/python3.10/dist-packages/conformal_tights/_darts_forecaster.py](https://localhost:8080/#) in DartsForecaster()
     79         *,
     80         # Default darts.models.RegressionModel parameters.
---> 81         lags: LAGS_TYPE | None = None,
     82         lags_past_covariates: LAGS_TYPE | None = None,
     83         lags_future_covariates: FUTURE_LAGS_TYPE | None = None,

TypeError: unsupported operand type(s) for |: 'NoneType' and 'NoneType'

Not sure why it is trying to load the DartsForecaster here. Any suggestions? Thanks.

~ C

lsorber commented 6 months ago

Hi, thanks for reporting this! Darts should be an optional dependency, but it looks like there’s an issue with that. As a workaround you can pip install darts, but we’ll address the issue with a new release today.
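The traceback suggests the mechanism: with darts missing, the type alias used in the `DartsForecaster` signature was `None` when the annotation `LAGS_TYPE | None` was evaluated, and `None | None` is not a valid expression. The snippet below is a minimal reproduction under that assumption. The `LAGS_TYPE = None` stand-in and the `SAFE_LAGS_TYPE` guard are illustrative only and are not necessarily what the library does or how the hotfix works.

```python
from typing import Any, Optional

# Stand-in for what the traceback implies (assumption): with darts not
# installed, the alias ended up as None instead of a real type.
LAGS_TYPE = None

try:
    # Same expression as the annotation on the failing line of the traceback.
    annotation = LAGS_TYPE | None
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = ""

print(error_message)

# One possible guard (hypothetical): fall back to a real type so that the
# annotation can still be built when the optional dependency is absent.
SAFE_LAGS_TYPE = Any
safe_annotation = Optional[SAFE_LAGS_TYPE]
```

Because the annotation is evaluated when the class body is executed, the error surfaces at import time even if `DartsForecaster` is never used.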

lsorber commented 6 months ago

I released a hotfix that should address the issue. We'll add more thorough testing later. Could you check if v0.3.1 solves the issue for you? Thank you!

lsorber commented 6 months ago

Update: also added tests with #17 to ensure that we don't inadvertently depend on optional dependencies in the future.

phoitack commented 6 months ago

Hi Laurent. It works now! Thank you.

BTW, is XGBoost the only regressor that it can use? Can it use CatBoost, or LightGBM?

I want to perform KFold cross validation and hyperparameter tuning during the training stage. Is there a pipeline for this? Any suggestions are welcome.

Please keep up the great work. This library is awesome.

lsorber commented 6 months ago

> BTW, is XGBoost the only regressor that it can use? Can it use CatBoost, or LightGBM?

You can use any scikit-learn compatible regressor you like, including CatBoost and LightGBM, yes. We'll try to make that more clear in the README.
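As a rough illustration of swapping in another regressor, here is a sketch that wraps LightGBM's `LGBMRegressor`. It assumes the `estimator=` argument and the `predict_interval(..., coverage=...)` method shown in the project README, and that `lightgbm` is installed. The function name `conformalize_lightgbm` is made up for this example.

```python
def conformalize_lightgbm(X_train, y_train, X_test):
    """Wrap a LightGBM regressor instead of XGBoost and predict 95% intervals."""
    # Local imports: defining this function does not require the packages,
    # calling it does (conformal-tights and lightgbm must be installed).
    from conformal_tights import ConformalCoherentQuantileRegressor
    from lightgbm import LGBMRegressor

    # Any scikit-learn compatible regressor can take the place of LGBMRegressor.
    conformal_predictor = ConformalCoherentQuantileRegressor(estimator=LGBMRegressor())
    conformal_predictor.fit(X_train, y_train)

    # Interval prediction as shown in the README (assumed unchanged).
    return conformal_predictor.predict_interval(X_test, coverage=0.95)
```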

> I want to perform KFold cross validation and hyperparameter tuning during the training stage. Is there a pipeline for this? Any suggestions are welcome.

Which hyperparameters do you want to train? I'll see if I can add some examples.

phoitack commented 6 months ago

Actually, thinking about it, I could just do the Hyperparameter tuning with cross-validation and pick the best estimator based on the best parameters. Then I would feed the best estimator into 'ConformalCoherentQuantileRegressor'. I think that should work.

BTW, how do I change the size of the calibration set or is this already optimized?

lsorber commented 6 months ago

> Actually, thinking about it, I could just do the Hyperparameter tuning with cross-validation and pick the best estimator based on the best parameters. Then I would feed the best estimator into 'ConformalCoherentQuantileRegressor'. I think that should work.

Yes, that should work well. Note that an upcoming release of Conformal Tights should decouple the wrapped estimator entirely, allowing it to benefit from the full training data set without the calibration set affecting its performance!
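The tune-first, wrap-second workflow could look roughly like the sketch below. It uses scikit-learn's `GridSearchCV` with `KFold` and a `HistGradientBoostingRegressor`. The model, the parameter grid, and the function name `tune_then_conformalize` are arbitrary choices for illustration, and the sketch assumes the wrapper refits the estimator it is given, as scikit-learn meta-estimators usually do.

```python
def tune_then_conformalize(X_train, y_train):
    """Tune a regressor with K-fold CV, then wrap the best one for conformal prediction."""
    # Local imports: calling this requires scikit-learn and conformal-tights.
    from conformal_tights import ConformalCoherentQuantileRegressor
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.model_selection import GridSearchCV, KFold

    # Illustrative grid only; choose ranges that suit your data.
    param_grid = {"learning_rate": [0.05, 0.1], "max_depth": [3, 6]}
    search = GridSearchCV(
        estimator=HistGradientBoostingRegressor(random_state=42),
        param_grid=param_grid,
        cv=KFold(n_splits=5, shuffle=True, random_state=42),
        scoring="neg_mean_absolute_error",
    )
    search.fit(X_train, y_train)

    # Feed the best estimator (with its tuned hyperparameters) into the wrapper.
    conformal_predictor = ConformalCoherentQuantileRegressor(estimator=search.best_estimator_)
    conformal_predictor.fit(X_train, y_train)
    return conformal_predictor
```

Tuning happens on the full training set here, while the wrapper later holds out part of it for calibration, so the tuned hyperparameters are only approximately optimal for the smaller fit.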

> BTW, how do I change the size of the calibration set or is this already optimized?

You can change the size of the calibration set when constructing the ConformalCoherentQuantileRegressor as follows:

conformal_predictor = ConformalCoherentQuantileRegressor(conformal_calibration_size=(0.3, 1440))

With that setting, it will use 30% of the training data or 1440 samples (whichever is smaller).
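To make the "whichever is smaller" rule concrete, here is a small sketch of the arithmetic. It mirrors the description above rather than the library's source; in particular, truncating to an integer is an assumption, and `calibration_set_size` is a hypothetical helper name.

```python
def calibration_set_size(n_train: int, conformal_calibration_size: tuple = (0.3, 1440)) -> int:
    """Return the smaller of a fraction of the training set and an absolute cap."""
    fraction, max_samples = conformal_calibration_size
    # Assumption: the fractional size is truncated to a whole number of samples.
    return min(int(fraction * n_train), max_samples)


small = calibration_set_size(1_000)   # the 30% fraction is the binding limit
large = calibration_set_size(10_000)  # the 1440-sample cap is the binding limit
print(small, large)
```

With these defaults the cap takes over once the training set exceeds 4800 samples, since 30% of 4800 is 1440.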

These are the trade-offs:

  1. Smaller calibration size means:
    • Tighter prediction intervals and quantiles because more data is available for quantile regression.
    • Less accurate predicted quantiles and less reliable coverage on predicted intervals because less data is available for calibration.
  2. Larger calibration size means:
    • Wider prediction intervals and quantiles because less data is available for quantile regression.
    • More accurate predicted quantiles and more reliable coverage on predicted intervals because more data is available for calibration.
    • More time spent making predict_interval and predict_quantiles produce coherent quantiles. This could be prohibitively expensive beyond 2000 samples, depending on the number of quantiles you want to predict.

phoitack commented 6 months ago

Thanks for the quick response, and I apologize for the late response. It's good to hear about features for the upcoming release.

I am currently using this library for a client, and they are interested in getting quantiles and a PDF. Will future releases also include PDFs?

Yes, predicting the quantiles and intervals does take time. I wonder if this can be sped up via a GPU. However, I think this depends on the ML library being used.