Closed ogrisel closed 1 year ago
Hey @ogrisel, I went through the tutorial notebook you provided and the scikit-learn example on Time-related Feature Engineering using the bike sharing dataset.
Note that this example probably covers too many things and we should probably focus on a trimmed-down version.
Based on the code in the example, I created a trimmed down version of it that makes use of pandas-engineered lagged features and Histogram-based Gradient Boosting Regression Trees for time-series forecasting over HERE.
Can I take up this issue and make a PR to include code from the trimmed down version? We can then cross-link it to the time-related feature example on the same dataset.
@ogrisel I'm a bit surprised that you merged without any review.
Sorry, I did not know you were planning to review. It's true that I was the original author of the source notebook, but that was a long time ago, and it has since been reworked by an external contributor and reviewed again by Arturo, so I thought that was enough, especially since there is no impact on the library itself and we can improve it incrementally or completely rework it a posteriori without backward-compat concerns.
Sorry from me as well: I confused the issue with the PR, which has 2 reviews. So everything is fine.
For EuroScipy 2022, I gave a tutorial on how to use pandas-engineered lagged and windowed features for time series forecasting with scikit-learn regressors.
Here is the notebook:
I think it might be worth investing some effort to reuse some of that material to turn it into a tutorialish example for the gallery (and cross-link it with the time-related feature example on the same dataset).
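As a rough illustration of what "windowed" features mean here (a sketch on a toy series, not code taken from the tutorial notebook): the series is shifted by one step before the rolling aggregation so that each row only sees past values. The window size of 3 and the column names are arbitrary assumptions.

```python
import pandas as pd

# Toy series (assumption: stands in for the hourly demand column).
demand = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0, 20.0], name="demand")

# Shift first so the window at row i covers rows i-3..i-1, never row i itself.
past = demand.shift(1)
windowed = pd.DataFrame(
    {
        "lag_1": past,
        "rolling_mean_3": past.rolling(3).mean(),
        "rolling_max_3": past.rolling(3).max(),
    }
)
print(windowed)
```

The first three rows of the rolling columns are NaN because a full window of past values is not yet available; `HistGradientBoostingRegressor` accepts missing values natively, so those rows do not necessarily have to be dropped.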
Note that this example probably covers too many things and we should probably focus on a trimmed-down version, for instance by removing the experiment with MAPIE, which I don't find particularly conclusive at the moment (I would need to spend more time to find out how to make MAPIE output heteroscedastic prediction intervals on this data in particular). I find the discussion on sktime informative for going beyond the pure scikit-learn approach, which I found to be limiting in retrospect, as explained towards the end of the tutorial.
/cc @lorentzenchr who expressed interest. Also /cc @ArturoAmorQ who might be interested in working on such a contribution.