Closed. MMenchero closed this pull request 7 months ago.
jmoralez commented on 2024-02-22T15:52:38Z ----------------------------------------------------------------
typo in category (says catefory). also please use the term timestamp for ds; I think some errors refer to that as timestamps
jmoralez commented on 2024-02-22T15:52:39Z ----------------------------------------------------------------
Line #2. StatsForecast.plot(Y_df)
This method calls utilsforecast.plotting.plot_series under the hood. Also, statsforecast is not a dependency of neuralforecast (utilsforecast is). Can you please replace this with the plot_series function?
jmoralez commented on 2024-02-22T15:52:40Z ----------------------------------------------------------------
I think we could explain in more detail here how the process works, since it's a bit different from the other libs. The models are trained only once and are used to generate predictions over several windows. The refit argument (which is on the main branch but hasn't been released) can control the retraining behavior; it currently defaults to False, but it can also be an integer or True.
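To make the retraining behavior concrete, here is a small hypothetical pure-Python helper (not part of neuralforecast) sketching which cross-validation windows would trigger a refit under each setting of the argument:

```python
def retrain_windows(n_windows, refit):
    """Indices of CV windows where the models would be retrained.

    refit=False -> train once (initial fit only)
    refit=True  -> retrain on every window
    refit=k     -> retrain every k-th window
    """
    if refit is True:
        step = 1
    elif refit is False:
        step = n_windows  # only the initial fit at window 0
    else:
        step = int(refit)
    return [w for w in range(n_windows) if w % step == 0]
```

This is only an illustration of the semantics described in the comment, not the library's internal implementation.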
jmoralez commented on 2024-02-22T15:52:41Z ----------------------------------------------------------------
The 80% and 90% intervals appear because those are the defaults of the MQLoss; I think we should clarify that.
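For context, each prediction-interval level corresponds to a symmetric pair of quantiles. A minimal sketch of that standard mapping (a hypothetical helper for illustration; MQLoss handles this internally):

```python
def level_to_quantiles(levels):
    """Map prediction-interval levels to the quantiles that bound them.

    An X% interval is bounded by the (1 - X/100)/2 and 1 - (1 - X/100)/2
    quantiles; the median (0.5) is included for the point forecast.
    """
    quantiles = {0.5}
    for level in levels:
        alpha = (100 - level) / 200  # tail mass on each side
        quantiles.update((alpha, 1 - alpha))
    return sorted(quantiles)
```

With the default levels [80, 90], this yields the quantiles 0.05, 0.1, 0.5, 0.9, and 0.95.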
jmoralez commented on 2024-02-22T15:52:42Z ----------------------------------------------------------------
Please use the following here:
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import rmse
The evaluate function is very similar to accuracy, and we're going to remove the metrics module from datasetsforecast in the next release.
MMenchero commented on 2024-02-28T02:47:09Z ----------------------------------------------------------------
All suggested changes have been implemented.
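For reference, the rmse metric suggested above computes the standard root mean squared error per series. A plain-Python sketch of the metric itself (an illustrative stand-in, not the utilsforecast implementation):

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error over one series.
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```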
jmoralez commented on 2024-02-28T16:58:28Z ----------------------------------------------------------------
Should be utilsforecast at the end here.
MMenchero commented on 2024-02-29T03:33:22Z ----------------------------------------------------------------
thanks, missed that one
jmoralez commented on 2024-02-28T16:58:29Z ----------------------------------------------------------------
Line #1. cv_df = cv_df.reset_index()
I think it'd be better to set os.environ['NIXTLA_ID_AS_COL'] = '1' at the top of the notebook to avoid having to do this.
TIL, that's pretty useful
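The suggestion amounts to setting the environment variable at the top of the notebook, before the Nixtla imports, e.g.:

```python
import os

# Set before importing the Nixtla libraries so that methods such as
# cross_validation return unique_id as a regular column instead of the
# index, removing the need for reset_index() afterwards.
os.environ['NIXTLA_ID_AS_COL'] = '1'
```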
jmoralez commented on 2024-02-29T15:27:21Z ----------------------------------------------------------------
Can you please set step_size=horizon here so that there's only one prediction per timestamp? Otherwise the plot at the bottom looks weird because there are several predictions for each timestamp. Also feel free to open an issue in utilsforecast to support plotting the results from CV if you think it would be useful.
MMenchero commented on 2024-03-01T01:58:45Z ----------------------------------------------------------------
Got it, and I'll open the issue in utilsforecast since I think it'll be very useful.
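To see why step_size=horizon yields exactly one prediction per timestamp, here is a small hypothetical counter over the test period (an illustration of the windowing logic, not library code):

```python
from collections import Counter

def predictions_per_timestamp(n_test, horizon, step_size):
    """Count how many CV windows forecast each timestamp in a test period."""
    counts = Counter()
    start = 0
    while start + horizon <= n_test:
        for t in range(start, start + horizon):
            counts[t] += 1
        start += step_size
    return counts
```

With step_size equal to the horizon every timestamp is forecast exactly once; with a smaller step_size the windows overlap and some timestamps get multiple predictions, which is what made the plot look odd.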
jmoralez commented on 2024-02-29T15:27:22Z ----------------------------------------------------------------
I missed the last part here; we should remove it ("We also need to reset its index to include the column with the unique ids").
MMenchero commented on 2024-03-01T01:59:26Z ----------------------------------------------------------------
I also missed it, but it's now deleted.
cchallu commented on 2024-02-29T19:54:12Z ----------------------------------------------------------------
Can we remove this equation to keep it simpler? Maybe add a callout box (see other tutorials) with this information and a link to more details.
MMenchero commented on 2024-03-01T04:31:53Z ----------------------------------------------------------------
sure, I added a callout box and kept it simple in the main text.
cchallu commented on 2024-02-29T19:54:13Z ----------------------------------------------------------------
Sorry for not saying it earlier, but I strongly suggest switching to NHITS and LSTM. The Auto models add a layer of complexity, because we are not explaining the validation set. We can use NHITS and LSTM with their default parameters as well.
MMenchero commented on 2024-03-01T04:36:51Z ----------------------------------------------------------------
I dropped the auto models, but in order to keep Capi's suggestion of setting step_size=horizon so that we only have one prediction per timestamp, I had to remove the LSTM since recurrent models don't support step_size>1. Instead, I'm using the MLP and NBEATS models.
cchallu commented on 2024-02-29T19:54:14Z ----------------------------------------------------------------
Can we add the plot with the sliding windows from the predict insample tutorial? And take the explanation from the predict insample tutorial as well:
With the step_size parameter you can specify the step size between consecutive windows used to produce the forecasts. In this example we will set step_size=horizon to produce non-overlapping forecasts.
The following diagram shows how the forecasts are produced based on the step_size parameter and the h (horizon) of the model. In the diagram we set step_size=2 and h=4.
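The diagram's setup (step_size=2, h=4) can also be reproduced numerically with a small hypothetical helper that lists where each forecast window begins within the test period (illustration only, not library code):

```python
def window_starts(n_test, horizon, step_size):
    # Offsets within the test period where each forecast window of
    # length `horizon` begins; windows overlap when step_size < horizon.
    starts = []
    t = 0
    while t + horizon <= n_test:
        starts.append(t)
        t += step_size
    return starts
```

For an 8-step test period, step_size=2 with h=4 gives overlapping windows starting at offsets 0, 2, and 4, while step_size=4 gives the non-overlapping pair 0 and 4.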
cchallu commented on 2024-02-29T19:54:15Z ----------------------------------------------------------------
I think we need an additional paragraph or even a plot. The main goal of the tutorial is explaining cross-validation, so we need to provide all the possible details and explanation. For example:
"With the current parameters, cross-validation will look like this:" (and add a plot with an example of a time series and the three windows at the end, and mention that the model will be trained with info prior to timestamp x, and so on).
MMenchero commented on 2024-03-01T06:44:17Z ----------------------------------------------------------------
I added an additional paragraph + plots at the end to further clarify the concept. It's at the end because I need to rename a column to make the plots and that is done in section 5.
cchallu commented on 2024-02-29T19:54:15Z ----------------------------------------------------------------
Same as before, remove the equation.
Sure, I added a callout box and kept it simple in the main text.
This is part of the new docs for neuralforecast.