unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0
7.91k stars, 857 forks

Darts Examples- NBeats #573

Closed (akbaramed closed this issue 2 years ago)

akbaramed commented 2 years ago

Hello,

I would like some clarification on the N-BEATS example page (https://unit8co.github.io/darts/examples/08-NBEATS-examples.html).

I am using stocks data and following steps from the example for better understanding.

Observation: In Cell 5, you impute missing values and scale the full series first, and then in Cell 6 you split the series into train and val after a certain date. Question: won't this introduce data leakage? Shouldn't we split first and then scale, so that the scaler is fitted on the training part only?

Can the community please let me know their views?

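My workaround:
###########
from darts import TimeSeries
from darts.dataprocessing.transformers import Scaler

# df is my stocks DataFrame with 'Date' and 'Adj Close' columns
series = TimeSeries.from_dataframe(df, 'Date', 'Adj Close')
scaler = Scaler()
# fit the scaler on the training part only (everything except the last 15 points)
train = scaler.fit_transform(series[:-15])
# reuse the fitted scaler on the validation part (last 15 points)
val = scaler.transform(series[-15:])
############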

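Then in Cell 9, where the "series" variable is expected, I apply the logic below to rebuild the full scaled series from the two parts:
#############
import pandas as pd

# stack the scaled train and val parts back into one DataFrame
series_ = pd.concat([train.pd_series().reset_index(), val.pd_series().reset_index()])
series_.columns = ['Date', 'Adj Close']
# convert back to a TimeSeries covering the full date range
series_ = TimeSeries.from_dataframe(series_, 'Date', 'Adj Close')
############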

FYI there is a difference in the MAPE values of the 2 approaches.
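When comparing MAPE across two scaling approaches, it may help to keep in mind that MAPE is not invariant to min-max scaling: the same forecast scores differently on scaled and unscaled values. The following is a minimal sketch in plain Python with made-up numbers; `mape`, `min_max`, and `inverse_min_max` are hypothetical helpers for illustration, not darts functions, and the train range of 50 to 100 is an assumption.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def min_max(values, lo, hi):
    """Scale values with fixed min-max parameters (lo maps to 0, hi to 1)."""
    return [(v - lo) / (hi - lo) for v in values]


def inverse_min_max(values, lo, hi):
    """Map scaled values back to the original units."""
    return [v * (hi - lo) + lo for v in values]


# Made-up validation values and forecasts, in original units
actual = [110.0, 120.0]
forecast = [100.0, 126.0]

# Assumed scaler parameters learned from a training range of 50..100
lo, hi = 50.0, 100.0
actual_scaled = min_max(actual, lo, hi)
forecast_scaled = min_max(forecast, lo, hi)

mape_raw = mape(actual, forecast)
mape_scaled = mape(actual_scaled, forecast_scaled)

# Mapping back to original units recovers the raw MAPE
mape_restored = mape(
    inverse_min_max(actual_scaled, lo, hi),
    inverse_min_max(forecast_scaled, lo, hi),
)

print(f"raw: {mape_raw:.2f}%, scaled: {mape_scaled:.2f}%, restored: {mape_restored:.2f}%")
```

Because the scaled score depends on the scaler parameters, computing the metric on inverse-transformed values is one way to make the two approaches comparable on the same footing.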

Any comments suggestions/explanations are welcome

Thanks

dennisbader commented 2 years ago

Hi @akbaramed and thanks for writing. You are correct about the scaling. The example notebooks on forecasting models are primarily meant to show how you can use our models. We sometimes use shortcuts which ignore best practices on data processing.
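To illustrate the scaling point, here is a minimal sketch in plain Python of why fitting a min-max scaler on the full series leaks information from the validation period. The numbers are made up, and `fit_min_max` and `transform` are hypothetical stand-ins for a scaler's fit and transform steps, not the darts API.

```python
def fit_min_max(values):
    """Return the (min, max) parameters a min-max scaler would learn."""
    return min(values), max(values)


def transform(values, params):
    """Apply min-max scaling with previously fitted parameters."""
    lo, hi = params
    return [(v - lo) / (hi - lo) for v in values]


# Made-up series: the last 3 points play the role of the validation set
series = [10.0, 12.0, 15.0, 20.0, 18.0, 25.0, 30.0, 28.0]
train, val = series[:-3], series[-3:]

# Leaky: parameters computed on the whole series, including validation values
leaky_params = fit_min_max(series)
# Leak-free: parameters computed on train only, then reused for val
clean_params = fit_min_max(train)

leaky_train = transform(train, leaky_params)
clean_train = transform(train, clean_params)
clean_val = transform(val, clean_params)

print(leaky_params)      # (10.0, 30.0) -> the future maximum is already known
print(clean_params)      # (10.0, 20.0)
print(max(leaky_train))  # 0.5 -> training data never reaches 1.0
print(clean_val)         # [1.5, 2.0, 1.8] -> validation may fall outside [0, 1]
```

In the leaky variant the training values top out at 0.5, which tells the model that higher values are still to come. In the leak-free variant the validation values can exceed 1, which is expected when the scaler has only seen the training range.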

@hrzn we should maybe consider being more rigorous there, as it's not the first time someone has mentioned this.

akbaramed commented 2 years ago

Very grateful for your comment; closing the issue.