Nixtla / neuralforecast

Scalable and user friendly neural 🧠 forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

What is the purpose of unique_id during prediction? Different unique_id values produce the same prediction results. #1162

Closed: sunggc closed this issue 4 days ago

sunggc commented 5 days ago

What happened + What you expected to happen

    nf = NeuralForecast(models=models, freq='M')
    nf.fit(df=Y_df)

    df_HUFL = Y_df[Y_df['unique_id'] == 'HUFL'].tail(48)

    preds = nf.predict(df_HUFL)
    print(preds.head(12))

    df_HULL = df_HUFL.copy()
    df_HULL['unique_id'] = 'HULL'

    preds_HULL = nf.predict(df_HULL)
    print(preds_HULL.head(12))

The values in preds are the same as those in preds_HULL, even though the two inputs have different unique_id values.

Versions / Dependencies

1.7.5

Reproduction script

    import pandas as pd
    from datasetsforecast.long_horizon import LongHorizon
    from neuralforecast import NeuralForecast
    from neuralforecast.models import LSTM, NHITS

    # Change this to your own data to try the model
    Y_df, _, _ = LongHorizon.load(directory='./', group='ETTm2')
    print(Y_df.head())

    # Print the distinct unique_id values in Y_df
    print(Y_df['unique_id'].unique())

    horizon = 12
    Y_df['ds'] = pd.to_datetime(Y_df['ds'])

    # Try different hyperparameters to improve accuracy.
    models = [
        LSTM(h=horizon,                 # Forecast horizon
             max_steps=100,             # Number of steps to train
             scaler_type='standard',    # Type of scaler to normalize data
             encoder_hidden_size=64,    # Defines the size of the hidden state of the LSTM
             decoder_hidden_size=64),   # Defines the number of hidden units of each layer of the MLP decoder
        NHITS(h=horizon,                      # Forecast horizon
              input_size=2 * horizon,         # Length of input sequence
              max_steps=100,                  # Number of steps to train
              n_freq_downsample=[2, 1, 1]),   # Downsampling factors for each stack output
    ]
    nf = NeuralForecast(models=models, freq='M')
    nf.fit(df=Y_df)

    # Predict from the last 48 observations of the HUFL series
    df_HUFL = Y_df[Y_df['unique_id'] == 'HUFL'].tail(48)

    preds = nf.predict(df_HUFL)
    print(preds.head(12))

    # Same values, relabeled with a different unique_id
    df_HULL = df_HUFL.copy()
    df_HULL['unique_id'] = 'HULL'

    preds_HULL = nf.predict(df_HULL)
    print(preds_HULL.head(12))

Issue Severity

None

elephaint commented 4 days ago

Thanks for using NeuralForecast!

As stated in the documentation: Y_df is a dataframe with three columns: unique_id with a unique identifier for each time series, a column ds with the datestamp and a column y with the values of the series.
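For illustration, the three-column long format described above might look like the following minimal sketch. The values and dates are made up for the example and are not taken from the ETTm2 dataset; only the column names unique_id, ds and y come from the description.

```python
import pandas as pd

# Hypothetical toy data: two series stacked in one long-format dataframe.
Y_df = pd.DataFrame({
    "unique_id": ["HUFL"] * 3 + ["HULL"] * 3,                       # which series each row belongs to
    "ds": list(pd.date_range("2020-01-01", periods=3, freq="D")) * 2,  # datestamps
    "y": [5.8, 5.7, 5.2, 2.0, 2.1, 1.7],                            # observed values
})

# unique_id is the key that separates the rows into individual series.
series_lengths = Y_df.groupby("unique_id").size().to_dict()
print(series_lengths)
```

Each distinct unique_id value marks one time series, so grouping by that column recovers the individual series from the stacked rows.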

With neural models you typically train a single model for multiple time series. If you only have a single time series, the value of unique_id will not matter.
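The behavior reported in this issue can be sketched with a toy stand-in for a single shared model. This is not NeuralForecast's implementation: `toy_global_forecast` is a hypothetical helper, and the "model" is simply the mean of the last few observations. The sketch assumes, as in the reproduction above, that the identifier is used only to group rows into series and is never passed to the model as an input.

```python
from collections import defaultdict


def toy_global_forecast(rows, h=2, input_size=3):
    """Toy shared model (hypothetical, for illustration only).

    rows: iterable of (unique_id, ds, y) tuples.
    Returns {unique_id: list of h forecasts}.
    """
    # Step 1: unique_id is used only to split the rows into separate series.
    series = defaultdict(list)
    for uid, ds, y in rows:
        series[uid].append((ds, y))

    forecasts = {}
    for uid, points in series.items():
        points.sort()  # order each series by timestamp
        window = [y for _, y in points[-input_size:]]
        # Step 2: the same function is applied to every series.
        # It sees only the values in the window, never the identifier.
        level = sum(window) / len(window)
        forecasts[uid] = [level] * h
    return forecasts


values = [1.0, 2.0, 3.0, 4.0]
hufl = [("HUFL", t, y) for t, y in enumerate(values)]
relabeled = [("HULL", t, y) for _, t, y in hufl]          # same values, new label
other = [("HULL", t, 10 * y) for t, y in enumerate(values)]  # different values

f_hufl = toy_global_forecast(hufl)
f_relabeled = toy_global_forecast(relabeled)
f_other = toy_global_forecast(other)

print(f_hufl["HUFL"] == f_relabeled["HULL"])  # True: only the label changed
print(f_hufl["HUFL"] == f_other["HULL"])      # False: the values changed
```

Under that assumption, relabeling a series changes only the key in the output, while changing the observed values changes the forecast.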