Nixtla / neuralforecast

Scalable and user friendly neural :brain: forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0
3.09k stars, 357 forks

[<Library component: Model|Core|etc...>] static exogenous variables not found in input dataset #1194

Open saad1912 opened 1 week ago

saad1912 commented 1 week ago

What happened + What you expected to happen

Hi everyone,

I'm running into an error when fitting a time series model, and I'm hoping for some guidance. Here’s the message I'm getting:

Error fitting model for unique_id 002-303|BRP900_4979: {'quarter', 'site_id', 'year', 'month', 'product_id', 'month_of_Q'} static exogenous variables not found in input dataset
Error fitting model for unique_id 002-439|BRP900_4979: {'quarter', 'site_id', 'year', 'month', 'product_id', 'month_of_Q'} static exogenous variables not found in input dataset
...

I get this error for every time series I try to fit.

Versions / Dependencies

from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import HuberLoss
import pandas as pd
import numpy as np

Reproduction script

nhits_model = NHITS(
    input_size=training_period,
    h=forecast_period,
    hist_exog_list=[
        'quantity|lag_1',
        'quantity|lag_2',
        'quantity|lag_3',
        'quantity|lag_6',
        'quantity|lag_9',
        'quantity|lag_12',
        'quantity|rmean_Q',
        'quantity|rmean_Y',
        'quantity|csum',
        'quantity|csum_Y',
        'quantity|csum_Q',
    ],
    stat_exog_list=['quarter', 'year', 'month_of_Q', 'month', 'site_id', 'product_id'],
    scaler_type='robust',
    learning_rate=0.001,
    loss=HuberLoss(delta=1.0),
)

nf = NeuralForecast(models=[nhits_model], freq=time_bound)

all_forecasts = []

for unique_id in train_data_neural['unique_id'].unique():
    unique_train_data = train_data_neural[train_data_neural['unique_id'] == unique_id]

    if len(unique_train_data) < training_period:
        print(f"Insufficient training data for unique_id {unique_id}. Skipping this ID.")
        continue

    try:
        nf.fit(unique_train_data)
    except Exception as e:
        print(f"Error fitting model for unique_id {unique_id}: {e}")
        continue

    unique_test_data = test_data_neural[test_data_neural['unique_id'] == unique_id]

    if unique_test_data.empty:
        print(f"No test data available for unique_id {unique_id}. Skipping this ID.")
        continue

    try:
        forecasts = nf.predict(unique_test_data)
    except Exception as e:
        print(f"Error predicting for unique_id {unique_id}: {e}")
        continue

    if forecasts.empty:
        print(f"No forecasts generated for unique_id {unique_id}. Skipping this ID.")
        continue

    forecasts['unique_id'] = unique_id
    all_forecasts.append(forecasts)

if all_forecasts:
    all_forecasts_df = pd.concat(all_forecasts, ignore_index=True)
else:
    all_forecasts_df = pd.DataFrame()
print("Forecasting completed.")

Issue Severity

High: It blocks me from completing my task.

marcopeix commented 4 days ago

Hello, sorry for the late reply.

To use static exogenous features, you need to pass them as a separate DataFrame. Here's a working example with simulated data:

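import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

# Create a dataset with historical exogenous features
n_rows = 100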
data = pd.DataFrame({
    'unique_id': [1] * n_rows,
    'ds': pd.date_range(start='2023-01-01', periods=n_rows, freq='D'),
    'y': np.random.rand(n_rows),
    'hist_exog1': np.random.rand(n_rows),
    'hist_exog2': np.random.rand(n_rows)
})

# Create a DataFrame with static exogenous features

static_df = pd.DataFrame({
    'unique_id': 1,
    'stat_exog': 3
}, index=range(1))

# Train your model

nhits = NHITS(h=10, 
              input_size=20, 
              hist_exog_list=['hist_exog1', 'hist_exog2'], 
              stat_exog_list=['stat_exog'], 
              max_steps=100)
nf = NeuralForecast(models=[nhits], freq='D')
nf.fit(df=data, static_df=static_df)
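
Before calling fit, a quick pre-check can surface the same problem that the error message in this issue reports. The sketch below is only an illustration in plain pandas: `missing_static_columns` is a hypothetical helper, not part of neuralforecast, and it assumes the error simply means that some names in `stat_exog_list` are absent from the columns of `static_df` (or that no `static_df` was passed at all).

```python
import pandas as pd


def missing_static_columns(stat_exog_list, static_df):
    """Return the names in stat_exog_list that static_df does not provide.

    Hypothetical helper for illustration; it does not call neuralforecast.
    """
    if static_df is None:
        # No static DataFrame was passed, so every static feature is missing.
        return set(stat_exog_list)
    return set(stat_exog_list) - set(static_df.columns)


# Same shape as the static_df in the example above: one row per unique_id.
example_static_df = pd.DataFrame({"unique_id": [1], "stat_exog": [3]})

ok = missing_static_columns(["stat_exog"], example_static_df)
not_passed = missing_static_columns(["site_id", "product_id"], None)
print(ok, not_passed)
```

An empty set means every static feature the model expects has a matching column; a non-empty set lists the names that would likely trigger the error.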

Remember that static features behave like categories and must be encoded as numbers. The static_df needs only one row per unique_id holding that series' static values; there is no need to repeat them for every time step.
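
If the static columns already live in the long-format training data, as in the original report, a static DataFrame can be derived from it. The following is a minimal pandas sketch under stated assumptions: `build_static_df` and the toy `long_df` are hypothetical, a column counts as static only when it takes a single value within every series, and non-numeric columns are turned into integer codes with `pd.factorize`. Columns that change over time within a series (in the report, probably `year`, `quarter`, `month` and `month_of_Q`) are returned separately, since they may fit better as time-varying exogenous inputs such as `futr_exog_list` than as static ones.

```python
import pandas as pd


def build_static_df(df, candidates, id_col="unique_id"):
    """Split candidate columns into static and time-varying ones and
    return (static_df, varying_cols). Hypothetical helper for illustration."""
    # Count distinct values of each candidate column within each series.
    n_unique = df.groupby(id_col)[candidates].nunique()
    static_cols = [c for c in candidates if (n_unique[c] == 1).all()]
    varying_cols = [c for c in candidates if c not in static_cols]

    # Keep one row per series with its static values.
    static_df = (
        df[[id_col] + static_cols]
        .drop_duplicates(subset=id_col)
        .reset_index(drop=True)
    )

    # Static features must be numeric, so encode labels as integer codes.
    for col in static_cols:
        if not pd.api.types.is_numeric_dtype(static_df[col]):
            static_df[col] = pd.factorize(static_df[col])[0]
    return static_df, varying_cols


# Toy long-format data: site_id is constant per series, month is not.
long_df = pd.DataFrame({
    "unique_id": ["a", "a", "b", "b"],
    "ds": pd.to_datetime(["2024-01-01", "2024-02-01"] * 2),
    "y": [1.0, 2.0, 3.0, 4.0],
    "site_id": ["S1", "S1", "S2", "S2"],
    "month": [1, 2, 1, 2],
})

static_df, varying_cols = build_static_df(long_df, ["site_id", "month"])
print(static_df)
print(varying_cols)
```

The resulting `static_df` could then be passed as `static_df=` to `nf.fit`, with `stat_exog_list` restricted to its non-id columns.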