Nixtla / neuralforecast

Scalable and user friendly neural :brain: forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

TSMixerx model batch size mismatch if number of unique_ids > 1024 #948

Closed: Ansgineo closed this issue 5 months ago

Ansgineo commented 6 months ago

What happened + What you expected to happen

First of all, thank you so much for implementing the TSMixer models! I ran into an issue when training the TSMixerx model on a custom dataset. The error occurs as soon as the number of unique ids in the data exceeds 1024. I then get this message:

RuntimeError: The size of tensor a (1024) must match the size of tensor b (1463) at non-singleton dimension 3

I assume 1024 is a threshold at which the data gets batched automatically, so that the batches no longer fit the model dimensions. In theory one can set the n_series parameter to 1024, but then the last batch throws an error when training or predicting. So this would only work if your number of time series happens to be divisible by 1024.
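To illustrate the divisibility point, here is a minimal sketch. It assumes (this is my guess above, not confirmed behaviour) that the series are processed in chunks of at most 1024. The helper chunk_sizes is hypothetical and not part of neuralforecast.

```python
def chunk_sizes(n_series, chunk=1024):
    """Split n_series into consecutive chunks of at most `chunk` series.

    Hypothetical helper for illustration only; the chunk size of 1024 is
    an assumption taken from the error message, not a documented constant.
    """
    full, rest = divmod(n_series, chunk)
    # `full` chunks of the maximum size, plus one smaller chunk if there is a remainder
    return [chunk] * full + ([rest] if rest else [])


# 1463 series (the count in the error message) leave a smaller last chunk
print(chunk_sizes(1463))
# A multiple of 1024 leaves no smaller chunk
print(chunk_sizes(2048))
```

With n_series=1024, only the second case would have chunks that all match the model dimension.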

The code example below clones the AirPassengers data until the number of series exceeds this threshold, which reproduces the error.

Versions / Dependencies

I am using Python 3.10 and installed the latest version of neuralforecast directly from GitHub last week. I am also using torch 2.2 with CUDA 12.1.

Reproduction script

import pandas as pd

from neuralforecast import NeuralForecast
from neuralforecast.utils import AirPassengersPanel, AirPassengersStatic
from neuralforecast.losses.pytorch import MAE
from neuralforecast.models import TSMixerx

Y_train_df = AirPassengersPanel[
    AirPassengersPanel.ds < AirPassengersPanel["ds"].values[-12]
].reset_index(drop=True)  # 132 train
Y_test_df = AirPassengersPanel[
    AirPassengersPanel.ds >= AirPassengersPanel["ds"].values[-12]
].reset_index(drop=True)  # 12 test


def inflate_df(df_in, multiplier=513):
    """Inflate the dataset by creating multiple series with the same data.

    Since there are two series in the original AirPassengers set, this
    creates 2 * multiplier series with the same data.
    """
    df_list = []
    for i in range(multiplier):
        df = df_in.copy(deep=True)
        # Append the copy index to the id so every copy is a distinct series
        df["unique_id"] += str(i)
        df_list.append(df)
    return pd.concat(df_list, ignore_index=True).reset_index(drop=True)


Y_train_df = inflate_df(Y_train_df)
n_series = Y_train_df["unique_id"].nunique()
print(f"Number of series: {n_series}")

model = TSMixerx(
    h=12,
    input_size=24,
    n_series=n_series,
    # stat_exog_list=['airline1'],
    futr_exog_list=["trend"],
    hist_exog_list=["y_[lag12]"],
    n_block=4,
    ff_dim=3,
    revin=True,
    scaler_type="standard",
    max_steps=200,
    early_stop_patience_steps=-1,
    val_check_steps=5,
    learning_rate=1e-3,
    loss=MAE(),
    valid_loss=MAE(),
    batch_size=32,
)

fcst = NeuralForecast(models=[model], freq="ME")
fcst.fit(df=Y_train_df, static_df=None, val_size=12)
forecasts = fcst.predict(futr_df=Y_test_df)

Issue Severity

High: It blocks me from completing my task.

afogarty85 commented 5 months ago

Nice summary -- also looking forward to this fix!

elephaint commented 5 months ago

Thanks for reporting, I'll have a look this week!

elephaint commented 5 months ago

@Ansgineo So far it seems the issue is within the validation step; if you remove val_size=12, your example will fit successfully.

Not a proper solution, but it may help you get going in the meantime.

Ansgineo commented 5 months ago

Thank you so much for working on this! After adding Y_test_df = inflate_df(Y_test_df) and removing val_size=12, it does indeed fit, but it fails when predicting.
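For context on why Y_test_df needs inflating too, here is a small sketch of the id bookkeeping. It assumes the two base ids in the AirPassengers panel are "Airline1" and "Airline2". The helper inflate_ids is hypothetical and only mirrors the suffixing done in inflate_df.

```python
# Assumed base ids of the two original series (not verified against the dataset)
base_ids = ["Airline1", "Airline2"]


def inflate_ids(ids, multiplier=513):
    """Mirror inflate_df: append each copy index to every base id."""
    return [f"{uid}{i}" for i in range(multiplier) for uid in ids]


train_ids = set(inflate_ids(base_ids))  # ids the model was fitted on
test_ids = set(base_ids)                # ids in the un-inflated Y_test_df

print(f"Series in training data: {len(train_ids)}")
print(f"Training ids missing from the test frame: {len(train_ids - test_ids)}")
```

Without the extra inflate_df call, none of the training ids appear in the test frame, since every inflated id carries a numeric suffix.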

elephaint commented 5 months ago

Thanks - I think I found the culprit and have proposed a fix in #962