Nixtla / neuralforecast

Scalable and user friendly neural :brain: forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0
3.1k stars 357 forks

In-sample prediction with predict_insample does not work #1056

Open JKYtydt opened 4 months ago

JKYtydt commented 4 months ago

What happened + What you expected to happen

In-sample prediction cannot be used, which blocks me from completing my next tasks.

test.py 75 <module>
Y_hat_insample = nf.predict_insample(step_size=12)

core.py 1213 predict_insample
fcsts[:, col_idx : (col_idx + output_length)] = model_fcsts

ValueError:
could not broadcast input array from shape (17388,1) into shape (17412,1)

Versions / Dependencies

Python 3.9.19 neuralforecast 1.7.3

Reproduction script


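import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import iTransformer
from neuralforecast.losses.pytorch import MSE
# Assumption: mae/mse come from utilsforecast, given the models=/id_col= arguments used below
from utilsforecast.losses import mae, mse

train_df = pd.read_csv('ETTh1.csv', encoding='utf-8')
train_df['ds'] = pd.to_datetime(train_df['ds'])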
model = iTransformer(h=12,
                     input_size=24,
                     n_series=1,
                     hidden_size=128,
                     n_heads=8,
                     e_layers=2,
                     d_layers=1,
                     d_ff=4,
                     factor=1,
                     dropout=0.1,
                     use_norm=True,
                     loss=MSE(),
                     valid_loss=MSE(),
                     early_stop_patience_steps=3,
                     batch_size=24,
                     max_steps=5)

nf = NeuralForecast(
    models=[model],
    freq='H'
)

nf.fit(df=train_df, val_size=12)

# nf.save(path=save_path,
#         model_index=None,
#         overwrite=True,
#         save_dataset=True)
print('Model saved')
Y_hat_insample = nf.predict_insample(step_size=12)
print('Y_hat_insample', Y_hat_insample.head(), Y_hat_insample.shape)
Y_hat_insample = pd.DataFrame(Y_hat_insample)

Y_hat_insample['unique_id'] = Y_hat_insample.index
# Use distinct names so the mae/mse functions are not shadowed by their results
mae_df = mae(Y_hat_insample, models=['iTransformer'], id_col='unique_id')
mse_df = mse(Y_hat_insample, models=['iTransformer'], id_col='unique_id')
validation_df = pd.DataFrame(data={'MAE': mae_df['iTransformer'], 'MSE': mse_df['iTransformer']})

Issue Severity

High: It blocks me from completing my task.

yenhochen commented 4 months ago

I am also having the same issue for my own dataset and am not sure why

elephaint commented 4 months ago

Do you also experience this issue with other models? (e.g. TSMixer? or DLinear?)

yenhochen commented 4 months ago

yes, I tried DLinear and NBeatsx with various step_sizes

lastsummerx commented 4 months ago

I have the same problem with predict_insample of iTransformer.

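import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import iTransformer

# One year of hourly data: a noisy sine wave for a single series
idx = pd.date_range('2020-01-01', '2021-01-01', freq='h', inclusive='left')
data = np.sin(np.arange(idx.size)) + np.random.normal(size=idx.size)
df = pd.DataFrame({'unique_id': [1] * idx.size, 'ds': idx, 'y': data})
horizon = 8
models = [iTransformer(h=horizon, input_size=56, n_series=1)]
nf = NeuralForecast(models=models, freq='h')
nf.fit(df, val_size=horizon)
y_hat_df = nf.predict_insample(step_size=horizon)  # raises the ValueError shown below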

The code above reproduces the problem. This is the error I get:

ValueError                                Traceback (most recent call last)
Cell In[44], line 8
      6 nf = NeuralForecast(models=models, freq='h')
      7 nf.fit(df, val_size=horizon)
----> 8 y_hat_df = nf.predict_insample(step_size=horizon)

File lib\site-packages\neuralforecast\core.py:1213, in NeuralForecast.predict_insample(self, step_size)
   1211 # Append predictions in memory placeholder
   1212 output_length = len(model.loss.output_names)
-> 1213 fcsts[:, col_idx : (col_idx + output_length)] = model_fcsts
   1214 col_idx += output_length
   1215 model.set_test_size(test_size=test_size)  # Set original test_size

ValueError: could not broadcast input array from shape (8728,1) into shape (8784,1)

python 3.10.14 neuralforecast 1.7.3 pytorch 2.2.2 pytorch-lightning 2.3.3 numpy 1.26.4 pandas 2.2.2
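
One observation from the numbers reported so far: in both reports that state their settings, the number of missing rows equals the model's input_size. A small sketch of that arithmetic, using only the shapes quoted in the two tracebacks above (the dictionary layout and the helper name missing_rows are illustrative and not part of neuralforecast):

```python
# Shapes copied from the two tracebacks above; the field names are illustrative.
reports = [
    {"input_size": 24, "produced_rows": 17388, "expected_rows": 17412},
    {"input_size": 56, "produced_rows": 8728, "expected_rows": 8784},
]


def missing_rows(report):
    """Rows the forecast placeholder expects that the model did not produce."""
    return report["expected_rows"] - report["produced_rows"]


for report in reports:
    gap = missing_rows(report)
    print(
        f"input_size={report['input_size']}, missing rows={gap}, "
        f"equals input_size: {gap == report['input_size']}"
    )
```

This is only a pattern in the reported numbers, not a diagnosis. It would be consistent with the predictions for the earliest timestamps, where less than input_size of history is available, being dropped.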

lastsummerx commented 4 months ago

I found PR #715, which addresses a similar problem. After modifying lines 201 to 203 of _base_multivariate.py, the issue was resolved. I changed them from

predict_step_size = self.predict_step_size
cutoff = -self.input_size - self.test_size
temporal = batch["temporal"][:, :, cutoff:]

to

initial_input = temporal.shape[-1] - self.test_size
if (
    initial_input <= self.input_size
):  # There is not enough data to predict first timestamp
    padder_left = nn.ConstantPad1d(
        padding=(self.input_size - initial_input, 0), value=0
    )
    temporal = padder_left(temporal)
predict_step_size = self.predict_step_size
cutoff = -self.input_size - self.test_size
temporal = temporal[:, :, cutoff:]
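
To illustrate what this change does, here is a minimal pure-Python sketch of the pad-then-slice logic. It is a simplification under stated assumptions: a plain list stands in for the last axis of the tensor, list concatenation stands in for nn.ConstantPad1d, and the function name pad_and_cut and its arguments are hypothetical, not part of neuralforecast.

```python
def pad_and_cut(series, input_size, test_size, pad_value=0.0):
    """Left-pad `series` when too little history precedes the test span,
    then keep the trailing input_size + test_size points."""
    series = list(series)
    # Number of points available before the test span starts
    initial_input = len(series) - test_size
    if initial_input <= input_size:
        # Not enough history for the first timestamp: pad on the left
        series = [pad_value] * (input_size - initial_input) + series
    cutoff = -input_size - test_size
    return series[cutoff:]


short = list(range(1, 11))  # 10 points, all of them in the test span
window = pad_and_cut(short, input_size=4, test_size=10)
print(len(window))       # 14: four padded values plus the 10 points
print(len(short[-14:]))  # 10: slicing alone cannot supply the missing history
```

Without the padding step, the negative slice silently returns fewer than input_size + test_size points, so fewer windows can be built from it.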

elephaint commented 3 months ago

@lastsummerx Thanks for that, it makes sense. We already have a PR that implements that, so we might do it there.

carusyte commented 3 months ago

Same issue here using the latest codebase from the main branch. I'm using TimeMixer, though.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[16], line 1
----> 1 pis = nf.predict_insample()

File ~/ProgramData/git/neuralforecast/neuralforecast/core.py:1285, in NeuralForecast.predict_insample(self, step_size)
   1283 # Append predictions in memory placeholder
   1284 output_length = len(model.loss.output_names)
-> 1285 fcsts[:, col_idx : (col_idx + output_length)] = model_fcsts
   1286 col_idx += output_length
   1287 model.set_test_size(test_size=test_size)  # Set original test_size

ValueError: could not broadcast input array from shape (1150,1) into shape (1250,1)

carusyte commented 2 months ago

Hi there, the issue still persists. Would PR #1023 fix it? I notice that the PR might be failing some sanity checks. Is that the reason it is still not merged?