Nixtla / neuralforecast

Scalable and user friendly neural :brain: forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

HINT Predictions Not Created with nf.fit() and nf.predict() but with nf.cross_validation() #1125

Open breadwall opened 1 month ago

breadwall commented 1 month ago

What happened + What you expected to happen

I walked through the hierarchical forecasting tutorial, and fitting then predicting does not produce columns labeled HINT. As a result, the plotting step raises an error.

Additionally, because the predictions come from the underlying model (NHITS or TFT) rather than from HINT, they are not reconciled across the hierarchy. Tutorial: https://nixtlaverse.nixtla.io/neuralforecast/docs/tutorials/hierarchical_forecasting.html or https://github.com/Nixtla/neuralforecast/blob/main/nbs/docs/tutorials/09_hierarchical_forecasting.ipynb

However, when I follow the two test cases in the source code, which use nf.cross_validation(), HINT columns are produced. These HINT predictions are reconciled: the sums agree across hierarchy levels and combinations. https://github.com/Nixtla/neuralforecast/blob/main/nbs/models.hint.ipynb
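To make "reconciled" concrete, here is a minimal sketch of the coherence property being checked, using the same summing matrix S as the reproduction script. The helper name `is_coherent` and the numbers are hypothetical and only for illustration; this is not the library's implementation, just the condition that the forecasts at every level equal S times the bottom-level forecasts.

```python
import numpy as np

# Summing matrix from the reproduction script: rows are
# Top, Mid1, Mid2, Bottom1..Bottom4; columns are the four bottom series.
S = np.array([[1., 1., 1., 1.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])

def is_coherent(y_hat, S, rtol=1e-6):
    """Return True if the all-level forecasts equal S @ (bottom-level forecasts).

    Assumes y_hat is ordered like the rows of S, with the bottom series last.
    """
    n_bottom = S.shape[1]
    bottom = y_hat[-n_bottom:]
    return bool(np.allclose(S @ bottom, y_hat, rtol=rtol))

# Hypothetical numbers, for illustration only.
bottom = np.array([10., 20., 30., 40.])
coherent = S @ bottom          # aggregating bottom forecasts is coherent by construction
incoherent = coherent.copy()
incoherent[0] += 5.0           # Top no longer equals the sum of its children

print(is_coherent(coherent, S), is_coherent(incoherent, S))
```

Independent per-series forecasts (what the fit/predict path appears to return here) will generally fail this check, while the HINT output from nf.cross_validation() passes it.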

Versions / Dependencies

neuralforecast==1.7.4, Python 3.10.7

Reproduction script

Using code from https://github.com/Nixtla/neuralforecast/blob/main/nbs/models.hint.ipynb

from neuralforecast import NeuralForecast
from neuralforecast.losses.pytorch import GMM  # needed for the GMM loss used below
from neuralforecast.models import NHITS, HINT
import pandas as pd

from typing import Optional

import numpy as np
import torch

def sort_df_hier(Y_df, S_df):
    # NeuralForecast core sorts unique_id lexicographically by default;
    # this function reorders Y_df so it matches the S_df and Y_hat_df order.
    Y_df.unique_id = Y_df.unique_id.astype('category')
    Y_df.unique_id = Y_df.unique_id.cat.set_categories(S_df.index)
    Y_df = Y_df.sort_values(by=['unique_id', 'ds'])
    return Y_df

# -----Create synthetic dataset-----
np.random.seed(123)
train_steps = 20
num_levels = 7
level = np.arange(0, 100, 0.1)
qs = [[50-lv/2, 50+lv/2] for lv in level]
quantiles = np.sort(np.concatenate(qs)/100)

levels = ['Top', 'Mid1', 'Mid2', 'Bottom1', 'Bottom2', 'Bottom3', 'Bottom4']
unique_ids = np.repeat(levels, train_steps)

S = np.array([[1., 1., 1., 1.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])

S_dict = {col: S[:, i] for i, col in enumerate(levels[3:])}
S_df = pd.DataFrame(S_dict, index=levels)

ds = pd.date_range(start='2018-03-31', periods=train_steps, freq='Q').tolist() * num_levels
# Create Y_df
y_lists = [S @ np.random.uniform(low=100, high=500, size=4) for i in range(train_steps)]
y = [elem for tup in zip(*y_lists) for elem in tup]
Y_df = pd.DataFrame({'unique_id': unique_ids, 'ds': ds, 'y': y})
Y_df = sort_df_hier(Y_df, S_df)

# ------Fit/Predict HINT Model------
# Model + Distribution + Reconciliation
nhits = NHITS(h=4,
              input_size=4,
              loss=GMM(n_components=2, quantiles=quantiles, num_samples=len(quantiles)),
              max_steps=5,
              early_stop_patience_steps=2,
              val_check_steps=1,
              scaler_type='robust',
              learning_rate=1e-3)
model = HINT(h=4, model=nhits, S=S, reconciliation='BottomUp')

# Fit and Predict
nf = NeuralForecast(models=[model], freq='Q')
#forecasts = nf.cross_validation(df=Y_df, val_size=4, n_windows=1)

## ----- Using fit & predict ----
nf.fit(df=Y_df, val_size=4)
forecasts = nf.predict()

## ----- No HINT columns just NHITS
print(forecasts.columns)

## ----- Rename Column to HINT to show no coherence
forecasts = forecasts.rename(columns={'NHITS':'HINT'})

# ---Check Hierarchical Coherence---
parent_children_dict = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
# check coherence for each horizon time step
for _, df in forecasts.groupby('ds'):
    hint_mean = df['HINT'].values
    for parent_idx, children_list in parent_children_dict.items():
        parent_value = hint_mean[parent_idx]
        children_sum = hint_mean[children_list].sum()
        np.testing.assert_allclose(children_sum, parent_value, rtol=1e-6)

Issue Severity

Medium:

breadwall commented 1 month ago

I've found that if I go back to neuralforecast version 1.6.1 and then re-add the changes to hint.py from this commit (https://github.com/Nixtla/neuralforecast/pull/964/files), I get properly reconciled forecasts using fit then predict, with the HINT column produced.
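For anyone who cannot downgrade, one possible stopgap is to apply bottom-up reconciliation by hand to the unreconciled output of nf.predict(). The sketch below is an assumption-laden illustration, not the HINT implementation: `bottom_up_reconcile` is a hypothetical helper, the toy frame stands in for the real forecasts, and it assumes that within each `ds` the rows follow the row order of S with the bottom-level series last. It only handles a single point-forecast column, not quantile columns.

```python
import numpy as np
import pandas as pd

S = np.array([[1., 1., 1., 1.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])

def bottom_up_reconcile(forecasts, S, col):
    """Overwrite `col` with S @ (bottom-level forecasts) for each timestamp.

    Hypothetical helper. Assumes rows within each `ds` are ordered like
    the rows of S, with the bottom-level series last.
    """
    n_bottom = S.shape[1]
    out = forecasts.copy()
    values = out[col].to_numpy(dtype=float).copy()
    # GroupBy.indices maps each timestamp to integer row positions,
    # so this works even when the frame's index has duplicate labels.
    for _, pos in out.groupby('ds').indices.items():
        values[pos] = S @ values[pos][-n_bottom:]
    out[col] = values
    return out

# Toy stand-in for the forecasts frame (made-up values), sorted by
# unique_id then ds.
levels = ['Top', 'Mid1', 'Mid2', 'Bottom1', 'Bottom2', 'Bottom3', 'Bottom4']
dates = pd.to_datetime(['2023-03-31', '2023-06-30'])
toy = pd.DataFrame({
    'unique_id': np.repeat(levels, len(dates)),
    'ds': np.tile(dates, len(levels)),
    'NHITS': np.arange(14, dtype=float),
})

reconciled = bottom_up_reconcile(toy, S, 'NHITS')
print(reconciled)
```

After this step the coherence check from the reproduction script should pass on the reconciled column, since every aggregate is rebuilt from the bottom-level values.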