Nixtla / neuralforecast

Scalable and user friendly neural :brain: forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

[Core] Cannot load models saved using versions before 1.7 #1206

Closed tylernisonoff closed 6 days ago

tylernisonoff commented 1 week ago

What happened + What you expected to happen

Breaking changes to what is expected to be saved in config_dict for a given model mean that models saved with neuralforecast versions before 1.7 cannot be loaded after upgrading to 1.7+.

This makes upgrading difficult, as we have many production models whose config_dicts are now incorrect.

It seems that at least the following keys are expected to be present in config_dict, but previously were not being saved there:

["local_scaler_type", "id_col", "time_col", "target_col"]

It would be ideal if:

Versions / Dependencies

Confirmed issue upgrading from 1.6.4 -> 1.7.5

python, os, other libraries should not be relevant, but happy to provide.

Reproduction script

Using a pre-1.7 version (such as 1.6.4):

from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATS
from neuralforecast.utils import AirPassengersDF

nf = NeuralForecast(
    models = [NBEATS(input_size=24, h=12, max_steps=100)],
    freq = 'M'
)

nf.fit(df=AirPassengersDF)
nf.save('/tmp/model_1.6')

Now upgrade to a 1.7 version (such as 1.7.5) and run:

from neuralforecast import NeuralForecast
nf_loaded = NeuralForecast.load('/tmp/model_1.6')

This will error with:

t/core.py:1543, in NeuralForecast.load(path, verbose, **kwargs)
   1537     raise Exception("No configuration found in directory.")
   1539 # Create NeuralForecast object
   1540 neuralforecast = NeuralForecast(
   1541     models=models,
   1542     freq=config_dict["freq"],
-> 1543     local_scaler_type=config_dict["local_scaler_type"],
   1544 )
   1546 for attr in ["id_col", "time_col", "target_col"]:
   1547     setattr(neuralforecast, attr, config_dict[attr])

   KeyError: 'local_scaler_type'

However, it will also fail to find ["id_col", "time_col", "target_col"], which are read from config_dict on the next few lines.

Issue Severity

Medium: It is a significant difficulty but I can work around it.

tylernisonoff commented 1 week ago

After experimenting a bit, it seems that we'd want the following defaults to be set if they cannot be found in the config:

For 1.6.3 and earlier:

defaults = {
    "id_col": "unique_id", 
    "time_col": "ds",
    "target_col": "y",
    "local_scaler_type": None, 
    "scalers_": None
}
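A minimal sketch of how such a fallback might be applied, assuming the loaded configuration is a plain dict. The names `LEGACY_DEFAULTS` and `with_legacy_defaults` are hypothetical and not part of neuralforecast; this is an illustration of the idea, not the library's actual loading code.

```python
# Hypothetical helper: fill in keys that pre-1.7 saves did not write.
LEGACY_DEFAULTS = {
    "id_col": "unique_id",
    "time_col": "ds",
    "target_col": "y",
    "local_scaler_type": None,
    "scalers_": None,
}


def with_legacy_defaults(config_dict):
    """Return a copy of config_dict with any missing legacy keys filled in.

    Values already present in config_dict always win over the defaults.
    """
    return {**LEGACY_DEFAULTS, **config_dict}


# A stand-in for a config saved by an older version (only "freq" present).
old_config = {"freq": "M"}
patched = with_legacy_defaults(old_config)
```

Because the merge builds a new dict, the original config is left untouched, and configs saved by 1.7+ pass through unchanged since they already carry every key.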

For 1.6.4, local_scaler_type and scalers_ were added to the dataset, so we'd probably want to see if those attrs are available on the pickled dataset, and if so, take those as the defaults. That's a bit gnarly though.
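One way that check could look, as a rough sketch: it assumes the unpickled dataset exposes the attributes under the names `local_scaler_type` and `scalers_`, and uses `SimpleNamespace` objects as stand-ins for real datasets. `scaler_settings` is a hypothetical helper, not an existing neuralforecast function.

```python
from types import SimpleNamespace

# Attribute names assumed to live on datasets pickled by 1.6.4.
SCALER_ATTRS = ("local_scaler_type", "scalers_")


def scaler_settings(dataset):
    """Read scaler attributes from a dataset, falling back to None.

    getattr with a default covers datasets pickled before the
    attributes existed (1.6.3 and earlier).
    """
    return {name: getattr(dataset, name, None) for name in SCALER_ATTRS}


# Stand-ins for unpickled dataset objects from two versions.
dataset_163 = SimpleNamespace()
dataset_164 = SimpleNamespace(local_scaler_type="standard", scalers_=["fitted-scaler"])

settings_163 = scaler_settings(dataset_163)
settings_164 = scaler_settings(dataset_164)
```

Using `getattr` with a default keeps the version check implicit: no version string needs to be parsed, only the presence of the attributes matters.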

I may try to take a stab at a PR for this, although I need to see if I can get the developer environment working properly.

tylernisonoff commented 1 week ago

Attempted fix: https://github.com/Nixtla/neuralforecast/pull/1207