winedarksea / AutoTS

Automated Time Series Forecasting
MIT License

Error on HolidayTransformer since 0.6.12 #242

Open emobs opened 1 month ago

emobs commented 1 month ago

Hi Colin,

Since the update to AutoTS 0.6.12 I'm getting this error while forecasting on existing models that were working fine before:

/lib/python3.10/site-packages/numpy/lib/function_base.py:2853: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]

Transformer HolidayTransformer failed on fit from params mean {'0': {'threshold': 0.9, 'splash_threshold': None, 'use_dayofmonth_holidays': True, 'use_wkdom_holidays': True, 'use_wkdeom_holidays': False, 'use_lunar_holidays': False, 'use_lunar_weekday': False, 'use_islamic_holidays': False, 'use_hebrew_holidays': False, 'anomaly_detector_params': {'method': 'IQR', 'method_params': {'iqr_threshold': 2.0, 'iqr_quantiles': [0.25, 0.75]}, 'fillna': 'ffill', 'transform_dict': {'fillna': 'pchip', 'transformations': {'0': 'AlignLastValue'}, 'transformation_params': {'0': {'rows': 1, 'lag': 2, 'method': 'additive', 'strength': 1.0, 'first_value_only': False}}}, 'isolated_only': True}, 'remove_excess_anomalies': True, 'impact': 'datepart_regression', 'regression_params': {'regression_model': {'model': 'DecisionTree', 'model_params': {'max_depth': None, 'min_samples_split': 1.0}}, 'datepart_method': 'simple_2', 'polynomial_degree': None, 'transform_dict': None, 'holiday_countries_used': False}}, '1': {}}

Was this transformer updated or modified in the last release, and could a bug have arisen from that, or is this more likely a problem on my end?

Thanks for your reply in advance.

winedarksea commented 1 month ago

I did make some changes to the anomaly detector but not to the methods used here. The following code using that transformer works fine for me:

import pandas as pd
from autots import load_daily, GeneralTransformer

df = load_daily(long=False)

transformer = GeneralTransformer(
    fillna='ffill',
    transformations={"0": "HolidayTransformer"},
    transformation_params={
        '0': {
            'threshold': 0.9, 'splash_threshold': None,
            'use_dayofmonth_holidays': True, 'use_wkdom_holidays': True,
            'use_wkdeom_holidays': False, 'use_lunar_holidays': False,
            'use_lunar_weekday': False, 'use_islamic_holidays': False,
            'use_hebrew_holidays': False,
            'anomaly_detector_params': {
                'method': 'IQR',
                'method_params': {
                    'iqr_threshold': 2.0, 'iqr_quantiles': [0.25, 0.75]},
                'fillna': 'ffill',
                'transform_dict': {
                    'fillna': 'pchip',
                    'transformations': {'0': 'AlignLastValue'},
                    'transformation_params': {'0': {'rows': 1, 'lag': 2, 'method': 'additive', 'strength': 1.0, 'first_value_only': False}}},
                'isolated_only': True
            },
            'remove_excess_anomalies': True,
            'impact': 'datepart_regression',
            'regression_params': {
                'regression_model': {
                    'model': 'DecisionTree',
                    'model_params': {'max_depth': None, 'min_samples_split': 1.0}},
                'datepart_method': 'simple_2',
                'polynomial_degree': None,
                'transform_dict': None,
                'holiday_countries_used': False}
            }, '1': {}}
)
transformed_df = transformer.fit_transform(df)
inverse_df = transformer.inverse_transform(transformed_df)
col = df.columns[0]
pd.concat([df[col], transformed_df[col].rename("transformed"), inverse_df[col].rename("inverse")], axis=1).plot()

[image: plot of the original, transformed, and inverse series]

Do you have full details on the model? Maybe it was a different parameter? You might also have something new in your data (new nulls, for example) that is causing the failure.

emobs commented 1 month ago

Thank you for the super quick reply.

Here are the full model details on which the error occurs:

Model: SeasonalNaive
ModelParameters: {"method": "lastvalue", "lag_1": 24, "lag_2": 10}
TransformationParameters: {"fillna": "mean", "transformations": {"0": "HolidayTransformer", "1": "DifferencedTransformer"}, "transformation_params": {"0": {"threshold": 0.9, "splash_threshold": null, "use_dayofmonth_holidays": true, "use_wkdom_holidays": true, "use_wkdeom_holidays": false, "use_lunar_holidays": false, "use_lunar_weekday": false, "use_islamic_holidays": false, "use_hebrew_holidays": false, "anomaly_detector_params": {"method": "IQR", "method_params": {"iqr_threshold": 2.0, "iqr_quantiles": [0.25, 0.75]}, "fillna": "ffill", "transform_dict": {"fillna": "pchip", "transformations": {"0": "AlignLastValue"}, "transformation_params": {"0": {"rows": 1, "lag": 2, "method": "additive", "strength": 1.0, "first_value_only": false}}}, "isolated_only": true}, "remove_excess_anomalies": true, "impact": "datepart_regression", "regression_params": {"regression_model": {"model": "DecisionTree", "model_params": {"max_depth": null, "min_samples_split": 1.0}}, "datepart_method": "simple_2", "polynomial_degree": null, "transform_dict": null, "holiday_countries_used": false}}, "1": {}}}
Ensemble: 0
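The model details here are stored as JSON text (note `null`, `true`, `false`), while the Python snippets in this thread use `None`, `True`, `False`. A minimal sketch of converting one form to the other with the standard `json` module follows. The `raw_transformation` string is a shortened fragment of the TransformationParameters value, cut down for illustration only, and the variable names are hypothetical.

```python
import json

# ModelParameters string, copied from the model details.
raw_model_params = '{"method": "lastvalue", "lag_1": 24, "lag_2": 10}'

# Shortened fragment of the TransformationParameters string (illustration only;
# the full string is parsed in exactly the same way).
raw_transformation = (
    '{"fillna": "mean", '
    '"transformations": {"0": "HolidayTransformer", "1": "DifferencedTransformer"}, '
    '"transformation_params": {"0": {"threshold": 0.9, "splash_threshold": null, '
    '"use_wkdom_holidays": true}, "1": {}}}'
)

# json.loads maps null -> None, true -> True, false -> False.
model_params = json.loads(raw_model_params)
transformation_dict = json.loads(raw_transformation)

print(model_params)
print(transformation_dict["transformation_params"]["0"])
```

The resulting dictionaries have the same shape as the `parameter_dict` and `transformation_dict` arguments passed to `ModelPrediction` elsewhere in this thread.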

Some new NaN values are present in the new data at prediction time. Could that be the cause of this error? If so, would setting prefill_na or preclean in the model help? Since the transformer used by the model also applies "fillna": "mean", I doubt this is the cause of the error.
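One way to narrow this down is to inspect the new data directly. The sketch below is a guess at a useful diagnostic, not something confirmed in this thread: it counts NaNs per column and flags columns with a single distinct value. The second part shows that `np.corrcoef` on a constant series divides zero by a zero standard deviation, which is one known way to get an "invalid value encountered in divide" warning from that line of NumPy. Whether that is what happens inside HolidayTransformer here is an assumption. The frame `new_df` and the helper `describe_problem_columns` are hypothetical stand-ins.

```python
import warnings

import numpy as np
import pandas as pd


def describe_problem_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Per column: number of NaNs, and whether all non-NaN values are identical."""
    return pd.DataFrame(
        {
            "nan_count": df.isna().sum(),
            "is_constant": df.nunique(dropna=True) <= 1,
        }
    )


# Hypothetical stand-in for the new prediction data.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
new_df = pd.DataFrame(
    {
        "flat": [5.0] * 6,  # zero variance
        "gappy": [1.0, 2.0, np.nan, 4.0, np.nan, 6.0],  # contains NaNs
    },
    index=idx,
)

report = describe_problem_columns(new_df)
print(report)

# A constant series has zero standard deviation, so the correlation is 0 / 0.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    corr = np.corrcoef(new_df["flat"].to_numpy(), np.arange(6.0))

print([str(w.message) for w in caught])
print(corr)
```

If a column is flagged as constant, or becomes constant after filling (for example a series that is mostly NaN and filled with its mean), that column is a reasonable first suspect.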

winedarksea commented 1 month ago

On the sample data, again it all works:

import pandas as pd
from autots import load_daily, GeneralTransformer

df = load_daily(long=False)

trans = {"0": "HolidayTransformer", "1": "DifferencedTransformer"}
trans_params = {
    '0': {
        'threshold': 0.9, 'splash_threshold': None,
        'use_dayofmonth_holidays': True, 'use_wkdom_holidays': True,
        'use_wkdeom_holidays': False, 'use_lunar_holidays': False,
        'use_lunar_weekday': False, 'use_islamic_holidays': False,
        'use_hebrew_holidays': False,
        'anomaly_detector_params': {
            'method': 'IQR',
            'method_params': {
                'iqr_threshold': 2.0, 'iqr_quantiles': [0.25, 0.75]},
            'fillna': 'ffill',
            'transform_dict': {
                'fillna': 'pchip',
                'transformations': {'0': 'AlignLastValue'},
                'transformation_params': {'0': {'rows': 1, 'lag': 2, 'method': 'additive', 'strength': 1.0, 'first_value_only': False}}},
            'isolated_only': True
        },
        'remove_excess_anomalies': True,
        'impact': 'datepart_regression',
        'regression_params': {
            'regression_model': {
                'model': 'DecisionTree',
                'model_params': {'max_depth': None, 'min_samples_split': 1.0}},
            'datepart_method': 'simple_2',
            'polynomial_degree': None,
            'transform_dict': None,
            'holiday_countries_used': False}
        }, '1': {}
}
transformer = GeneralTransformer(
    fillna='ffill',
    transformations=trans,
    transformation_params=trans_params
)
transformed_df = transformer.fit_transform(df)
inverse_df = transformer.inverse_transform(transformed_df, trans_method='original')
col = df.columns[0]
pd.concat([df[col], transformed_df[col].rename("transformed"), inverse_df[col].rename("inverse")], axis=1).plot()

from autots import ModelPrediction

forecast_length = 30
model = ModelPrediction(
    forecast_length=forecast_length,
    model_str="SeasonalNaive",
    parameter_dict={"method": "lastvalue", "lag_1": 24, "lag_2": 10},
    transformation_dict={
        "fillna": "mean",
        "transformations": trans,
        "transformation_params": trans_params,
    },
)
prediction = model.fit_predict(df, forecast_length=forecast_length)
prediction.plot_grid(df)

The error bounds are pretty wide, but otherwise it looks fine: [image]

I tried injecting zero, null, and negative one at the end of the dataframe (df.iloc[-1] = np.nan), and while they messed with the outputs a bit, as one would expect, they didn't cause any errors.
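A minimal sketch of that kind of injection test, using only pandas and NumPy. The helper `inject_tail` and the small stand-in frame are hypothetical; in practice the frame would be the one returned by `load_daily(long=False)`, and each variant would be passed to `model.fit_predict` as in the code above.

```python
import numpy as np
import pandas as pd


def inject_tail(df: pd.DataFrame, value: float, rows: int = 1) -> pd.DataFrame:
    """Return a copy of df with the last `rows` rows overwritten by `value`."""
    out = df.copy()
    out.iloc[-rows:] = value
    return out


# Hypothetical stand-in for the wide-format sample data.
idx = pd.date_range("2024-01-01", periods=5, freq="D")
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [10.0, 9.0, 8.0, 7.0, 6.0]},
    index=idx,
)

# One perturbed copy per injected value; the original frame is left untouched.
variants = {
    name: inject_tail(df, value)
    for name, value in {"zero": 0.0, "null": np.nan, "negative_one": -1.0}.items()
}

for name, variant in variants.items():
    print(name, variant.iloc[-1].tolist())
```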