facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License
18.32k stars 4.52k forks

model seasonality grows too much #1673

Closed by justethomas 3 years ago

justethomas commented 4 years ago

Hi everyone,

I seem to have an overfitting problem with my model: the prediction grows too much over time. You can see the Prophet prediction in blue, compared with a TBATS prediction in green and the actual series in red.

image

I don't get this kind of result in other tests where the training set is more stable over time. Here's a snippet of my code.

def hyperparameter_tuning(df: pd.DataFrame) -> pd.DataFrame:
    all_params = [dict(zip(PARAM_GRID.keys(), v)) for v in itertools.product(*PARAM_GRID.values())]

    mse = []

    for params in all_params:

        m = Prophet(**params).fit(df)
        df_cv = cross_validation(
            m,
            initial = '{} hours'.format(len(train_p)),
            horizon = '147 hours',
        )

        df_p = performance_metrics(df_cv, rolling_window = 1)
        mse.append(df_p['mse'].values[0])

    tuning_results = pd.DataFrame(all_params)
    tuning_results['mse'] = mse
    tuning_results.sort_values('mse', inplace = True)

    return tuning_results

def prophet_forecast(m, train: pd.DataFrame, test: pd.DataFrame, changepoint_prior_scale: float,
                     seasonality_prior_scale: float) -> pd.DataFrame:
    m.weekly_seasonality = args.weekly_weight
    m.daily_seasonality = args.daily_weight
    m.seasonality_mode = 'additive'
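    # No trailing comma here (it would turn the value into a tuple), and each
    # prior scale is assigned to its own attribute.
    m.changepoint_prior_scale = changepoint_prior_scale
    m.seasonality_prior_scale = seasonality_prior_scale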
    m.growth = 'linear'

    m.stan_backend.logger = None
    m.fit(train)

    future = m.make_future_dataframe(periods = len(test), freq = 'H')

    forecast = m.predict(future)

    return forecast.tail(len(test))

params = hyperparameter_tuning(df)

m = Prophet()

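prophet_prediction = prophet_forecast(
    m,
    train_p,
    test_p,
    # The tuning results are sorted by mse, so take the first row by position
    # (.iloc[0]); [0] would select the row labelled 0, not the best one.
    params.changepoint_prior_scale.iloc[0],
    params.seasonality_prior_scale.iloc[0])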

The hyperparameter_tuning process improves my already good predictions, but doesn't help at all with very bad predictions such as this one.

My next idea was to use the cap feature of the 'logistic' growth mode, but that feels a bit like cheating, since in theory I don't have any limit (this data is website visits).

Do you have any ideas about which parameters I should tune?

PS: My training set is the log of my real values.

bletham commented 4 years ago

The Prophet model assumes stationary seasonality. In this case the magnitude of the daily seasonality is clearly fluctuating, and generally increasing over time. The only way Prophet can capture that is with multiplicative seasonality on a fluctuating trend that is generally increasing with time. With log-transformed data, the additive seasonality you are using is equivalent to multiplicative seasonality on the original scale, so that is basically what is happening here. In the future prediction, the trend (and thus the magnitude of the daily seasonality) is increasing in a really unreasonable way. I suspect this is because of the log transform. The exp() inverse transform is a bit unstable, and I've seen it amplify small trend changes into unreasonably large ones, much along the lines of what is happening here.
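To make the point about exp() concrete, here is a minimal sketch in plain Python. It does not use Prophet, and the level, slope, and amplitude values are made up for illustration rather than taken from the data in this issue. It assumes an additive model on log(y) with a linear trend and a 24-hour sine seasonality, then maps it back to the original scale.

```python
import math


def forecast_from_log(level, slope, amplitude, hours, period=24):
    """Original-scale forecast from an additive model fitted on log(y).

    Assumed (hypothetical) log-scale model:
        log(y_t) = level + slope * t + amplitude * sin(2*pi*t / period)
    """
    return [
        math.exp(level + slope * t + amplitude * math.sin(2 * math.pi * t / period))
        for t in range(hours)
    ]


def daily_swing(values, period=24):
    """Peak-to-trough range over the last full seasonal period."""
    last = values[-period:]
    return max(last) - min(last)


horizon = 147  # hours, same horizon as in the question

# Same seasonality in log space; only the log-scale trend slope differs.
base = forecast_from_log(level=5.0, slope=0.0, amplitude=0.5, hours=horizon)
drift = forecast_from_log(level=5.0, slope=0.01, amplitude=0.5, hours=horizon)

# Additive in log space is multiplicative on the original scale:
# exp(trend + seasonal) == exp(trend) * exp(seasonal).
t = 30
trend = 5.0 + 0.01 * t
seasonal = 0.5 * math.sin(2 * math.pi * t / 24)
assert math.isclose(drift[t], math.exp(trend) * math.exp(seasonal))

# A log-scale slope of only 0.01 per hour inflates the daily swing severalfold
# by the end of the horizon, because exp() scales the seasonality by exp(slope * t).
ratio = daily_swing(drift) / daily_swing(base)
print(f"daily swing grows by a factor of {ratio:.2f}")
```

The seasonal term is identical in both runs, so the whole difference in swing comes from the small trend slope passing through exp().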

I'm guessing that you're using the log transform to get positive predictions. I've been doing some analysis of different strategies for that lately (one of which is a log transform) and just posted about it in #1668. There, I posted a new strategy which I implemented in a ProphetPos class that I think could do a lot better on this time series. Is there any chance you could post the data for this time series so I could try it out?

justethomas commented 4 years ago

I see. I ran it again without the log transformation and the results make more sense.

image

Regarding the dataset, unfortunately I'm not allowed to share it, as it is corporate data.

Thank you for your help!