linkedin / greykite

A flexible, intuitive and fast forecasting library
BSD 2-Clause "Simplified" License

The default performance of greykite silverkite is far worse than default prophet #91

Closed bhishanpdl closed 2 years ago

bhishanpdl commented 2 years ago

I recently became familiar with this time series modeling library. I am trying to use it on regular sales data that has some trend and some seasonality and does not contain outliers.

Question: How do I change the default parameters of greykite so that it gives reliable output?

In my use case I get the following results:

I am looking for suggestions on how to improve the performance of the greykite model.

Vanilla Greykite code

# imports
import time
time_start_notebook = time.time()
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# timeseries
import greykite
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.model_templates import ModelTemplateEnum
from greykite.framework.templates.autogen.forecast_config import ComputationParam

import warnings
warnings.filterwarnings("ignore")

# variables
col_date = 'BillingDate'
col_val = 'y_true'

fh_val = 90
freq='D'
verbose=1

# data: the GitHub blob page renders the CSV as an HTML table,
# so pd.read_html can parse it; the first column is the row index
url = "https://github.com/bhishanpdl/Shared/blob/master/data/daily_0600075027.csv"
dfs = pd.read_html(url, parse_dates=[col_date])
df = dfs[0].iloc[:, 1:]

# modelling
metadata = MetadataParam(time_col=col_date, value_col=col_val, freq=freq)
computation_param = ComputationParam(hyperparameter_budget=None, n_jobs=-1, verbose=verbose)
config=ForecastConfig(
    model_template=ModelTemplateEnum.AUTO.name,
    forecast_horizon=fh_val,
    coverage=0.95,
    metadata_param=metadata,
    computation_param=computation_param
)

forecaster = Forecaster()
result = forecaster.run_forecast_config(df=df,config=config)
df_val = result.backtest.df.tail(fh_val)

# model evaluation
sum_val_true = df_val['actual'].sum()
sum_val_pred = df_val['forecast'].sum()
pct_diff = (sum_val_true - sum_val_pred) / sum_val_true * 100
print(f"""
val  = {sum_val_true:,.0f}
pred = {sum_val_pred:,.0f}
pct diff = {pct_diff:,.2f}%
""")

_ = df_val.set_index(col_date)[['actual', 'forecast']].plot(figsize=(12, 8))
_ = plt.title(f"pct diff = {pct_diff:,.2f}%")
plt.show()

time_taken = time.time() - time_start_notebook
m,s = divmod(time_taken,60)
print(f'Time taken: {m:.0f} min {s:.0f} sec')

Vanilla Prophet code

# imports
import time
time_start_notebook = time.time()

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import prophet
from prophet import Prophet

# get the data
col_date = 'BillingDate'
col_val = 'y_true'
fh_val = 90

# data
url = "https://github.com/bhishanpdl/Shared/blob/master/data/daily_0600075027.csv"
dfs = pd.read_html(url,parse_dates=[col_date])
df = dfs[0].iloc[:,1:]

print(df.shape)
print(df.head())

# prophet expects two columns: ds and y
df_prophet = pd.DataFrame({'ds': df[col_date].values, 'y': df[col_val].to_numpy()})

# train-validation split
df_train = df_prophet.iloc[:-fh_val,:]
df_val = df_prophet.iloc[-fh_val:,:]

# modelling
model = Prophet()

# fit the model
model.fit(df_train)

# prediction
df_fut = model.make_future_dataframe(periods=fh_val)
df_pred = model.predict(df_fut)
# select columns by name; positional indexing into predict() output is fragile
df_fut = df_pred[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].iloc[-fh_val:]

# model evaluation
df_eval = pd.DataFrame({'y_true': df_val['y'].values, 
                       'y_pred': df_fut['yhat'].values},
                      index=df_val['ds'].values)

sum_val_true = df_eval['y_true'].sum()
sum_val_pred = df_eval['y_pred'].sum()
pct_diff = (sum_val_true - sum_val_pred) / sum_val_true * 100
print(f"""
val  = {sum_val_true:,.0f}
pred = {sum_val_pred:,.0f}

pct diff = {pct_diff:,.2f}%
""")

df_eval.plot()

time_taken = time.time() - time_start_notebook
m,s = divmod(time_taken,60)
print(f'Time taken: {m:.0f} min {s:.0f} sec')
sayanpatra commented 2 years ago

Hi,

The AUTO template seems to have picked up a changepoint at the end of the training period, hence the difference. You can visualize it by plotting the backtest:

fig = result.backtest.plot()
fig.show()
[Screenshot: backtest plot showing the changepoint fitted near the end of the training period]

Because of the inherent volatility in your data, the SILVERKITE template should perform better. You can do that by simply replacing model_template=ModelTemplateEnum.AUTO.name with model_template=ModelTemplateEnum.SILVERKITE.name, as in the sketch below. This reduces the percent difference to -1.53%.
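Concretely, reusing the config from the original post, the only change is the template name:

# same setup as above, with SILVERKITE instead of AUTO
config = ForecastConfig(
    model_template=ModelTemplateEnum.SILVERKITE.name,
    forecast_horizon=fh_val,
    coverage=0.95,
    metadata_param=metadata,
    computation_param=computation_param,
)
result = forecaster.run_forecast_config(df=df, config=config)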

[Screenshot: backtest plot with the SILVERKITE template; percent difference -1.53%]

For any time-series data, we advise checking the performance on multiple test sets and taking the average. For example, if you drop the last 10 days from df, then the percent error becomes -7.59%.
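A minimal sketch of that kind of check, reusing df, config, forecaster, and fh_val from the original post (the offsets are illustrative):

# average the error over several shifted test sets;
# each pass drops the last k days before forecasting
offsets = [0, 10, 20]  # illustrative
pct_diffs = []
for k in offsets:
    df_k = df.iloc[:len(df) - k]  # k=0 keeps the full history
    result_k = forecaster.run_forecast_config(df=df_k, config=config)
    df_val_k = result_k.backtest.df.tail(fh_val)
    err = ((df_val_k['actual'].sum() - df_val_k['forecast'].sum())
           / df_val_k['actual'].sum() * 100)
    pct_diffs.append(err)
print(f"mean pct diff over {len(offsets)} test sets: {np.mean(pct_diffs):,.2f}%")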

bhishanpdl commented 2 years ago

Thanks a lot for your feedback. I was assuming AUTO meant SILVERKITE. Also, looking at the changepoints in the test data is not possible in a real forecasting setting, but we could certainly look at the CV results and see how the model behaves across splits.
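Something like this should surface them (a rough sketch based on the docs, with default metrics assumed), given the result object returned by run_forecast_config above:

# summarize the cross-validation splits from the grid search
from greykite.framework.utils.result_summary import summarize_grid_search_results

cv_results = summarize_grid_search_results(grid_search=result.grid_search)
print(cv_results.head())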

This was my very first attempt to use greykite in production, and on the very first dataset where I compared prophet and greykite, I was very disappointed.

I was about to put greykite on hold and move on with prophet as the status quo, but now I see the potential again and am happy to give it a try.

sayanpatra commented 2 years ago

We have multiple templates for different datasets; the AUTO template tries to pick the best one. For highly volatile data such as yours, it is better to pick SILVERKITE, which builds a more complex but more robust model.
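You can list the available templates directly from the enum already imported in your script:

# list the available model templates
print([t.name for t in ModelTemplateEnum])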

You can easily enable CV and grid search; check this tutorial.
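A hedged sketch of what that can look like with the config objects used above (the split counts and fit algorithms are illustrative; the tutorial covers the full options):

from greykite.framework.templates.autogen.forecast_config import (
    EvaluationPeriodParam,
    ModelComponentsParam,
)

# CV: expanding-window splits, each validated on a 90-day horizon
evaluation_period = EvaluationPeriodParam(
    test_horizon=fh_val,
    cv_horizon=fh_val,
    cv_max_splits=3,
    cv_expanding_window=True,
)

# grid search: list-valued entries define the search space
model_components = ModelComponentsParam(
    custom=dict(fit_algorithm_dict=[
        dict(fit_algorithm="ridge"),
        dict(fit_algorithm="linear"),
    ]),
)

config = ForecastConfig(
    model_template=ModelTemplateEnum.SILVERKITE.name,
    forecast_horizon=fh_val,
    coverage=0.95,
    metadata_param=metadata,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components,
)
result = forecaster.run_forecast_config(df=df, config=config)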