I work with monthly values and have noticed that two models with identical parameters (default yearly seasonality) produce completely different forecasts depending on whether "ds" is set to the first day or the last day of the month. The fitted (in-sample) values, however, are the same for both models. I assume this is due to the underlying model being continuous-time, as mentioned in the documentation (see https://facebook.github.io/prophet/docs/non-daily_data.html#monthly-data).
However, in some cases the forecasts differ far more than I had expected. I could reduce this effect by adding yearly seasonality as regressors. Apart from that, is there an advantage or disadvantage to choosing one of the variants (beginning or end of month)? It would be great if someone could explain why the differences are that large.
Code example for reproduction:
import pandas as pd
from prophet import Prophet

# create sample data: 12 monthly observations stamped on the first day of each month
df = pd.DataFrame(data={'ds': ['2021-01-01', '2021-02-01', '2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01',
                               '2021-07-01', '2021-08-01', '2021-09-01', '2021-10-01', '2021-11-01', '2021-12-01'],
                        'y': [1, 2, 3, 5, 6, 7, 2, 3, 4, 5, 7, 8]})
df['ds'] = pd.to_datetime(df['ds'])
### 1. model using first day of month
# initialize model
model1 = Prophet(yearly_seasonality=True,
                 weekly_seasonality=False,
                 daily_seasonality=False,
                 seasonality_mode='additive',
                 growth='linear')
# fit model
model1.fit(df, iter=1000) # reduce computing time
# create future dataframe for prediction
future = model1.make_future_dataframe(periods=1,
                                      freq='MS')
# predict
df_pred = model1.predict(future)
### 2. model using last day of month
# change ds to eom
df['ds'] = df['ds'] + pd.offsets.MonthEnd(0)
# initialize model
model2 = Prophet(yearly_seasonality=True,
                 weekly_seasonality=False,
                 daily_seasonality=False,
                 seasonality_mode='additive',
                 growth='linear')
# fit model
model2.fit(df, iter=1000) # reduce computing time
# create future dataframe for prediction
future = model2.make_future_dataframe(periods=1,
                                      freq='M')
# predict
df_pred_eom = model2.predict(future)
### 3. compare results
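# element-wise difference; rows align by position, and the last row is the one-month-ahead forecast
display(df_pred - df_pred_eom)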
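To make my suspicion more concrete, here is a minimal sketch that does not call Prophet at all. It assumes that the yearly seasonality is built from 10 sine/cosine pairs evaluated at the number of days since 1970-01-01 with a period of 365.25 days, which is my understanding of Prophet's default; the helper `yearly_fourier` is hypothetical and written only for this illustration. It shows that 12 monthly points give 20 seasonal columns (so the in-sample fit can be matched in many ways), and that month-end dates are not a constant shift of month-start dates.

```python
import numpy as np
import pandas as pd


def yearly_fourier(dates, order=10, period=365.25):
    """Hypothetical helper: sin/cos features at 'days since 1970-01-01'.

    Assumption: this mirrors how Prophet builds its default yearly
    seasonality (order 10, period 365.25 days); it is not Prophet's own code.
    """
    t = (pd.DatetimeIndex(dates) - pd.Timestamp("1970-01-01")) / pd.Timedelta(days=1)
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        cols.append(np.sin(2.0 * np.pi * n * t / period))
        cols.append(np.cos(2.0 * np.pi * n * t / period))
    return np.column_stack(cols)


# same 12 months as in the example above, stamped at month start and month end
month_start = pd.date_range("2021-01-01", periods=12, freq="MS")
month_end = month_start + pd.offsets.MonthEnd(0)

X_start = yearly_fourier(month_start)
X_end = yearly_fourier(month_end)

# 20 seasonal columns but only 12 rows: the seasonal part alone is underdetermined
rank_start = np.linalg.matrix_rank(X_start)
rank_end = np.linalg.matrix_rank(X_end)

# the shift between the two conventions varies with month length
offset_days = list((month_end - month_start).days)

print("feature matrix shape:", X_start.shape)
print("rank (month start / month end):", rank_start, rank_end)
print("offset in days per month:", offset_days)
```

If this assumption holds, both models can reproduce the 12 observations exactly, yet the seasonal curve between and beyond the observations is pinned down mostly by the priors rather than the data, so evaluating it at a different day of the month could plausibly give a very different forecast.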