business-science / modeltime

Modeltime unlocks time series forecast models and machine learning in one framework
https://business-science.github.io/modeltime/

prophet_reg() and prophet_boost() yielding identical results #192

Closed disaltzman closed 2 years ago

disaltzman commented 2 years ago

I'm trying to evaluate whether adding gradient boosting to Prophet will improve accuracy on a dataset where Prophet is already performing well. I noticed I was getting identical accuracy scores using both prophet_reg() and prophet_boost(). I've created a reproducible example here:

library(tidymodels)
library(modeltime)
library(timetk)  # provides the m4_monthly dataset

m750 <- m4_monthly %>% filter(id == "M750")
m750

splits <- initial_time_split(m750, prop = 0.8)

# Prophet with XGBoost fitted to the residuals
model_fit_proph_boost <- prophet_boost(learn_rate = 0.1) %>%
    set_engine("prophet_xgboost") %>%
    fit(value ~ date, data = training(splits))

# Plain Prophet
model_fit_proph <- prophet_reg() %>%
    set_engine("prophet") %>%
    fit(value ~ date, data = training(splits))

models_tbl <- modeltime_table(
    model_fit_proph_boost,
    model_fit_proph
)

# Score both models on the held-out test set
models_tbl %>%
    modeltime_calibrate(new_data = testing(splits)) %>%
    modeltime_accuracy()

# A tibble: 2 × 9
  .model_id .model_desc .type   mae  mape  mase smape  rmse   rsq
      <int> <chr>       <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1         1 PROPHET     Test   271.  2.74 0.808  2.67  364. 0.812
2         2 PROPHET     Test   271.  2.74 0.808  2.67  364. 0.812

Please let me know if I am misunderstanding or am not using the correct syntax, but I assumed that the results would not be literally identical. If I am misunderstanding, I apologize in advance!

AlbertoAlmuinha commented 2 years ago

Hi,

As far as I know, the reason is that you don't have any external regressors, so the XGBoost algorithm is not really modeling the residuals.

Basically, you are adding 0's to your Prophet predictions.

mdancho84 commented 2 years ago

Yep, @AlbertoAlmuinha is right. If you don't have any external regressors in your model, then xgboost won't run. If you add time series features, for example with step_timeseries_signature() from timetk, then you should be able to improve the results.

I'm going to close this since it's not a bug. It's more a matter of understanding the difference between prophet_reg() and prophet_boost().
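
For anyone landing here later, the suggestion above can be sketched roughly as follows. This is a minimal sketch, not a tuned model: it reuses the M750 split from the original post, and the recipe steps (which signature columns to drop, dummy-encoding the label columns) are illustrative assumptions rather than anything prescribed in this thread. The point is only that the recipe gives XGBoost calendar features to fit the Prophet residuals against.

```r
library(tidymodels)
library(modeltime)
library(timetk)

m750   <- m4_monthly %>% filter(id == "M750")
splits <- initial_time_split(m750, prop = 0.8)

# Derive calendar features from the date column so that XGBoost
# has regressors available when modeling the Prophet residuals.
recipe_spec <- recipe(value ~ date, data = training(splits)) %>%
    step_timeseries_signature(date) %>%
    # Assumption: sub-daily and xts-style columns add nothing for monthly data
    step_rm(contains("am.pm"), contains("hour"), contains("minute"),
            contains("second"), contains("xts")) %>%
    step_dummy(all_nominal_predictors())

# Prophet + XGBoost, fitted through a workflow so the recipe is applied
wflw_fit_proph_boost <- workflow() %>%
    add_model(prophet_boost(learn_rate = 0.1) %>% set_engine("prophet_xgboost")) %>%
    add_recipe(recipe_spec) %>%
    fit(training(splits))

# Plain Prophet baseline, same as in the original post
model_fit_proph <- prophet_reg() %>%
    set_engine("prophet") %>%
    fit(value ~ date, data = training(splits))

# Compare both models on the held-out test set
accuracy_tbl <- modeltime_table(wflw_fit_proph_boost, model_fit_proph) %>%
    modeltime_calibrate(new_data = testing(splits)) %>%
    modeltime_accuracy()

accuracy_tbl
```

With regressors in place the two rows should no longer be identical. Whether the boosted model is actually more accurate than plain Prophet depends on the data and on tuning, so treat the comparison as a starting point.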