Cross Validation Without Weekend Data Along with Meaning of Cross Validation?

@bletham @tcuongd

I am fitting a Prophet model on a dataset that contains no weekend data. When creating the future dataframe, I remove the weekends as well.

# Fit one model per group, then forecast 14 days ahead with weekends dropped.
new <- df %>%
  group_by(group) %>%
  do({
    m <- prophet(., daily.seasonality = TRUE, yearly.seasonality = TRUE)
    future <- make_future_dataframe(m, periods = 14) %>%
      filter(!weekdays(ds) %in% c("Saturday", "Sunday"))  # day names depend on locale
    predict(m, future)
  }) %>%
  select(ds, group, yhat)

When I run the cross_validation function, which fits a separate model at each cutoff, will the diagnostic values (RMSE, MSE, etc.) be skewed because we cannot remove weekend data from the future dataframe?

diagnostics_2 <- df %>%
  group_by(group) %>%
  do(
    cross_validation(
      prophet(., daily.seasonality = TRUE, yearly.seasonality = TRUE),
      initial = 365.25, period = 180, horizon = 14, units = "days"
    ) %>%
      performance_metrics()
  )

My assumption is yes, in which case I would need to do the cross validation manually (without the cross_validation function): choose the cutoffs, create a future dataframe with weekends removed for each one, and then calculate the diagnostic metrics myself.

(I'm doing this using R by the way)

My second question: the result is currently reported per horizon. If I want just one number per group (specifically RMSE), would you recommend averaging all the RMSE values?

# A tibble: 26 x 9
# Groups:   group [2]
   group horizon     mse  rmse   mae   mape  mdape  smape coverage
   <chr> <drtn>    <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>    <dbl>
 1 AAMC   2 days  2.03   1.42  1.40  0.0669 0.0762 0.0694    1    
 2 AAMC   3 days  4.87   2.21  2.03  0.108  0.0767 0.104     0.667
 3 AAMC   4 days  5.03   2.24  2.03  0.112  0.0762 0.107     0.667
 4 AAMC   5 days  5.38   2.32  2.13  0.117  0.0866 0.112     0.667
 5 AAMC   6 days  5.82   2.41  2.19  0.118  0.120  0.111     0.333
 6 AAMC   7 days  8.26   2.87  2.59  0.147  0.190  0.135     0.333
 7 AAMC   8 days  5.73   2.39  1.96  0.112  0.175  0.103     0.444
 8 AAMC   9 days  2.33   1.53  1.28  0.0681 0.0388 0.0663    0.833
 9 AAMC  10 days  5.21   2.28  1.87  0.101  0.0530 0.0943    0.667
10 AAMC  11 days 14.1    3.76  3.21  0.195  0.212  0.172     0.333
11 AAMC  12 days 18.1    4.25  3.64  0.230  0.213  0.198     0.333
12 AAMC  13 days 20.4    4.52  4.05  0.249  0.213  0.213     0.333
13 AAMC  14 days 22.3    4.72  4.38  0.267  0.207  0.228     0    
14 AAU    2 days  0.0634 0.252 0.229 0.481  0.264  0.374     0    
15 AAU    3 days  0.0388 0.197 0.182 0.456  0.264  0.510     0    
16 AAU    4 days  0.0190 0.138 0.124 0.336  0.251  0.431     0    
17 AAU    5 days  0.0192 0.139 0.127 0.347  0.265  0.444     0    
18 AAU    6 days  0.0572 0.239 0.202 0.499  0.685  0.552     0    
19 AAU    7 days  0.0559 0.236 0.206 0.500  0.681  0.561     0    
20 AAU    8 days  0.0597 0.244 0.201 0.420  0.669  0.383     0    
21 AAU    9 days  0.0717 0.268 0.238 0.466  0.669  0.363     0    
22 AAU   10 days  0.0777 0.279 0.266 0.633  0.747  0.701     0    
23 AAU   11 days  0.0300 0.173 0.156 0.424  0.399  0.562     0    
24 AAU   12 days  0.0312 0.177 0.163 0.437  0.385  0.583     0    
25 AAU   13 days  0.0720 0.268 0.240 0.586  0.768  0.680     0    
26 AAU   14 days  0.0538 0.232 0.216 0.517  0.537  0.640     0

The result would therefore be:

              RMSE
     AAMC:    2.84
     AAU:     0.22
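
The averaging described above can be reproduced in a few lines of base R. This is only a sketch: the numbers are copied from the per-horizon table, and the names `metrics`, `mean_rmse` and `root_mean_mse` are made up for illustration. It also shows an alternative summary (average the MSE, then take the square root), which weights large errors more heavily. Neither figure is necessarily the RMSE over all predictions, since each row of the table is computed on its own window of predictions.

```r
# Per-horizon values copied from the performance_metrics output above.
metrics <- data.frame(
  group = rep(c("AAMC", "AAU"), each = 13),
  mse = c(2.03, 4.87, 5.03, 5.38, 5.82, 8.26, 5.73, 2.33, 5.21, 14.1, 18.1, 20.4, 22.3,
          0.0634, 0.0388, 0.0190, 0.0192, 0.0572, 0.0559, 0.0597, 0.0717, 0.0777,
          0.0300, 0.0312, 0.0720, 0.0538),
  rmse = c(1.42, 2.21, 2.24, 2.32, 2.41, 2.87, 2.39, 1.53, 2.28, 3.76, 4.25, 4.52, 4.72,
           0.252, 0.197, 0.138, 0.139, 0.239, 0.236, 0.244, 0.268, 0.279,
           0.173, 0.177, 0.268, 0.232)
)

# Option 1: plain average of the per-horizon RMSE values.
mean_rmse <- tapply(metrics$rmse, metrics$group, mean)

# Option 2: average the MSE first, then take the square root.
root_mean_mse <- sqrt(tapply(metrics$mse, metrics$group, mean))

summary_table <- data.frame(
  group = names(mean_rmse),
  mean_rmse = as.numeric(mean_rmse),
  root_mean_mse = as.numeric(root_mean_mse)
)
print(summary_table)
```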

Hey, sorry, I didn't notice this issue earlier. The answer to the first question is here: https://github.com/facebook/prophet/issues/323#issuecomment-1037215919
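
To make the first question concrete, here is a rough base-R illustration. It rests on an assumption that should be checked against the linked comment and your installed version: that cross_validation scores each cutoff only on rows that already exist in the history, i.e. those with cutoff < ds <= cutoff + horizon, rather than on a freshly generated future dataframe. The dates and the names `history`, `cutoff` and `eval_rows` are hypothetical.

```r
# Hypothetical weekday-only history: 2021-01-01 through 2021-03-31, weekends dropped.
all_days <- seq(as.Date("2021-01-01"), as.Date("2021-03-31"), by = "day")
is_weekend <- format(all_days, "%u") %in% c("6", "7")  # ISO weekday: 6 = Sat, 7 = Sun
history <- data.frame(ds = all_days[!is_weekend])

# Rows that would be scored for one cutoff and a 14-day horizon
# (assumption: only dates present in the history are evaluated).
cutoff <- as.Date("2021-03-01")
horizon_days <- 14
in_window <- history$ds > cutoff & history$ds <= cutoff + horizon_days
eval_rows <- history[in_window, , drop = FALSE]

# Check whether any weekend date ended up in the evaluation window.
has_weekend <- any(format(eval_rows$ds, "%u") %in% c("6", "7"))
print(eval_rows$ds)
print(has_weekend)
```

If that assumption holds, the metrics are computed on weekdays only and no manual cross validation is needed for this purpose.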

For the second question, the cross-validation part of this page describes the default behaviour: https://facebook.github.io/prophet/docs/diagnostics.html

By default, the performance_metrics function sorts the predictions by horizon and calculates each metric over a rolling window containing 10% of the predictions (rolling_window = 0.1), so the rmse in each row covers only the predictions at or just before that horizon. I think the reason for this default is that in practice we are usually interested in forecasting out to a particular point in the future, after which we would re-train with the latest data and predict again. So averaging across all future datapoints might not be representative of the "success" of the forecast in practice.
If you think averaging the error across all datapoints in the horizon is more suitable, you can set rolling_window = 1.0 in the performance_metrics function.
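
As a sanity check on what a single pooled number means, here is a minimal sketch that computes one RMSE over every row of a cross-validation result. It assumes only that the result has the `y` and `yhat` columns returned by cross_validation; the helper `overall_rmse` and the toy data are hypothetical. As far as I can tell, this is the quantity performance_metrics reports for rmse when rolling_window = 1, but that is worth confirming on your own output.

```r
# Hypothetical helper: one RMSE over every prediction in a cross-validation result.
# Assumes df_cv has the `y` (actual) and `yhat` (forecast) columns.
overall_rmse <- function(df_cv) {
  sqrt(mean((df_cv$y - df_cv$yhat)^2))
}

# Toy stand-in for cross_validation() output; the numbers are made up.
toy_cv <- data.frame(
  group = c("A", "A", "B", "B"),
  y     = c(10, 12, 1.0, 2.0),
  yhat  = c(11, 10, 1.5, 2.5)
)

# One RMSE per group, pooling all horizons together.
rmse_by_group <- sapply(split(toy_cv, toy_cv$group), overall_rmse)
print(rmse_by_group)
```

With real data, `toy_cv` would be replaced by the per-group output of cross_validation.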

Cross Validation Without Weekend Data Along with Meaning of Cross Validation? #2097