facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License

Cross Validation Without Weekend Data Along with Meaning of Cross Validation? #2097

Open nknauer opened 2 years ago

nknauer commented 2 years ago

@bletham @tcuongd

I am running the prophet model without weekend data in the initial dataset. When creating a future dataframe, I remove weekends.

library(dplyr)
library(prophet)
new <- df %>%
  group_by(group) %>%
  do({
    # Fit once per group and reuse the model for the future dataframe.
    m <- prophet(., daily.seasonality = TRUE, yearly.seasonality = TRUE)
    future <- make_future_dataframe(m, periods = 14) %>%
      filter(!weekdays(ds) %in% c("Saturday", "Sunday"))  # drop weekends
    predict(m, future)
  }) %>%
  select(ds, group, yhat)

When I run the cross_validation function with cutoffs, which fits multiple models, will the diagnostic values (RMSE, MSE, etc.) be skewed because I cannot remove weekend data from the future dataframe it uses?

diagnostics_2 <- df %>%
  group_by(group) %>%
  do(cross_validation(prophet(., daily.seasonality = TRUE, yearly.seasonality = TRUE),
                      initial = 365.25, period = 180, horizon = 14, units = 'days') %>%
       performance_metrics())

My assumption is yes, in which case I would need to run the cross validation manually (without the cross_validation function): choose the cutoffs, build a future dataframe with weekends removed for each one, and then calculate the diagnostic metrics myself.
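One way to check whether weekend rows would move the metrics is to score only the weekday rows of a cross-validation result. The sketch below is base R and uses a made-up data frame, cv, that only assumes the ds, cutoff, y and yhat columns that cross_validation returns. The values and the rmse helper are illustrative, not prophet output.

```r
# Toy stand-in for cross_validation() output: one cutoff, a 7-day horizon.
cv <- data.frame(
  ds     = as.Date("2021-01-01") + 0:6,   # 2021-01-01 is a Friday
  cutoff = as.Date("2020-12-31"),
  y      = c(10, 0, 0, 12, 11, 13, 12),   # made-up actuals
  yhat   = c(11, 5, 5, 12, 13, 13, 10)    # made-up forecasts
)

# POSIXlt wday does not depend on the locale: 0 = Sunday, 6 = Saturday.
is_weekend  <- as.POSIXlt(cv$ds)$wday %in% c(0, 6)
cv_weekdays <- cv[!is_weekend, ]

# Hypothetical helper: root mean squared error of a set of forecasts.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

rmse_all      <- rmse(cv$y, cv$yhat)                    # weekends included
rmse_weekdays <- rmse(cv_weekdays$y, cv_weekdays$yhat)  # weekdays only
```

If the two numbers match on real cross-validation output, no weekend rows were scored in the first place and the built-in diagnostics can be used as they are.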

(I'm doing this using R by the way)

My second question: the results are currently reported per horizon. If I just want one number per group (specifically RMSE), would you recommend averaging all the RMSE values?

# A tibble: 26 x 9
# Groups:   group [2]
   group horizon     mse  rmse   mae   mape  mdape  smape coverage
   <chr> <drtn>    <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>    <dbl>
 1 AAMC   2 days  2.03   1.42  1.40  0.0669 0.0762 0.0694    1    
 2 AAMC   3 days  4.87   2.21  2.03  0.108  0.0767 0.104     0.667
 3 AAMC   4 days  5.03   2.24  2.03  0.112  0.0762 0.107     0.667
 4 AAMC   5 days  5.38   2.32  2.13  0.117  0.0866 0.112     0.667
 5 AAMC   6 days  5.82   2.41  2.19  0.118  0.120  0.111     0.333
 6 AAMC   7 days  8.26   2.87  2.59  0.147  0.190  0.135     0.333
 7 AAMC   8 days  5.73   2.39  1.96  0.112  0.175  0.103     0.444
 8 AAMC   9 days  2.33   1.53  1.28  0.0681 0.0388 0.0663    0.833
 9 AAMC  10 days  5.21   2.28  1.87  0.101  0.0530 0.0943    0.667
10 AAMC  11 days 14.1    3.76  3.21  0.195  0.212  0.172     0.333
11 AAMC  12 days 18.1    4.25  3.64  0.230  0.213  0.198     0.333
12 AAMC  13 days 20.4    4.52  4.05  0.249  0.213  0.213     0.333
13 AAMC  14 days 22.3    4.72  4.38  0.267  0.207  0.228     0    
14 AAU    2 days  0.0634 0.252 0.229 0.481  0.264  0.374     0    
15 AAU    3 days  0.0388 0.197 0.182 0.456  0.264  0.510     0    
16 AAU    4 days  0.0190 0.138 0.124 0.336  0.251  0.431     0    
17 AAU    5 days  0.0192 0.139 0.127 0.347  0.265  0.444     0    
18 AAU    6 days  0.0572 0.239 0.202 0.499  0.685  0.552     0    
19 AAU    7 days  0.0559 0.236 0.206 0.500  0.681  0.561     0    
20 AAU    8 days  0.0597 0.244 0.201 0.420  0.669  0.383     0    
21 AAU    9 days  0.0717 0.268 0.238 0.466  0.669  0.363     0    
22 AAU   10 days  0.0777 0.279 0.266 0.633  0.747  0.701     0    
23 AAU   11 days  0.0300 0.173 0.156 0.424  0.399  0.562     0    
24 AAU   12 days  0.0312 0.177 0.163 0.437  0.385  0.583     0    
25 AAU   13 days  0.0720 0.268 0.240 0.586  0.768  0.680     0    
26 AAU   14 days  0.0538 0.232 0.216 0.517  0.537  0.640     0 

Result would therefore be:

group   RMSE
AAMC    2.84
AAU     0.21
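For comparison, here is a small base R sketch of two ways to collapse the per-horizon rows into one number, using the AAMC mse column from the table above. Averaging the per-horizon RMSE values is not the same as taking the square root of the averaged MSE. The second is closer to a pooled RMSE, under the assumption that every horizon row is based on a similar number of forecasts.

```r
# Per-horizon MSE for group AAMC, copied from the table above
# (horizons 2 to 14 days).
mse_aamc <- c(2.03, 4.87, 5.03, 5.38, 5.82, 8.26, 5.73,
              2.33, 5.21, 14.1, 18.1, 20.4, 22.3)

# Option 1: average the per-horizon RMSE values.
rmse_by_horizon <- sqrt(mse_aamc)
mean_of_rmse    <- mean(rmse_by_horizon)

# Option 2: average the MSE first, then take the square root.
# This is never smaller than option 1 (Jensen's inequality).
rmse_of_mean_mse <- sqrt(mean(mse_aamc))

round(c(mean_of_rmse = mean_of_rmse, rmse_of_mean_mse = rmse_of_mean_mse), 2)
```

Option 1 reproduces the 2.84 shown for AAMC, while option 2 gives roughly 3.03, so the choice matters when errors grow with the horizon. performance_metrics also takes a rolling_window argument. As far as I understand the diagnostics documentation, setting it to 1 aggregates over the whole horizon, which may be a simpler route to a single number per group.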
tcuongd commented 2 years ago

Hey, sorry, I didn't notice this issue. The answer to the first question is here: https://github.com/facebook/prophet/issues/323#issuecomment-1037215919

For the second question, the cross-validation part of this page describes the default behaviour: https://facebook.github.io/prophet/docs/diagnostics.html