business-science / modeltime

Modeltime unlocks time series forecast models and machine learning in one framework
https://business-science.github.io/modeltime/

Residual Diagnostics - Function to visualize residuals #22

Closed spsanderson closed 4 years ago

spsanderson commented 4 years ago

I think it would be great to be able to extract the model description and its associated data from a modeltime_table. This would let one look at the residuals, etc., however they want.

mdancho84 commented 4 years ago

Do you mean visualize the calibration residuals?

spsanderson commented 4 years ago

I do


mdancho84 commented 4 years ago

Ok, yes, this is a great idea and something I've been contemplating too (just haven't had time to do it yet).

Implementation

I'm considering developing a modeltime_residuals() function and a plot_modeltime_residuals() function that use the calibration tibble to evaluate out-of-sample residuals. It would work similarly to modeltime_forecast(): the data is generated first, then the plotting function makes it easy to visualize.
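To make "out-of-sample residuals" concrete, here is a minimal base-R sketch of the idea: fit on the training window, predict on the held-out window, and take actual minus prediction. This is not the modeltime implementation; the data is synthetic, and the names `train`, `test`, and `resid_tbl` are made up for illustration.

```r
# Synthetic monthly series: linear trend plus noise (illustrative data only)
set.seed(123)
n  <- 120
df <- data.frame(
  date = seq(as.Date("2010-01-01"), by = "month", length.out = n),
  t    = seq_len(n)
)
df$value <- 100 + 2 * df$t + rnorm(n, sd = 5)

# Time-ordered split: first 90% for training, last 10% held out
n_train <- floor(0.9 * n)
train   <- df[seq_len(n_train), ]
test    <- df[(n_train + 1):n, ]

# Fit on the training window only
fit <- lm(value ~ t, data = train)

# Out-of-sample residuals: actual minus prediction on the held-out rows
resid_tbl <- data.frame(
  date       = test$date,
  actual     = test$value,
  prediction = as.numeric(predict(fit, newdata = test))
)
resid_tbl$residual <- resid_tbl$actual - resid_tbl$prediction

# Simple time plot of the residuals with a zero reference line
plot(resid_tbl$date, resid_tbl$residual, type = "b",
     xlab = "date", ylab = "residual")
abline(h = 0, lty = 2)
```

Keeping the data-generation step separate from the plotting step mirrors the split described above: one function returns the residual data, another visualizes it.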

spsanderson commented 4 years ago

Awesome. In the meantime, I'll keep hacking away at it and see if I can get something working to send you.


mdancho84 commented 4 years ago

Improvements - Residuals & Accuracy


# SETUP ----

library(modeltime)
library(tidymodels)
library(tidyverse)
library(timetk)
library(lubridate)

m750 <- m4_monthly %>%
    filter(id == "M750")

splits <- initial_time_split(m750, prop = 0.9)

# MODELS ----

model_fit_arima <- arima_reg() %>%
    set_engine("auto_arima") %>%
    fit(value ~ date, training(splits))
#> frequency = 12 observations per 1 year

model_fit_prophet <- prophet_reg() %>%
    set_engine("prophet") %>%
    fit(value ~ date, training(splits))
#> Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
#> Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

model_fit_lm <- linear_reg() %>%
    set_engine("lm") %>%
    fit(value ~ splines::ns(date, df = 5) 
        + month(date, label = TRUE), 
        training(splits))

# CALIBRATION ----

model_tbl <- modeltime_table(
    model_fit_arima,
    model_fit_prophet,
    model_fit_lm
)

calibration_tbl <- model_tbl  %>%
    modeltime_calibrate(testing(splits))

# ACCURACY ----

# Out-of-sample 
calibration_tbl %>% modeltime_accuracy()
#> # A tibble: 3 x 9
#>   .model_id .model_desc             .type   mae  mape  mase smape  rmse   rsq
#>       <int> <chr>                   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1         1 ARIMA(0,1,1)(0,1,1)[12] Test   151.  1.41 0.516  1.43  198. 0.930
#> 2         2 PROPHET                 Test   178.  1.70 0.609  1.71  235. 0.880
#> 3         3 LM                      Test   156.  1.55 0.534  1.52  236. 0.915

# In-sample
calibration_tbl %>% modeltime_accuracy(training(splits))
#> # A tibble: 3 x 9
#>   .model_id .model_desc             .type    mae  mape  mase smape  rmse   rsq
#>       <int> <chr>                   <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1         1 ARIMA(0,1,1)(0,1,1)[12] Fitted  104.  1.19 0.409  1.19  154. 0.988
#> 2         2 PROPHET                 Fitted  157.  1.80 0.613  1.80  212. 0.977
#> 3         3 LM                      Test    180.  2.05 0.704  2.05  247. 0.969

# RESIDUALS - Time Plot ----

# Out of Sample
calibration_tbl %>%
    modeltime_residuals() %>%
    plot_modeltime_residuals(.type = "timeplot", .interactive = FALSE)


# In Sample
calibration_tbl %>%
    modeltime_residuals(training(splits)) %>%
    plot_modeltime_residuals(.type = "timeplot", .interactive = FALSE)


# RESIDUALS - ACF ----

# Out of Sample
calibration_tbl %>%
    modeltime_residuals() %>%
    plot_modeltime_residuals(.type = "acf", .interactive = FALSE)
#> Max lag exceeds data available. Using max lag: 30
#> Max lag exceeds data available. Using max lag: 30
#> Max lag exceeds data available. Using max lag: 30


# In Sample
calibration_tbl %>%
    modeltime_residuals(training(splits)) %>%
    plot_modeltime_residuals(.type = "acf", .interactive = FALSE)
#> Max lag exceeds data available. Using max lag: 274
#> Max lag exceeds data available. Using max lag: 274
#> Max lag exceeds data available. Using max lag: 274


# RESIDUALS - Seasonality ----

# Out of Sample
calibration_tbl %>%
    modeltime_residuals() %>%
    plot_modeltime_residuals(.type = "seasonality", .interactive = FALSE)


# In Sample
calibration_tbl %>%
    modeltime_residuals(training(splits)) %>%
    plot_modeltime_residuals(.type = "seasonality", .interactive = FALSE)

Created on 2020-08-24 by the reprex package (v0.3.0)
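
A note on the "Max lag exceeds data available" messages in the ACF output: they appear to be the plot capping the lag at one less than the number of residuals available (30 for the test set, 274 for the training set). The base-R sketch below shows the same capping with `stats::acf()`. It is an illustration under assumptions, not the package's internal code: the residuals are synthetic, and `requested_lag` is an arbitrary large value chosen for the example.

```r
# Stand-in for 31 held-out residuals (synthetic values, for illustration only)
set.seed(42)
residuals_oos <- rnorm(31)

# Hypothetical requested lag, far larger than the series allows
requested_lag <- 1000

# An autocorrelation at lag k needs at least k + 1 observations,
# so the usable maximum is n - 1
max_lag <- min(requested_lag, length(residuals_oos) - 1)

# Compute the ACF without drawing the default plot
acf_out <- acf(residuals_oos, lag.max = max_lag, plot = FALSE)

# Approximate 95% bounds for white-noise residuals
bound <- qnorm(0.975) / sqrt(length(residuals_oos))

# Lags (excluding lag 0) whose autocorrelation falls outside the bounds
flagged <- which(abs(acf_out$acf[-1]) > bound)
```

With only 31 residuals the bounds are wide, so out-of-sample ACF plots are best read as a rough check rather than a formal test.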

mdancho84 commented 4 years ago

Closing this. Residuals are taken care of. :)