Closed by robjhyndman 4 months ago
Best practice is to use factors here, so that all possible values of Day_Type
are known at both the modelling and forecasting stages, even if some of them are not observed in a given time window.
That said, this behaviour is likely to cause issues (and is hard to fix), so I've opened a separate issue for it here: https://github.com/tidyverts/fabletools/issues/398
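To illustrate why a factor helps, here is a minimal base R sketch. It uses a hypothetical toy vector in place of the real Day_Type column, and it only shows how base R encodes the variable; it is not a description of what fabletools does internally.

```r
# Hypothetical toy data standing in for the Day_Type column.
day_chr <- c("Holiday", "Weekday", "Weekday", "Weekend", "Weekday")
day_fct <- factor(day_chr)

# A forecast window that happens to contain only weekdays.
window_chr <- day_chr[c(2, 3, 5)]
window_fct <- day_fct[c(2, 3, 5)]

# The character vector only knows about the values it contains ...
unique(window_chr)
# ... whereas the factor still carries every level from the full data.
levels(window_fct)

# With all levels known, the design matrix keeps one column per
# non-reference level, even though only "Weekday" is observed.
X <- model.matrix(~ Day_Type, data.frame(Day_Type = window_fct))
colnames(X)
```

Subsetting with `[` keeps the full set of levels, so the dummy columns built from the window line up with those built from the complete series.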
library(fpp3)
#> -- Attaching packages ---------------------------------------------- fpp3 0.5 --
#> v tibble 3.2.1 v tsibble 1.1.4
#> v dplyr 1.1.3 v tsibbledata 0.4.1
#> v tidyr 1.3.0 v feasts 0.3.1.9000
#> v lubridate 1.9.3 v fable 0.3.3.9000
#> v ggplot2 3.5.0 v fabletools 0.4.0
#> -- Conflicts ------------------------------------------------- fpp3_conflicts --
#> x lubridate::date() masks base::date()
#> x dplyr::filter() masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval() masks lubridate::interval()
#> x dplyr::lag() masks stats::lag()
#> x tsibble::setdiff() masks base::setdiff()
#> x tsibble::union() masks base::union()
elec <- tsibbledata::vic_elec %>%
  mutate(
    Day_Type = factor(case_when(
      Holiday ~ "Holiday",
      wday(Date) %in% 2:6 ~ "Weekday",
      TRUE ~ "Weekend"
    ))
  )
fit <- elec %>%
  model(shf = fable::TSLM(log(Demand) ~ Day_Type))
fit %>% report()
#> Series: Demand
#> Model: TSLM
#> Transformation: log(Demand)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.47374 -0.11822 0.01978 0.10979 0.66244
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 8.293077 0.004406 1882.134 < 2e-16 ***
#> Day_TypeWeekday 0.187077 0.004496 41.610 < 2e-16 ***
#> Day_TypeWeekend 0.032076 0.004620 6.943 3.88e-12 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.17 on 52605 degrees of freedom
#> Multiple R-squared: 0.1572, Adjusted R-squared: 0.1571
#> F-statistic: 4905 on 2 and 52605 DF, p-value: < 2.22e-16
newdata <- tail(elec, 48)
fit %>%
  forecast(new_data = newdata)
#> # A fable: 48 x 8 [30m] <Australia/Melbourne>
#> # Key: .model [1]
#> .model Time Demand .mean Temperature Date
#> <chr> <dttm> <dist> <dbl> <dbl> <date>
#> 1 shf 2014-12-31 00:00:00 t(N(8.5, 0.029)) 4888. 16.2 2014-12-31
#> 2 shf 2014-12-31 00:30:00 t(N(8.5, 0.029)) 4888. 16 2014-12-31
#> 3 shf 2014-12-31 01:00:00 t(N(8.5, 0.029)) 4888. 15.5 2014-12-31
#> 4 shf 2014-12-31 01:30:00 t(N(8.5, 0.029)) 4888. 15 2014-12-31
#> 5 shf 2014-12-31 02:00:00 t(N(8.5, 0.029)) 4888. 14.4 2014-12-31
#> 6 shf 2014-12-31 02:30:00 t(N(8.5, 0.029)) 4888. 14.3 2014-12-31
#> 7 shf 2014-12-31 03:00:00 t(N(8.5, 0.029)) 4888. 14 2014-12-31
#> 8 shf 2014-12-31 03:30:00 t(N(8.5, 0.029)) 4888. 13.8 2014-12-31
#> 9 shf 2014-12-31 04:00:00 t(N(8.5, 0.029)) 4888. 13.6 2014-12-31
#> 10 shf 2014-12-31 04:30:00 t(N(8.5, 0.029)) 4888. 13.3 2014-12-31
#> # i 38 more rows
#> # i 2 more variables: Holiday <lgl>, Day_Type <fct>
Created on 2024-03-02 with reprex v2.0.2
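In the reprex, newdata is sliced from elec, so it inherits the factor levels automatically. If the future rows were instead built from scratch, the levels would presumably need to be declared explicitly. A small base R sketch, where `day_levels` and `future_days` are hypothetical names introduced only for illustration:

```r
# Levels matching the Day_Type categories used in the reprex.
day_levels <- c("Holiday", "Weekday", "Weekend")

# Hypothetical future rows, all of which fall on a weekend.
future_days <- c("Weekend", "Weekend")

# Without `levels`, factor() infers a single level from the data.
inferred <- factor(future_days)
# Declaring the levels keeps the encoding aligned with the training data.
declared <- factor(future_days, levels = day_levels)

nlevels(inferred)
nlevels(declared)
table(declared)
```

Declaring the levels once and reusing the same vector for both the training data and any new data is one way to keep the two stages consistent.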
MRE
Created on 2022-05-23 by the reprex package (v2.0.1)