Closed USMortality closed 1 year ago
Most likely you are rounding before calculating. Here is the calculation using R:
library(fpp3)
fc <- data.frame(
year = seq(1, 5),
asmr = c(882, 876, 879, 871, 867)
) |>
as_tsibble(index = year) |>
model(TSLM(asmr ~ trend())) |>
forecast(h = 5)
fc |>
mutate(
.var = distributional::variance(asmr),
.lo = .mean - 1.96*sqrt(.var),
.hi = .mean + 1.96*sqrt(.var)
) |>
hilo(level = 95)
#> # A tsibble: 5 x 8 [1]
#> # Key: .model [1]
#> .model year asmr .mean .var .lo .hi `95%`
#> <chr> <dbl> <dist> <dbl> <dbl> <dbl> <dbl> <hilo>
#> 1 TSLM(asmr ~ t… 6 N(864, 16) 864. 16.4 857. 872. [856.5507, 872.4493]95
#> 2 TSLM(asmr ~ t… 7 N(861, 22) 861 21.9 852. 870. [851.8209, 870.1791]95
#> 3 TSLM(asmr ~ t… 8 N(858, 29) 858. 29.0 847. 868. [846.9483, 868.0517]95
#> 4 TSLM(asmr ~ t… 9 N(854, 38) 854 37.6 842. 866. [841.9817, 866.0183]95
#> 5 TSLM(asmr ~ t… 10 N(850, 48) 850. 47.8 837. 864. [836.9517, 864.0483]95
Created on 2023-10-07 with reprex v2.0.2
Thanks for your reply. Your code is fine - however, the discrepancy seems to be in the variance/.var
value.
Here's the R code that calculates the variance/.var
manually, and compares it to the fabletools calculations. Only the MEAN approach matches exactly. I think it all comes down to the formulas used in the .var = ...
step, which I took from table 5.2 (chapter 5.5) of your book. What might I be missing? Thanks!
library(fpp3)
# Data
ts <- data.frame(
year = seq(1, 5),
asmr = c(882, 876, 879, 871, 867)
) |> as_tsibble(index = year)
T_len <- nrow(ts)
# MEAN
model <- ts |>
mutate(.fitted = mean(asmr)) |>
mutate(.resid = asmr - .fitted)
o <- sd(model$.resid)
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = model$.fitted[1],
h = year - T_len
) |>
mutate(.var = (o * sqrt(1 + 1 / T_len))^2)
fc_fabletools <- ts |>
model(MEAN(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
identical(fc$.var, fc_fabletools$.var) # TRUE
# NAIVE
model <- ts |>
mutate(.fitted = lag(asmr, 1)) |>
mutate(.resid = asmr - .fitted)
o <- sd(model$.resid, na.rm = TRUE)
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = model$asmr[T_len],
h = year - T_len
) |>
mutate(.var = (o * sqrt(h))^2)
fc_fabletools <- ts |>
model(MEAN(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
identical(fc$.var, fc_fabletools$.var) # FALSE
# TSLM/trend
fit <- lm(asmr ~ year, data = ts)
model <- ts |>
mutate(.fitted = fit$coefficients[2] * year + fit$coefficients[1]) |>
mutate(.resid = asmr - .fitted)
o <- sd(model$.resid)
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = fit$coefficients[2] * year + fit$coefficients[1],
h = year - T_len
) |>
mutate(.var = (o * sqrt(h * (1 + h / (T_len - 1))))^2)
fc_fabletools <- ts |>
model(TSLM(asmr ~ trend())) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
identical(fc$.var, fc_fabletools$.var) # FALSE
There are several issues here.
all.equal()
rather than identical()
as it allows for floating point error.NAIVE
code, you fit a MEAN
model.TSLM
code, you compare against the drift variance, which is not the same thing. mean(model$.resid^2, na.rm=TRUE)
because we know the mean is zero (assuming unbiasedness), so there is no need to estimate it. This is correctly implemented for NAIVE
, but not for MEAN
. I'll get it fixed. o <- sqrt(mean(model$.resid^2, na.rm=TRUE))
in the NAIVE
example, fit a NAIVE
model, and replace identical()
with all.equal()
, the results match.Ah ok, that makes sense. Thanks for pointing out the errors.
So, if I understand your reply correctly, fable::MEAN
implementation should also be using the mean of the squared residuals.
Could it be, that this is also the case for the fable::TSLM
function - because even omitting the + trend()
results in a variance of 43.8, identical to the current MEAN variance.
model <- ts |>
mutate(.fitted = mean(asmr)) |>
mutate(.resid = asmr - .fitted)
o <- sqrt(mean(model$.resid^2, na.rm = TRUE))
mean1 <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = model$.fitted[1],
h = year - T_len
) |>
mutate(.var = (o * sqrt(1 + 1 / T_len))^2)
mean2 <- fc_fabletools <- ts |>
model(MEAN(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
tslm <- fc_fabletools <- ts |>
model(TSLM(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
Shouldn't these three all produce identical results?
Ah, sorry. My mistake. We do need to account for the number of parameters estimated when estimating the residual variance. So my point 4 above is only true for models with no parameters, such as NAIVE
. It is not true for the MEAN
model which has one parameter.
For the MEAN
model, because there is one parameter, the residual variance should be sum(res^2)/(T-1)
. Because the sample mean of the residuals is identical to zero, this turns out to be equal to var(res)
, and the fable code is correct.
For the NAIVE
model, there are no parameters, so the residual variance should be sum(res^2)/T
, as I calculated above.
For the TSLM
model, where just a constant is fitted, it is identical to the MEAN
model, and the results should be the same. In your last block of code, if you revert to o <- sd(model$.resid)
, the three models should be the same.
If you do go ahead and check the results for the drift method, that also has one parameter, so the residual variance should be sum(res^2)/(T-1)
.
In general, the residual variance should be sum(res^2)/(T-k)
for models with k estimated parameters. That's true even if the sample mean of the residuals is different from zero, because the population mean is zero by the unbiasedness assumption.
Great, that makes a lot of sense and is inline with what I've read from your book/website! Still doesn't match all, but I think we're almost there, please take a look!
library(fpp3)
# Data
ts <- data.frame(
year = seq(1, 5),
asmr = c(882, 876, 879, 871, 867)
) |> as_tsibble(index = year)
T_len <- nrow(ts)
# A) MEAN
model <- ts |>
mutate(.fitted = mean(asmr)) |>
mutate(.resid = asmr - .fitted)
k <- 1
o <- sqrt(sum(model$.resid^2, na.rm = TRUE) / (T_len - k))
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = model$.fitted[1],
h = year - T_len
) |>
mutate(.var = (o * sqrt(1 + 1 / T_len))^2)
fc_fabletools <- ts |>
model(MEAN(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
all.equal(fc$.var, fc_fabletools$.var) # TRUE
# B) NAIVE
model <- ts |>
mutate(.fitted = lag(asmr, 1)) |>
mutate(.resid = asmr - .fitted)
k <- 1 # Shouldn't this be zero?
o <- sqrt(sum(model$.resid^2, na.rm = TRUE) / (T_len - k))
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = model$asmr[T_len],
h = year - T_len
) |>
mutate(.var = (o * sqrt(h))^2)
fc_fabletools <- ts |>
model(NAIVE(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
all.equal(fc$.var, fc_fabletools$.var) # TRUE
# C) TSLM
model <- ts |>
mutate(.fitted = mean(asmr)) |>
mutate(.resid = asmr - .fitted)
k <- 1
o <- sqrt(sum(model$.resid^2, na.rm = TRUE) / (T_len - k))
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = model$.fitted[1],
h = year - T_len
) |>
mutate(.var = (o * sqrt(1 + 1 / T_len))^2)
fc_fabletools <- ts |>
model(TSLM(asmr)) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
all.equal(fc$.var, fc_fabletools$.var) # TRUE
# D) TSLM/trend
fit <- lm(asmr ~ year, data = ts)
model <- ts |>
mutate(.fitted = fit$coefficients[2] * year + fit$coefficients[1]) |>
mutate(.resid = asmr - .fitted)
k <- 1
o <- sqrt(sum(model$.resid^2, na.rm = TRUE) / (T_len - k))
fc <-
data.frame(year = seq(6, 10)) |>
mutate(
.fitted = fit$coefficients[2] * year + fit$coefficients[1],
h = year - T_len
) |>
mutate(.var = (o * sqrt(h * (1 + h / (T_len - 1))))^2)
fc_fabletools <- ts |>
model(TSLM(asmr ~ trend())) |>
forecast(h = 5) |>
mutate(.var = distributional::variance(asmr))
all.equal(fc$.var, fc_fabletools$.var) # FALSE
Four cases:
A) MEAN
PASS
B) NAIVE
PASS - but only with k=1, shouldn't it be 0, as you wrote above?
C) TSLM
PASS
D) TSLM/trend
FAIL - tested k=1 and 0, both fail... not sure why?
I think the issue might be here: mutate(.var = (o * sqrt(h * (1 + h / (T_len - 1))))^2)
- the implementation of the 'h-step forecast standard deviation' - it should match table 5.2 in your book?
B) T is the number of non-missing values. With NAIVE
, there is one non-missing value due to the first observation not having an available lag.
D) You are using the variance for a drift model, which is not the same as the variance of a linear time trend. The expression for a trend variance is given at the bottom of https://otexts.com/fpp3/regression-exercises.html
Ok, thanks. Makes sense.
I've looked at the formula, and it seems that D) is not as straight forward to manually verify then...
Feel free to close this issue. Thanks again!
When manually calculating the 95%PI with the formulas described here: https://otexts.com/fpp3/prediction-intervals.html
For the TSLM/trend method - I am seeing a slight discrepancy between my excel calculations and the R fable package.
Here's a minimal reproducible example:
Output:
Sheet: https://docs.google.com/spreadsheets/d/1hbzpwZH8ctNWZ0PSQIZ8bUBlbJLC3DhQGne8UImOZEQ/edit#gid=409932088
Note, that I have similarly verified the MEAN and NAIVE methods without problems. Both the residuals and
o
value match, but the resulting Prediction Intervals are slightly different.I would appreciate it if anyone could please verify this. Thanks