robjhyndman / forecast

Forecasting Functions for Time Series and Linear Models
http://pkg.robjhyndman.com/forecast
1.13k stars 342 forks source link

Degrees-of-freedom adjustment in Ljung-Box test in checkresiduals() #908

Closed zeileis closed 1 year ago

zeileis commented 2 years ago

Problem: When applying the Ljung-Box test to residuals from ARMA(p, q) models with k lags, Ljung & Box (1978) and also Box & Pierce (1970) say that the degrees of freedom need to be adjusted by p+q, i.e., leading to a chi-squared distribution with k - (p + q) degrees of freedom. However, a student of mine noticed that checkresiduals() actually adjusts by p + q + 1 because forecast:::modeldf() uses the number of coefficients which by default includes the mean. See below for an illustration.

Fix: I'm not sure what is the best fix here because I haven't looked at the use of the the modeldf() extractor in the package. Either one could change the extractor or have a custom extractor function for checkresiduals(). Let me know if I can help with anything or test ideas.

Ilustration: To illustrate that this generates downward-biased p-values let's use the following simple function:

simulate_pvalue <- function(model, nobs = 250, lag = 12, ...) {
  ## simulate ARMA series
  y <- arima.sim(model = model, n = nobs, ...)

  ## fit ARMA model with mean and true model order
  p <- length(model$ar)
  q <- length(model$ma)
  m <- forecast::Arima(y, order = c(p, 0, q), include.mean = TRUE)

  ## compute p-value and suppress output
  out <- capture.output(rval <- forecast::checkresiduals(m, lag = lag, plot = FALSE))
  c(statistic = rval$statistic, p.value = rval$p.value)
}

As the correct model order is used for model fitting, the p-values from the test should be uniformly distributed under the null hypothesis. Here I just check an AR(2) model but you get analogous results for other ARMA models.

set.seed(0)
s <- replicate(5000, simulate_pvalue(model = list(ar = c(1.5, -0.75))))

First, I extract the simulated p-values from checkresiduals() and visualize the histogram (see below). This is right-skewed and consequently the ECDF at nominal 1%, 5%, and 10% levels is inflated.

p <- s["p.value", ]
hist(p)
ecdf(p)(c(0.01, 0.05, 0.1))
## [1] 0.0210 0.0836 0.1568

Using just df = 12 - 2 instead of the 12 - 3 degrees of freedom leads to a much more uniform histogram and empirical sizes that are much closer to their nominal level.

p_adj <- pchisq(s["statistic.Q*", ], df = 12 - 2, lower.tail = FALSE)
hist(p_adj)
ecdf(p_adj)(c(0.01, 0.05, 0.1))
## [1] 0.0138 0.0558 0.1080

Histograms of original and adjusted p-values from Ljung-Box test

robjhyndman commented 2 years ago

Thanks @zeileis. It seems this is a problem for more than the ARIMA models. I can fix the modeldf.Arima() function, but I'll need to think about the other model.xx() functions. I had naively assumed k could be set to the number of parameters in the corresponding model, but simulations like you've done above show this is not true.

robjhyndman commented 2 years ago

Problem resolved for ARIMA models in https://github.com/robjhyndman/forecast/commit/a2d8b3cb1c55a72253a3ee807f8334c6951c78b1

zeileis commented 2 years ago

Thanks for the feedback, Rob @robjhyndman, this all makes sense. I was also surprised because I had expected that estimation of the mean needs to be accounted for. Thanks for solving the problem for ARIMA problems now.

robjhyndman commented 1 year ago

Fixed in https://github.com/robjhyndman/forecast/commit/f72e400fdb18fdc2154c19adfc9932380e2f6c20 Now fitdf is set to 0 except for ARIMA models, in which case fitdf = p+q+P+Q.

Corresponding change made to fpp2: https://otexts.com/fpp2/residuals.html#portmanteau-tests-for-autocorrelation

zeileis commented 1 year ago

Thanks, much appreciated!