Use of the rotation argument in simulateResiduals()

florianhartig commented 1 year ago

Via email:

I have a general question about what to supply for the rotation argument in simulateResiduals.

I would like to use this argument since I read in the help pages about it as a possibility to correct for autocorrelation when the model doesn't support conditional residuals (I am using glmmTMB). I would like to assess if the RE structure / temporal correlation structure in my model is properly accounting for the temporal autocorrelation that is present.

I checked vcov(model) and VarCorr(model) and I see that the latter does provide matrices for the random effects structure but I am not sure how to translate this into an entire matrix for the residuals; I see that simulateResiduals requires a matrix of the same dimensions as the residuals themselves in order to use the rotation argument.

florianhartig commented 1 year ago

Hello,

first of all, for the background on the rotation implementation, which is still experimental, see #301 in addition to the help files.

About your question: if you have several observations per time step, you have to aggregate the residuals to one residual per time step (see help of testTemporalAutocorrelation). Then, you should be able to rotate according to the RE covariance.

Make sure that you supply the covariance in the right format, see help in testTemporalAutocorrelation.

Best, Florian

gabriellabishop commented 1 year ago

Thanks so much Florian! I've tried it out (although I didn't really understand what was going on in issue #301 / the example in the testTemporalAutocorrelation help). Could you confirm whether this seems correct?

model <- glmmTMB(RV ~ IV + (1 | RE1) + (1 | RE2) + ou(times + 0 | dummygroup), data=data, family=nbinom2)
res <- simulateResiduals(model)
cov <- VarCorr(model)
cov <- cov$cond$dummygroup #matrix for times, [296x296]
res2 <- recalculateResiduals(res, group=data$times, rotation=cov) 
#no error message is thrown, even though the grouped residuals are [296x1]

When I use testTemporalAutocorrelation with res2, I do get different results than when I calculate the residuals without the rotation argument (although sadly there is still a significant lag).

What I didn't understand from the example code in the help page was that everything was just simulated and not coming from the fittedModel, so I wasn't sure how to implement this exactly. Any further insight would be great. Thanks!

Editing to add: I noticed you added recommendation in the help pages to just use the rotation="estimated" option for glmms since the covariance matrix on the response scale is unavailable. This was of course simple for me to implement, and it indicated that my temporal autocorrelation was taken care of. Was my interpretation of this text in the help page correct?

florianhartig commented 1 year ago

Hello Gabriela,

that looks exactly right to me. Below I have a simplified example using simulated data, which shows that this should work in principle.

no error message is thrown, even though the grouped residuals are [296x1]

Which should be the number of time steps you have, right?

When I use testTemporalAutocorrelation with res2, I do get different results than when I calculate the residuals without the rotation argument (although sadly there is still a significant lag).

Yes, as you should! You should only consider the rotated residuals, as the whole point of this exercise is that the unrotated ones are correlated and thus potentially unreliable.

What I didn't understand from the example code in the help page was that everything was just simulated and not coming from the fittedModel, so I wasn't sure how to implement this exactly. Any further insight would be great. Thanks!

The example in the help was not very good - I hope the example below is better.

Editing to add: I noticed you added recommendation in the help pages to just use the rotation="estimated" option for glmms since the covariance matrix on the response scale is unavailable. This was of course simple for me to implement, and it indicated that my temporal autocorrelation was taken care of. Was my interpretation of this text in the help page correct?

I wanted to point this out anyway - for a GLMM, the estimated covariance is at the linear scale, but the DHARMa residuals are necessarily calculated at the response scale. Thus, the residual covariance is further distorted by the link function. As this distortion is non-linear, there is no "perfect" rotation for GLMMs, but possibly the covariance estimated from the response scale will be more appropriate than the covariance provided by the model (at least, that was my idea when implementing this).

UPDATE: you can see this problem in my updated simulations!

  set.seed(123)

  # Gaussian error 

  # Create AR data with 5 observations per time point
  n <- 100                                              
  x <- MASS::mvrnorm(mu = rep(0,n),
                     Sigma = .9 ^ as.matrix(dist(1:n)) )   
  y <- rep(x, each = 5) + 0.2 * rnorm(5*n)                       
  times <- factor(rep(1:n, each = 5), levels=1:n)
  levels(times)
  group <- factor(rep(1,n*5))
  dat0 <- data.frame(y,times,group)

  # fit model
  model = glmmTMB(y ~ ar1(times + 0 | group), data=dat0)

  # Standard residuals show spurious problems because of autocorrelation
  res <- simulateResiduals(model)
  plot(res)

  # grouped according to times, unrotated shows autocorrelation
  res2 <- recalculateResiduals(res, group=dat0$times) 
  testTemporalAutocorrelation(res2, time = 1:length(res2$scaledResiduals))

  # extract estimated AR1 covariance
  cov <- VarCorr(model)
  cov <- cov$cond$group # extract covariance matrix of REs

  # grouped according to times, rotated with estimated Cov - how all fine!
  res3 <- recalculateResiduals(res, group=dat0$times, rotation=cov) 
  plot(res3)
  testTemporalAutocorrelation(res3, time = 1:length(res2$scaledResiduals))

  # Poisson error 
  # note that for GLMMs, covariances will be estimated at the scale of the 
  # linear predictor, while residual covariance will be at the responses scale
  # and thus further distorted by the link. Thus, for GLMMs with a nonlinear
  # link, there will be no exact rotation for a given covariance structure

  set.seed(123)

  # Create AR data with 5 observations per time point
  n <- 100                                              
  x <- MASS::mvrnorm(mu = rep(0,n),
                     Sigma = .9 ^ as.matrix(dist(1:n)) )   
  y <- rpois(n = n*5, lambda = exp(rep(x, each = 5)))                     
  times <- factor(rep(1:n, each = 5), levels=1:n)
  levels(times)
  group <- factor(rep(1,n*5))
  dat0 <- data.frame(y,times,group)

  # fit model
  model = glmmTMB(y ~ ar1(times + 0 | group), data=dat0, family = poisson)

  res <- simulateResiduals(model)

  # grouped according to times, unrotated
  res2 <- recalculateResiduals(res, group=dat0$times) 
  testTemporalAutocorrelation(res2, time = 1:length(res2$scaledResiduals))

  # grouped according to times, rotated with estimated Cov - problems remain
  cov <- VarCorr(model)
  cov <- cov$cond$group # extract covariance matrix of REs
  res3 <- recalculateResiduals(res, group=dat0$times, rotation=cov) 
  testTemporalAutocorrelation(res3, time = 1:length(res2$scaledResiduals))

  # grouped according to times, rotated with covariance estimated from residual 
  # simulations at the response scale
  res4 <- recalculateResiduals(res, group=dat0$times, rotation="estimated") 
  testTemporalAutocorrelation(res4, time = 1:length(res2$scaledResiduals))

florianhartig commented 1 year ago

p.s. - not sure if you have already looked at this, but I just updated the example for the help, with a stronger value for the AR1 covariance, the problem with the link function in the Poisson becomes visible!

gabriellabishop commented 1 year ago

Many thanks Florian!

florianhartig commented 1 year ago

OK, this has been added to the help, will be rolled out with the next DHARMa release.

florianhartig / DHARMa

Use of the rotation argument in simulateResiduals() #364

no error message is thrown, even though the grouped residuals are [296x1]