florianhartig / DHARMa

Diagnostics for HierArchical Regression Models
http://florianhartig.github.io/DHARMa/

Add support for ordinal / clmm2 #34

Open florianhartig opened 7 years ago

florianhartig commented 7 years ago

Request: see the discussion here: https://twitter.com/marcnicklas/status/910423954464092160

Perspective on this request: as far as I know, this package does not yet include a simulate function, which is a minimum requirement for inclusion in DHARMa. I am reluctant to implement such a function myself, except for essential packages, because it is difficult for me to guarantee compatibility with future updates. The respective model package is a better place for such a function. Feel free to request a simulate function from the package developers. Once such a function is available, I am more than willing to add support for the package in DHARMa.

Interim solution for users who want to use DHARMa with glmmTMB: take your fitted model, write a simulate function for this model structure yourself, and then use createDHARMa (see help). This will allow most options of the package to be run.

florianhartig commented 7 years ago

So, to make this work, we would need to write a simulate function for clmm2; some hints here.

I currently don't have the time to do this, so I have to put it on the stack. But if you can simulate from your model, you can read in the simulations via createDHARMa with integerResponse = T, and I think this should work (this should of course be tested on simulated data).
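A minimal sketch of that workflow, under stated assumptions: the matrix probs below is a synthetic stand-in for the fitted category probabilities of an ordinal model (built here from a made-up predictor x and made-up cutpoints cuts), and simulate_categories is a hypothetical helper, not part of DHARMa or ordinal. With a real clmm2 fit, probs would have to come from the fitted model, and random effects would need to be handled in the simulation as well. Because the "observed" data here are drawn from the same probabilities, this doubles as the check on simulated data mentioned above.

```r
set.seed(1)
n <- 200      # number of observations
K <- 4        # number of ordered categories
nsim <- 250   # number of simulations per observation

# ASSUMPTION: synthetic stand-in for fitted category probabilities.
# cum[i, k] = P(Y_i <= k) under a cumulative logit model with cutpoints cuts.
x <- rnorm(n)
cuts <- c(-1, 0, 1)
cum <- sapply(cuts, function(k) plogis(k - x))   # n x (K - 1)
probs <- cbind(cum, 1) - cbind(0, cum)           # n x K, rows sum to 1

# Hypothetical helper: draw nsim category codes (1..K) per observation.
# Returns an n x nsim integer matrix (rows = observations).
simulate_categories <- function(probs, nsim) {
  sims <- vapply(seq_len(nrow(probs)), function(i) {
    sample.int(ncol(probs), size = nsim, replace = TRUE, prob = probs[i, ])
  }, integer(nsim))
  matrix(sims, nrow = nrow(probs), ncol = nsim, byrow = TRUE)
}

# "Observed" response drawn from the same model, so the residuals
# should look uniform if the approach works.
observed <- simulate_categories(probs, 1)[, 1]
sims <- simulate_categories(probs, nsim)

# Expected category code, used only as the x-axis for residual plots
expected <- as.vector(probs %*% seq_len(K))

if (requireNamespace("DHARMa", quietly = TRUE)) {
  res <- DHARMa::createDHARMa(simulatedResponse = sims,
                              observedResponse = observed,
                              fittedPredictedResponse = expected,
                              integerResponse = TRUE)
}
```

If DHARMa is installed, plot(res) should then show the usual residual diagnostics. Treating the ordered categories as integer codes is the assumption being tested here, so clear deviations from uniformity in this correctly specified setting would point to a problem with the approach rather than with the model.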

JeremyGelb commented 4 years ago

Hi florianhartig,

I discovered DHARMa recently and I really appreciate the work you have put into it. Regarding multinomial response variables, I was wondering: if I fit a multinomial logistic model, would it make sense to analyze the residuals as if they came from a set of binomial logistic models? Say I have a fitted model called model; I could:

  1. extract with a loop the predicted probabilities of each category (minus the one used as a reference)
  2. extract the observed value as zeros (reference category) and ones (other category)
  3. use the probabilities in 1) to simulate observations with rbinom
  4. combine all of these and plug it in createDHARMa

Some code to illustrate:

# we have 4 categories A,B,C,D and A is the reference
categories <- c("B","C","D")
nsim <- 1000
# extracting the predictions on the link scale
predicted <- predict(model, type = "link")
# inverse logit; plogis avoids the overflow of exp(x)/(1+exp(x)) for large x
ilink <- function(x){plogis(x)}

# looping over each non-reference category
data_sims <- lapply(seq_len(ncol(predicted)), function(i){
  category <- categories[[i]]
  # data is the original dataset; keep only rows in the reference or current category
  test <- data$Y %in% c("A", category)
  probs <- ilink(predicted[test, i])
  # observed response: 0 for the reference category, 1 for the current one
  real <- ifelse(data$Y[test] == "A", 0, 1)
  all_probs <- cbind(1 - probs, probs)
  return(list(real = real, probs = all_probs))
})

# combining the per-category results
all_probs <- do.call(rbind, lapply(data_sims, function(x){x$probs}))
all_real <- unlist(lapply(data_sims, function(x){x$real}))

# simulating data: one row per observation, one column per simulation
simulations <- lapply(seq_len(nrow(all_probs)), function(i){
  rbinom(nsim, size = 1, prob = all_probs[i, 2])
})

matsim <- do.call(rbind, simulations)

sim_res <- createDHARMa(simulatedResponse = matsim,
                        observedResponse = all_real,
                        fittedPredictedResponse = all_probs[, 2],
                        integerResponse = TRUE)

Would this approach be appropriate?

Thank you for your time and your help!