Open MarcRieraDominguez opened 1 year ago
@MarcRieraDominguez this is a tricky problem. Note that the log-likelihood of exactly the same model differs depending on whether you fit it in the cbind frequency form or on the original data with one row per observation. Since PseudoR2 relies on the log-likelihood, it yields different results depending on how the model is formulated.
library(DescTools)
library(faraway)
# Binomial GLM: k success out of K trials
data(troutegg, package="faraway") # Survival of trout eggs
m1 <- glm(cbind(survive,total-survive) ~ location+period, family=binomial, troutegg)
summary(m1)
PseudoR2(m1)
1 - logLik(m1)/logLik(update(m1, . ~ 1))
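# Same data, expanded from the frequency table to one row per egg
d.set <- DescTools::Untable(xtabs( cbind(survive, total-survive) ~ location + period, troutegg))
colnames(d.set)[3] <- "survive"
# First level holds the survived eggs, second level the eggs that did not survive
levels(d.set$survive) <- c("yes", "no")
# Use "no" as reference so that the model describes P(survive = "yes")
d.set$survive <- relevel(d.set$survive, ref = "no")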
m2 <- glm(survive ~ location+period, family=binomial, d.set)
summary(m2)
PseudoR2(m2)
1 - logLik(m2)/logLik(update(m2, . ~ 1))
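The reason for the discrepancy can be checked directly: in the cbind form the binomial log-likelihood includes the log binomial coefficients, sum(lchoose(total, successes)), which do not depend on the parameters. This constant cancels in log-likelihood differences (deviance, likelihood-ratio tests) but not in the ratio used by McFadden's measure. Below is a minimal sketch in base R, assuming a small made-up data set (the values in `agg` and the names `m_agg`, `m_bin` are illustrative only and not taken from troutegg):

```r
# Toy grouped data (illustrative values only, an assumption for this sketch)
agg <- data.frame(
  x     = c(0, 1, 2, 3),
  succ  = c(2, 5, 9, 14),
  total = c(20, 20, 20, 20)
)

# Aggregated (cbind) form: one row per group
m_agg <- glm(cbind(succ, total - succ) ~ x, family = binomial, data = agg)

# The same data expanded to one row per trial (binary form):
# first all successes (y = 1), then all failures (y = 0)
bin <- data.frame(
  x = rep(rep(agg$x, 2), times = c(agg$succ, agg$total - agg$succ)),
  y = rep(c(1, 0), times = c(sum(agg$succ), sum(agg$total - agg$succ)))
)
m_bin <- glm(y ~ x, family = binomial, data = bin)

# Log-likelihoods of the fitted and the intercept-only models
ll_agg  <- as.numeric(logLik(m_agg))
ll_bin  <- as.numeric(logLik(m_bin))
ll0_agg <- as.numeric(logLik(update(m_agg, . ~ 1)))
ll0_bin <- as.numeric(logLik(update(m_bin, . ~ 1)))

# Constant that only the aggregated form carries
const <- sum(lchoose(agg$total, agg$succ))

# Both formulations give the same coefficients
print(all.equal(coef(m_agg), coef(m_bin), tolerance = 1e-5))

# The log-likelihoods differ exactly by the constant
print(all.equal(ll_agg - ll_bin, const, tolerance = 1e-6))

# The difference to the null model is the same in both forms ...
print(c(agg = ll_agg - ll0_agg, bin = ll_bin - ll0_bin))

# ... but the ratio, and hence McFadden's pseudo-R2, is not
mcf_agg <- 1 - ll_agg / ll0_agg
mcf_bin <- 1 - ll_bin / ll0_bin
print(c(agg = mcf_agg, bin = mcf_bin))
```

If a value comparable with the binary formulation is wanted from a cbind fit, one option under these assumptions is to subtract `const` from both log-likelihoods before forming the ratio.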
Nevertheless we will look into this issue!
Thank you for your reply! I had no idea the log-likelihood could depend on model formulation, good to know!
Hi! Congratulations on the great package! I have come across an unexpected behaviour of the PseudoR2 function: it returns a wrong value of McFadden's Pseudo-R2 for binomial GLMs that model counts of successes and failures. It appears to work fine with binomial GLMs that model binary responses. I provide a reprex with an overdispersed binomial GLM; it shows the same warning as my own data, which are not overdispersed:
In res["Tjur"] <- unname(diff(tapply(y.hat.resp, y, mean, na.rm = TRUE))) : number of items to replace is not a multiple of replacement length
Previously reported issues with PseudoR2 did not deal with binomial GLMs explicitly, as far as I can tell: https://github.com/AndriSignorell/DescTools/issues/19
Thank you for your time!!
Created on 2023-10-04 with reprex v2.0.2