cambiotraining / stats-glm

https://cambiotraining.github.io/stats-glm/
Other
0 stars 0 forks source link

add confidence intervals to model outcomes #1

Open tavareshugo opened 3 months ago

tavareshugo commented 3 months ago

currently, there is code to plot the fitted line (using broom::augment()), but it would be nice to also plot the confidence interval of the estimate. I'm not sure broom::augment() can do this, but here is some "manual" code for the logistic case (edit: ignore the code below, see my next comment):

library(ggplot2)

#### simulate data ####

# sample size
n_samples <- 30

# x values uniformly sampled from -1 to 1
sim <- data.frame(x = runif(n_samples, -1, 1))

# successes sampled from binomial
# model is:
# Y ~ Binom(n, p)
# logit(p) = beta0 + beta1*x
# number of trials 'n' is simulated as 10
# using beta0 = 0 and beta1 = 2
# the probability 'p' is calculated using the inverse logit (plogis)
sim$success <- rbinom(n_samples, 
                      size = 10, 
                      prob = plogis(2*sim$x))
sim$fail <- 10 - sim$success

#### fit the model ####

fit <- glm(cbind(success, fail) ~ x, data = sim, family = "binomial")

# the confidence interval of the estimate
confint(fit)

# create a table of predicted values - requires knowing about the link function (and its inverse)
fit_pred <- data.frame(x = seq(min(sim$x), max(sim$x), length.out = 500)) |> 
  transform(pred = plogis(fit$coefficients[1] + fit$coefficients[2]*x),
            lo = plogis(confint(fit)[1,1] + confint(fit)[2,1]*x),
            hi = plogis(confint(fit)[1,2] + confint(fit)[2,2]*x))

# visualise
sim |> 
  ggplot(aes(x)) + 
  geom_ribbon(data = fit_pred, 
              aes(ymin = lo, ymax = hi), alpha = 0.2) +
  geom_line(data = fit_pred, aes(y = pred)) +
  geom_point(aes(y = success/(success+fail)))

A similar thing should work for the Poisson model, using exp() as the inverse of the link function.

tavareshugo commented 3 months ago

I guess that manual approach will be harder if the model is more complex (multiple predictors, interactions, etc.).

Maybe insight::get_predicted_ci() (here) is an easier alternative (haven't tested it).