jthaman / ciTools

An R Package for Quick Uncertainty Intervals
GNU General Public License v3.0

Failure in bootstrapped CI due to missing factor levels #52

Open billdenney opened 3 years ago

billdenney commented 3 years ago

When a model is fit on data that is missing some factor levels, it unsurprisingly cannot predict rows that have those levels. The issue arises with add_ci(type="boot"): predict() is called on models that were updated on data lacking some levels, and the data passed to predict() still contains them. The reprex below illustrates the issue.

set.seed(5)
library(ciTools)
#> ciTools version 0.6.1 (C) Institute for Defense Analyses
library(tidyverse)

my_data <-
  data.frame(
    counts=c(18,17,15,20,10,20,25,13,12),
    outcome=gl(3,1,9),
    treatment=gl(3,3)
  )

my_model <- glm(counts ~ outcome + treatment, data=my_data)

# Works: the default interval is computed from the original fit, which has seen every factor level
my_new_data <-
  my_data %>%
  add_ci(fit=my_model)

# Fails: a refit can lack a factor level (here treatment 2), and predict() is then asked for rows that have it
my_new_data <-
  my_data %>%
  add_ci(fit=my_model, type="boot")
#> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels): factor treatment has new levels 2

Created on 2021-05-17 by the reprex package (v2.0.0)
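
The mechanism can be isolated without ciTools. The sketch below is base R only and is not the package's internal code: it assumes the bootstrap refits the model on resampled rows and then predicts on the full data, and it hand-picks a "resample" with no treatment 2 rows in place of a random draw. The names resample, refit, known and pred are made up for illustration. The last step shows one possible workaround (predict only rows whose levels the refit has seen and leave the rest as NA); a stratified resample would be another option.

```r
# Same data and model as in the reprex above.
my_data <- data.frame(
  counts = c(18, 17, 15, 20, 10, 20, 25, 13, 12),
  outcome = gl(3, 1, 9),
  treatment = gl(3, 3)
)
my_model <- glm(counts ~ outcome + treatment, data = my_data)

# Hand-picked stand-in for an unlucky bootstrap draw:
# rows 4 to 6 (treatment == 2) never appear.
resample <- my_data[c(1, 2, 3, 7, 8, 9, 1, 2, 3), ]

# Refit on the resample; glm() drops the unused level,
# so the refit only knows treatment levels "1" and "3".
refit <- update(my_model, data = resample)
refit$xlevels$treatment

# Predicting on the full data hits the unseen level and errors.
err <- tryCatch(
  predict(refit, newdata = my_data),
  error = function(e) conditionMessage(e)
)
err

# Possible workaround (an assumption, not the package's fix):
# predict only rows whose treatment level the refit has seen.
known <- my_data$treatment %in% refit$xlevels$treatment
pred <- rep(NA_real_, nrow(my_data))
pred[known] <- predict(refit, newdata = my_data[known, ])
pred
```

With this approach, rows carrying the missing level contribute NA for that replicate, so any quantile computed across replicates would need na.rm = TRUE.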