ngreifer / cobalt

Covariate Balance Tables and Plots - An R package for assessing covariate balance
https://ngreifer.github.io/cobalt/

bal.plot() doesn't plot binary categorical variables correctly #48

Closed sbaross closed 3 years ago

sbaross commented 3 years ago

When plotting binary categorical data (e.g., sex with the values "male" and "female"), bal.plot() gives the message "The dropped category for [variable] will be set to NA." and plots all bars as 100%. This doesn't occur if the variable is recoded as 0/1 or if it has 3 or more possible values. I'm using MatchIt for matching; I don't know whether this behaviour occurs with other packages.

I'm using cobalt v4.2.4 & MatchIt v3.0.2.

binary categorical variable (male/female)

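library(dplyr)    # tibble(), mutate(), recode(), %>%
library(MatchIt)  # matchit()
library(cobalt)   # bal.plot()

df <- tibble(sex = sample(c("male", "female"), 100, replace = TRUE),
             group = sample(c(0, 1), prob = c(0.7, 0.3), 100, replace = TRUE))

m.out <- matchit(group ~ sex, data = df)
bal.plot(m.out, "sex")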

The dropped category for sex will be set to NA.

[Image 1: resulting plot]

binary numeric variable (0/1)

df2 <- df %>%
  mutate(sex = recode(sex, "male" = 0, "female" = 1))

m.out <- matchit(group ~ sex, data = df2)
bal.plot(m.out, "sex")

[Image 2: resulting plot]

categorical variable with 3 values (male/female/unknown)

df3 <- tibble(sex = sample(c("male", "female", "unknown"), 100, replace = TRUE),
              group = sample(c(0, 1), prob = c(0.7, 0.3), 100, replace = TRUE))

m.out <- matchit(group ~ sex, data = df3)
bal.plot(m.out, "sex")

[Image 3: resulting plot]

ngreifer commented 3 years ago

Thank you for letting me know about this! I have fixed it in the development version, which you can install from GitHub. It turns out I had already built in a fix for this but forgot to finish the implementation, so it was a quick fix.
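
For anyone following along, here is a minimal sketch of one way to install the development version from GitHub. The thread does not say which installer to use, so the remotes package here is an assumption; the repository name is taken from the page header.

```r
# Assumption: 'remotes' is used as the installer (not specified in the thread).
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install the development version of cobalt from the ngreifer/cobalt repository.
remotes::install_github("ngreifer/cobalt")

# Check which version is now installed.
packageVersion("cobalt")
```

Restart the R session after installing so the new version is the one that gets loaded, then re-run the bal.plot() call on the binary categorical variable to check whether the plot is now correct.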