ngreifer / cobalt

Covariate Balance Tables and Plots - An R package for assessing covariate balance
https://ngreifer.github.io/cobalt/

Error with bal.tab when using the addl argument #71

Closed: eipi10 closed this issue 9 months ago

eipi10 commented 1 year ago

I'm getting an error that occurs with my real data, but that I can't reproduce in a small example. I'm running bal.tab on a matchit object. It works fine if I just do bal.tab(m.out), which produces the expected balance table. But if I use the addl argument to add an additional variable or variables that were not included in the matching, I get an error. The workflow looks like this:

m.out = matchit(treat ~ gpa + gender + eth + mother.ed + units.earned,
                distance="glm", method="nearest", exact=c("gender", "eth"),
                ratio=1, data=mdat)

bal.tab(m.out, addl="work.for.pay")
Error in `bal.tab()`:
! %s must have the same number of observations as %s`addl`in the original call to `matchit()`.
Run `rlang::last_trace()` to see where the error occurred.

work.for.pay is a column in mdat, the data provided to matchit, and it has no missing values. I've tried with several other variables not included in the matching formula, some categorical, some numeric, some with missing values, and some with complete cases, but I always get the same error. Do you have any ideas what's causing this and how I can avoid the error?

ngreifer commented 1 year ago

I definitely need to correct that error! I think you need to supply addl as a formula, not a string. Can you try that?

eipi10 commented 1 year ago

Maybe I'm not understanding what addl is expecting. I've tried the following and they all fail with the same error:

bal.tab(m.out, addl= treat ~ work.for.pay)
bal.tab(m.out, addl= ~ work.for.pay)
bal.tab(m.out, addl= "treat ~ work.for.pay")
bal.tab(m.out, addl= "~ work.for.pay")

ngreifer commented 1 year ago

Yeah, this is a bug then. It expects the second option btw. Thanks for letting me know about it and sorry for the confusion. You can always use the formula interface to bal.tab(), which hopefully is less buggy:

bal.tab(treat ~ gpa + gender + eth + mother.ed + units.earned + work.for.pay,
        data = mdat, weights = m.out)

eipi10 commented 1 year ago

Actually, I just re-read the help for bal.tab.matchit and it does say that for addl one can provide a "character vector containing their names", so it seems like it should also work with the character string.

ngreifer commented 1 year ago

Yup, it's a bug.

eipi10 commented 1 year ago

I'm both pleased and mortified to report that this is not a cobalt bug. I was accidentally removing the relevant additional columns from the data frame I passed to matchit. But then because I had removed a few rows of data for the "Nonbinary" students (as we discussed in my previous issue), when I tried using an external data frame, it didn't have the correct number of rows. This was all going on in a function I wrote that obviously needs a few more tests to make sure it's doing what I think it's doing. So, it was operator error. Sorry to have wasted your time.
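
For anyone who hits the same message: the root cause described above (extra columns dropped from the data passed to matchit, plus an external data frame with a different number of rows) can be checked before calling bal.tab. The following is a minimal base-R sketch of such a check. The helper name check_addl_data and the toy data are hypothetical, made up for illustration; they are not part of cobalt or MatchIt.

```r
# Hypothetical helper (not part of cobalt or MatchIt): reports whether the
# data frame holding the additional covariates lines up with the data frame
# that was actually used for matching.
check_addl_data <- function(matched_data, addl_data, addl_vars) {
  problems <- character(0)

  # Every requested additional variable must exist as a column.
  missing_vars <- setdiff(addl_vars, names(addl_data))
  if (length(missing_vars) > 0) {
    problems <- c(problems,
                  paste0("missing columns: ",
                         paste(missing_vars, collapse = ", ")))
  }

  # Row counts must agree, otherwise the covariates cannot be aligned
  # with the matched units.
  if (nrow(addl_data) != nrow(matched_data)) {
    problems <- c(problems,
                  sprintf("row count mismatch: %d vs %d",
                          nrow(addl_data), nrow(matched_data)))
  }

  problems
}

# Toy data standing in for the real data set (values are invented).
full <- data.frame(treat = c(1, 0, 1, 0, 1),
                   gpa = c(3.1, 2.8, 3.6, 3.0, 2.5),
                   work.for.pay = c(1, 0, 0, 1, 1))

# Mimic the mistake: one row filtered out and the extra column dropped
# before matching.
used <- full[-5, c("treat", "gpa")]

res_bad  <- check_addl_data(used, full, "work.for.pay")  # external data, wrong rows
res_self <- check_addl_data(used, used, "work.for.pay")  # column was dropped
res_ok   <- check_addl_data(full, full, "work.for.pay")  # consistent inputs
```

An empty result means the two data frames are consistent on these two points. A check like this only catches the mismatch described in this thread; it does not verify that the rows are in the same order.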