ngreifer / cobalt

Covariate Balance Tables and Plots - An R package for assessing covariate balance
https://ngreifer.github.io/cobalt/

Unusual results with CBPS and sample weights #7

Closed: kkranker closed this issue 6 years ago

kkranker commented 6 years ago

I have constructed a CBPS object using CBPS(.... , sample.weights=mydata$myweight). Afterward, cobalt's bal.tab() function gives odd results. For example, the means from bal.tab() don't match the means from the balance() function in the CBPS package. I don't think bal.tab() is incorporating the sample.weights appropriately. Any suggestions?

Here's some example code to reproduce the problem:

library(CBPS)
data(LaLonde)
# Simulated sampling weights: mean 1 for controls, 1.5 for treated units
LaLonde$wgt <- rnorm(nrow(LaLonde), mean = 1 + 0.5 * LaLonde$treat, sd = 0.05)
fit <- CBPS(treat ~ age + educ + re75 + re74 + I(re75==0) + I(re74==0), data = LaLonde, ATT = TRUE, sample.weights = LaLonde$wgt)
balance(fit)
library(cobalt)
bal.tab(fit, disp.means = TRUE)

I have CBPS version 0.17 and cobalt version 3.2.0.

ngreifer commented 6 years ago

Hi Keith,

Thank you so much for letting me know about this. It is indeed a bug with cobalt. cobalt currently does not recognize sampling weights from CBPS objects. I will change this and ensure it works in the next update.

However, I was able to replicate the results from CBPS::balance() using bal.tab() on a weightit object. WeightIt is a package that provides a single interface to many other R packages for estimating weights, including CBPS, extends the capabilities of those packages, and is compatible with cobalt. The following code provides exactly the same input and estimates the same weights that your call to CBPS does:

library(WeightIt)
fit.w <- weightit(treat ~ age + educ + re75 + re74 + I(re75==0) + I(re74==0),
                  data = LaLonde, estimand = "att", s.weights = LaLonde$wgt,
                  method = "cbps")

Using bal.tab() on fit.w gives the correct results that match with CBPS::balance():

bal.tab(fit.w, disp.means = TRUE)

Note that the "unadjusted" values are actually adjusted with just the sampling weights, and the "adjusted" values are adjusted with both the sampling and balancing weights. As with CBPS objects, you can use get.w(fit.w) to extract the combined balancing and sampling weights from the weightit object.
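To make the distinction between the two columns concrete, here is a minimal base-R sketch of the arithmetic, not cobalt's actual implementation. The data, the weight values, and the helper group_means() are all hypothetical and chosen only for illustration; the balancing weights assume an ATT-style setup in which treated units keep a weight of 1.

```r
# Toy data (hypothetical values, for illustration only)
treat <- c(1, 1, 1, 0, 0, 0, 0)
age   <- c(25, 30, 35, 20, 40, 45, 50)
s_w   <- c(1.5, 1.4, 1.6, 1.0, 0.9, 1.1, 1.0)  # sampling weights
b_w   <- c(1.0, 1.0, 1.0, 0.8, 0.3, 0.2, 0.1)  # balancing weights (assumed ATT: treated = 1)

# Hypothetical helper: weighted mean of x within each level of g
group_means <- function(x, w, g) {
  sapply(split(seq_along(x), g), function(i) weighted.mean(x[i], w[i]))
}

# "Unadjusted": sampling weights only
unadjusted <- group_means(age, s_w, treat)

# "Adjusted": sampling weights multiplied by balancing weights
adjusted <- group_means(age, s_w * b_w, treat)

print(rbind(unadjusted, adjusted))
```

In this toy example the treated-group mean is the same in both rows, because the treated balancing weights are all 1, while the control-group mean moves toward the treated mean once the balancing weights are applied.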

kkranker commented 6 years ago

Okay -- thanks for the quick reply and explaining the workaround with weightit().

ngreifer commented 6 years ago

Issue fixed. Summary statistics should be correct now. For the defaults to work correctly, include the sampling weights in the call to bal.tab(), as in

bal.tab(fit, s.weights = LaLonde$wgt)

I'm closing this issue now. Thank you for raising it.

kkranker commented 6 years ago

Thanks!