vdorie / bartCause

Causal Inference using Bayesian Additive Regression Trees
77 stars 10 forks source link

Individual-Level Treatment Effects #2

Open ferlocar opened 5 years ago

ferlocar commented 5 years ago

Hi!

Great package. Thanks for pulling this off. I was wondering is there is a way to use this package to make individual level treatment effect predictions. That is, I want to know whether I can use historical data to estimate the BART model and then use the estimated model to predict the treatment effect of new individuals based on their features.

vdorie commented 5 years ago

Not easily, at least at the moment. Eventually I'll write a predict function but I'm currently working on overhauling how base-BART stores trees. You can kludge something together by running code such as:

# takes as input:
#    y.train, z.train, x.train - training response, treatment, and confounders
#             z.new,   x.new   - out of sample treatment and confounders
fit <- bartc(y.train, z.train, x.train,
             method.trt = "bart", keepTrees = TRUE)

pscore.new <- apply(predict(fit$fit.trt, x.new), 3, mean)
# get predictors in format response model uses
x.test <- cbind(x.new, z = 1, ps = pscore.new)

# add in control observations
x.test <- rbind(x.test, x.test)

trtRows <- seq_len(nrow(x.new))
ctlRows <- seq.int(nrow(x.new) + 1, nrow(x.test))

x.test[ctlRows,"z"] <- 0

# might emit a warning, but should be harmless
pred <- predict(fit$fit.rsp, x.test)

mu.hat.1 <- pred[,,trtRows]
mu.hat.0 <- pred[,,ctlRows]
ite <- apply(mu.hat.trt - mu.hat.ctl, 3, mean)
ferlocar commented 5 years ago

Hi Vincent,

Thanks for the quick reply and your help. I tried to run your code but it seems that the cbind to create x.test is failing. It seems that the error comes from the fact that pscore.new consists of a matrix of three dimensions. Is that part of the intended behavior of 'predict(fit$fit.trt, x.new)'? What am I missing?

Thanks again for your help!

vdorie commented 5 years ago

Whoops. I've edited the code block to take the average.

ferlocar commented 5 years ago

Thanks Vincent, works great!

Just one last clarification question. If I'm looking at the individual-level treatment effects in order to make prescriptive decisions (e.g., deciding which individuals should receive the treatment), then I would have to do the following:

(1) Compute individual predictions for the new observations with z.new=1 for everyone (2) Compute individual predictions for the new observations with z.new=0 for everyone (3) Compute the individual average treatment effect as a weighted average of (1) and (2), where the weights correspond to the percentage of treated and untreated in the training sample.

Is this correct?

vdorie commented 5 years ago

I'm not sure that you need to weight. It really depends on what task you're trying to perform. If the x.new you're plugging in corresponds to some real individuals, the ite variable at the end can simply be ordered/sorted/ranked. It is an estimate of $E[Y_i(1) - Y_i(0) \mid X = x_i]$. It's also a posterior mean so it doesn't include uncertainty, but you could instead estimate for all of the individuals the probability that they're above some specific effect size.

ferlocar commented 5 years ago

I'm a little confused by your answer. If the estimates in ite are indeed $E[Y_i(1) - Y_i(0) \mid X = x_i]$, i.e. an estimate of the average treatment effect given X, then what is the role of z.new in the above? What difference does it make to set z.new to either 1 or 0 when estimating the individual-level treatment effects?

vdorie commented 5 years ago

You're right, z.new isn't actually needed. I've updated the snippet to reflect that.

ferlocar commented 5 years ago

Great, thanks a lot Vincent!

ferlocar commented 5 years ago

Just one last comment. I think there's a typo in the last line. It should be: ite <- apply(mu.hat.1 - mu.hat.0, 3, mean)

ferlocar commented 5 years ago

Hi Vincent,

I'm using (again) the code you posted to compute individual-level treatment effects. I noticed that the individual-level effects that I obtain do not make much sense (they are very far off from the average treatment effect). Is it possible that this code is outdated given the recent updates made to dbarts? In particular, I noticed that pscore.new returns negative values, which doesn't make sense if these are supposed to be interpreted as propensity scores.

Could you please advise?