Advice: multiple distinct treatments vs. multiple treatment categories

corydeburd commented 10 months ago

I could use some advice on a setting with multiple, distinct treatments. As an example, let's say I make 5 binary choices T_1, T_2, ... , T_5 \in {0,1}.

1. Is lm_forest the best GRF choice for this?
2. If yes, is it appropriate to also add the treatments as predictors, i.e., in the X matrix? For example, I want to learn whether T_1 = 1 changes the treatment effect of T_2.
3. Am I correct that this setup is not what multi_arm_causal_forest is designed for? That is, that function can still only take a vector, not a matrix, of treatments? In theory, I could create 2^5 = 32 distinct treatment categories, but this likely throws away a lot of information: e.g., {1,1,0,0,0} is likely somewhat similar to {1,1,1,0,0}, and the two shouldn't be treated as totally distinct categories.
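
To make the last point concrete, here is a minimal Python sketch (plain standard library, not grf code) of what collapsing 5 binary treatments into a single arm label does. The helper names `encode_arm` and `hamming` are made up for illustration. Once every combination is one categorical label, the fact that two combinations differ in only one treatment is no longer visible to the model.

```python
from itertools import product


def encode_arm(treatments):
    """Map a tuple of binary treatments, e.g. (1, 1, 0, 0, 0), to one arm index."""
    index = 0
    for t in treatments:
        index = index * 2 + t  # read the tuple as a binary number
    return index


def hamming(a, b):
    """Number of individual treatments on which two combinations differ."""
    return sum(x != y for x, y in zip(a, b))


k = 5
combos = list(product([0, 1], repeat=k))
arms = {combo: encode_arm(combo) for combo in combos}

a = (1, 1, 0, 0, 0)
b = (1, 1, 1, 0, 0)
c = (0, 0, 1, 1, 1)

print(len(arms))                     # 32 distinct arm labels
print(arms[a], arms[b], arms[c])     # 24 28 7: just three unrelated labels
print(hamming(a, b), hamming(a, c))  # 1 5: a is much closer to b than to c
```

As arm labels, 24, 28 and 7 are simply three different categories; the closeness of `a` and `b` only shows up if the treatments are kept as separate columns.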

Thanks for the help!

erikcs commented 10 months ago

Hi @corydeburd. If your binary T_1, ..., T_5 were mutually exclusive, you could use multi_arm_causal_forest to estimate contrasts of the form {E[Y(2) - Y(1) | X], E[Y(3) - Y(1) | X], ..., E[Y(5) - Y(1) | X]}, where Y(k), k = 1, ..., 5, are the potential outcomes corresponding to the 5 treatment arms (arm 1 could, for example, be the control arm). It sounds like you have a factorial design. I think the authors of this paper, https://arxiv.org/abs/2212.13638, used causal forest on all ~40 possible treatment arm combinations, if that could be of any inspiration.

corydeburd commented 10 months ago

Thank you for this response. You're right, it's a factorial design -- k separate binary treatments so 2^k possible combinations. The linked article definitely makes sense for us to consider and (as best as I can tell) it does seem to just treat these as separate options, as you say. So this may be the standard.

If that's right, I think we might stick to estimating the effect of one treatment (say T_1), with the other treatments T_i, i = 2, ..., k, included as predictors. With infinite data, we could simply code all 2^k combinations as suggested above, but this seems to discard a lot of information as k grows. In our setting, we're looking at network effects, so we want to, say, fit a model for your own treatment plus (k-1) of your neighbors' treatments. k=5 makes sense, but we could look at k=2 or k=10 as well.
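
A rough back-of-the-envelope sketch of why the 2^k coding gets thin as k grows, in plain Python. The sample size `n = 10_000` and the assumption of perfectly balanced assignment across arms are both made up for illustration.

```python
def arms_and_cell_size(n, k):
    """Return (number of arms, observations per arm) for k binary treatments,
    assuming the n observations are spread evenly over all 2^k combinations."""
    arms = 2 ** k
    return arms, n / arms


n = 10_000  # hypothetical sample size
for k in (2, 5, 10):
    arms, per_arm = arms_and_cell_size(n, k)
    print(f"k={k:2d}: {arms:5d} arms, about {per_arm:.1f} observations per arm")
```

With these assumed numbers, k=2 leaves 2500 observations per arm, k=5 leaves 312.5, and k=10 leaves fewer than 10, before any splitting on covariates.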

I was hoping to coerce lm_forest into doing this by including T_1, T_2, ..., T_k as predictors in addition to treatments. However, although this actually produces sensible results in our case, I was worried it was not kosher. For example, a policy tree could potentially say something nonsensical like "if T_1 = 0 [as a predictor], assign T_1 = 1 [as a treatment]." If that's not the approach adopted in the above paper, I am doubly worried.

Let me know if the above makes sense & thanks again for your help. A factorial version of GRF would be great fun if one ever comes out!

selma33511 commented 2 months ago

@erikcs

Hello, I ran into this post while grappling with a similar kind of question.

I'd like to know whether it is possible to estimate heterogeneous effects in a factorial design setting with the current version of grf.

My experiment has a 2 (T_1) x 2 (T_2) factorial design. This setting lets me ask whether T_1 changes the marginal effect of T_2 (in other words, whether moving T_1 from 0 to 1 makes a significant difference to the effect of moving T_2 from 0 to 1). Ultimately, I'd like to know which covariate-defined subgroups show a stronger interaction between the two treatments.
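
For reference, the quantity described here can be written as an interaction contrast within a subgroup g: (mean Y at T_1=1, T_2=1 minus mean Y at T_1=1, T_2=0) minus (mean Y at T_1=0, T_2=1 minus mean Y at T_1=0, T_2=0). The sketch below computes it from raw cell means in plain Python. It is not a grf forest and does no inference; the function name `interaction_by_group`, the subgroup labels, and the noise-free toy outcome are all made up for illustration.

```python
from collections import defaultdict


def interaction_by_group(rows):
    """rows: iterable of (group, t1, t2, y).
    Returns {group: effect of T_2 when T_1=1 minus effect of T_2 when T_1=0}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, t1, t2, y in rows:
        sums[(group, t1, t2)] += y
        counts[(group, t1, t2)] += 1
    mean = {cell: sums[cell] / counts[cell] for cell in sums}
    groups = sorted({cell[0] for cell in mean})
    return {
        g: (mean[(g, 1, 1)] - mean[(g, 1, 0)]) - (mean[(g, 0, 1)] - mean[(g, 0, 0)])
        for g in groups
    }


# Toy, noise-free data: the interaction strength gamma differs by subgroup.
gamma = {"old": 2.0, "young": 0.5}  # hypothetical subgroups
rows = [
    (g, t1, t2, 1.0 + 2.0 * t1 + 3.0 * t2 + gamma[g] * t1 * t2)
    for g in gamma
    for t1 in (0, 1)
    for t2 in (0, 1)
    for _ in range(3)  # three identical units per cell
]

result = interaction_by_group(rows)
print(result)  # {'old': 2.0, 'young': 0.5}
```

In this toy example the contrast recovers the subgroup-specific interaction exactly because the outcome has no noise; with real data the cell means are noisy, and subgroups chosen after looking at the data would need separate validation.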

I've read the paper mentioned above, but the code in that paper is too difficult for me... I would greatly appreciate it if you could reply.

grf-labs / grf

Advice: multiple distinct treatments vs. multiple treatment categories #1330