grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

Questions about causal forests #1294

Open marclet opened 1 year ago

marclet commented 1 year ago

Hi guys,

and thank you for your amazing work. I am writing to ask a couple of key questions about using causal forests in a setting with panel data, a binary outcome, and a continuous, plausibly exogenous treatment. Here they are:

  1. I understand orthogonalization is key. But in my case, the treatment is a natural disaster, which is plausibly exogenous and unrelated to unit-specific characteristics. Should I orthogonalize only the outcome in this case, or just set W.hat = 0.5 as is done with RCT data? If I have to apply orthogonalization despite the plausible treatment exogeneity, would the regression forest, given that the treatment is continuous, provide me with a generalized propensity score à la Hirano-Imbens (2004)?

  2. As mentioned above, I have panel data. However, I am not interested in CATEs that vary with unit and time fixed effects. I only want to 'filter' my outcome and treatment variables of these fixed effects in the orthogonalization step and then work with the residuals. Is there any theoretical reason that should prevent me from using different sets of Xs in the two stages (namely, a larger set including unit and time FEs in the orthogonalization, and a smaller one, including only the covariates for which I suspect HTE, in the causal forest analysis)?

Thank you very much in advance.

erikcs commented 1 year ago

Hi @marclet, yes, with a continuous W, regression_forest estimates a generalized propensity score. If you know what the generalized propensity score is, then sure, you can supply it, though I don't know what it would be in your setting (in an RCT, setting W.hat = p makes sense if all units have P[W = 1] = p and you know p). Using different X's for Y.hat, W.hat, and the causal forest can also be perfectly reasonable: intuitively, the first X's are those you think matter for the regression adjustment, while the X's the causal forest uses on the residuals are the ones you think might matter for heterogeneity. (The beginning of this overview has a brief walkthrough of how you can think of orthogonalization as a non-parametric regression adjustment; it is the same partially linear model that is estimated when W is continuous.)
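A minimal sketch of this two-stage setup in grf (the names X.full, X.het, Y, and W are placeholders, not from the thread): the nuisance estimates are computed on the full covariate set and then passed to causal_forest, which splits only on the heterogeneity covariates.

```r
library(grf)

# Placeholders (not from the thread):
#   X.full - covariates for the regression adjustment (e.g. including unit/time FE dummies)
#   X.het  - covariates suspected to drive treatment effect heterogeneity
#   Y      - binary outcome, W - continuous treatment

# Stage 1: orthogonalization. With a continuous W, this forest estimates the
# generalized propensity score e(x) = E[W | X = x].
W.forest <- regression_forest(X.full, W)
W.hat <- predict(W.forest)$predictions  # out-of-bag nuisance estimates

Y.forest <- regression_forest(X.full, Y)
Y.hat <- predict(Y.forest)$predictions

# Stage 2: causal forest on the residuals, splitting only on the
# heterogeneity covariates.
cf <- causal_forest(X.het, Y, W, Y.hat = Y.hat, W.hat = W.hat)
average_treatment_effect(cf)  # average (partial) effect when W is continuous
```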

marclet commented 1 year ago

Dear Erik,

that's perfectly clear. Thank you very much!!

marclet commented 1 year ago

Dear Erik/grf Team,

my apologies for bothering you again with some remaining doubts about the above application:

1) When I run test_calibration (a sketch of the call is below, after this list), the test suggests there is significant heterogeneity (differential.forest.prediction > 1 and statistically significant), but the value of mean.forest.prediction is consistently below 1 (ranging between 0.5 and 0.85 depending on the specification). I have read that this value should be around 1 if the algorithm worked well, so I am struggling to understand how to fix the problem, or whether the issue is merely due to the fact that I have a continuous treatment;

2) Still regarding test_calibration: in a placebo test I am running (with a randomly re-assigned treatment), I find no effect and no heterogeneity, as expected, but the value of differential.forest.prediction is now below 1. Can this happen? Is it a problem?

3) Lastly, I have read various threads here about how best to incorporate FEs in causal forests. I understand this is an active area of research. In my case, I simply need to residualize my outcome with respect to unit FEs, as I am not interested in unit-specific treatment heterogeneity. What I have done is simply to include the unit dummies in the regression_forest model via one-hot encoding (a sketch is at the end of this message). Is it OK to do so? Or does the forest become unstable with so many dummies?
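For reference, a minimal sketch of the calibration check referred to in 1) and 2), where `cf` stands for a fitted causal forest as in the earlier sketch:

```r
# Best-linear-predictor calibration test from grf.
# A coefficient near 1 on mean.forest.prediction suggests the mean forest
# prediction is well calibrated; a positive, statistically significant
# coefficient on differential.forest.prediction is evidence that the forest
# has captured heterogeneity in the underlying signal.
test_calibration(cf)
```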

Thank you very much in advance for your help, guys; that's the last time I bother you.
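A minimal sketch of the residualization described in 3), assuming a data frame `df` with a factor column `unit` and a matrix `X.adjust` of other adjustment covariates (both names are placeholders):

```r
# One-hot encode the unit identifier; grf forests expect a numeric matrix,
# so factor columns have to be expanded manually.
unit.dummies <- model.matrix(~ unit - 1, data = df)
X.full <- cbind(X.adjust, unit.dummies)

# Use the augmented matrix only for the nuisance (orthogonalization) forests...
Y.hat <- predict(regression_forest(X.full, Y))$predictions
W.hat <- predict(regression_forest(X.full, W))$predictions

# ...and fit the causal forest on the heterogeneity covariates alone.
cf.fe <- causal_forest(X.het, Y, W, Y.hat = Y.hat, W.hat = W.hat)
```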