grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

Do the causal_forest() and predict() functions technically fall under the umbrella of an "honest causal forest" or a "generalized random forest"? #1381

Closed: njawadekar closed this issue 6 months ago

njawadekar commented 6 months ago

Hello, I would like to clarify whether the causal_forest() function and the corresponding predict() function for estimating treatment effects are technically an implementation of an "honest causal forest" or of a "generalized random forest." I am asking because I want to describe the method accurately in my tutorial paper, where I walk through an implementation of causal_forest() and predict() for estimating CATEs with the grf package.

As described in the paper "Generalized Random Forests" (see https://arxiv.org/pdf/1610.01271.pdf), honest causal forests and generalized random forests are similar but not identical. Specifically, the honest causal forest uses an exact-loss criterion to choose splits, estimates a CATE within each tree, and then averages those per-tree estimates (see https://arxiv.org/pdf/1510.04342.pdf), whereas the generalized random forest uses a gradient-based criterion to choose splits and estimates CATEs through adaptive neighborhood weighting (see https://arxiv.org/pdf/1610.01271.pdf).
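To make the estimation half of that distinction concrete, here is a minimal toy sketch in plain Python. It is not grf code and makes no claim about what causal_forest() does internally. Everything in it is a hypothetical stand-in: each "tree" is just a list of split thresholds on a single covariate, and the names `leaf_id`, `tree_average_cate`, `forest_weights`, and `weighted_cate` are invented for illustration. It ignores honesty (sample splitting) and both splitting criteria, and only contrasts averaging per-tree estimates with solving one weighted problem using forest-derived neighborhood weights.

```python
from bisect import bisect_right

def leaf_id(thresholds, x):
    """Leaf index of x in a toy 'tree' given as sorted split thresholds."""
    return bisect_right(thresholds, x)

def tree_average_cate(trees, data, x0):
    """Per-tree difference in means in x0's leaf, averaged across trees.
    Assumes every such leaf contains both treated and control units."""
    estimates = []
    for thresholds in trees:
        leaf = leaf_id(thresholds, x0)
        treated = [y for x, w, y in data if w == 1 and leaf_id(thresholds, x) == leaf]
        control = [y for x, w, y in data if w == 0 and leaf_id(thresholds, x) == leaf]
        estimates.append(sum(treated) / len(treated) - sum(control) / len(control))
    return sum(estimates) / len(estimates)

def forest_weights(trees, data, x0):
    """Neighborhood weights: how often each unit shares a leaf with x0,
    normalized by leaf size and number of trees (weights sum to 1)."""
    weights = [0.0] * len(data)
    for thresholds in trees:
        leaf = leaf_id(thresholds, x0)
        members = [i for i, (x, _, _) in enumerate(data)
                   if leaf_id(thresholds, x) == leaf]
        for i in members:
            weights[i] += 1.0 / (len(members) * len(trees))
    return weights

def weighted_cate(trees, data, x0):
    """One weighted regression of y on w, using the forest weights."""
    a = forest_weights(trees, data, x0)
    w_bar = sum(ai * w for ai, (_, w, _) in zip(a, data))
    y_bar = sum(ai * y for ai, (_, _, y) in zip(a, data))
    num = sum(ai * (w - w_bar) * (y - y_bar) for ai, (_, w, y) in zip(a, data))
    den = sum(ai * (w - w_bar) ** 2 for ai, (_, w, _) in zip(a, data))
    return num / den

# Made-up data: (x, w, y) with y = x + 2 * w, i.e. a constant effect of 2.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ws = [0, 1, 0, 1, 0, 1, 0, 1]
data = [(x, w, x + 2 * w) for x, w in zip(xs, ws)]
trees = [[0.5], [0.25, 0.75]]  # two hypothetical trees

print(tree_average_cate(trees, data, 0.3))
print(weighted_cate(trees, data, 0.3))
```

With a single tree the two functions coincide, because a regression of y on a binary w with equal weights inside one leaf reduces to a difference in means. With several trees they are different estimators and need not return the same number.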

Any clarification on whether causal_forest() / predict() technically constitute an honest causal forest or a generalized random forest would be appreciated. Thank you.