Closed gcasamat closed 3 years ago
Sorry, issues are closed automatically when the PR is merged. average_partial_effect.R (continuous treatment) is a generalization of average_treatment_effect.R (binary treatment), which, as you can see, does not require the same variance estimation but still requires more than one cluster. @swager may have a better explanation.
Thanks for your reply. What you said is clear to me: with the current implementation of grf, it is not possible to compute the average partial effect for a single cluster. However, it seems that such a computation (in the binary treatment case) is done in the accompanying code of the article https://arxiv.org/abs/1902.07409: school scores are computed using formula (8) in the paper.
You mean an ATE with only one cluster is computed here? (Sorry, I am not able to find it)
Couldn't the histogram drawn below (taken from script.R) be interpreted as the distribution of ATEs per school?

```r
pdf("school_hist.pdf")
pardef = par(mar = c(5, 4, 4, 2) + 0.5, cex.lab = 1.5, cex.axis = 1.5,
             cex.main = 1.5, cex.sub = 1.5)
hist(school.score, xlab = "School Treatment Effect Estimate", main = "")
dev.off()
```
To summarize, my concern is about the way the variance of W given X is estimated in average_partial_effect.R. Currently it is:

```r
variance_forest <- regression_forest(subset.X.orig,
  (subset.W.orig - subset.W.hat)^2,
  clusters = subset.clusters,
  num.trees = num.trees.for.variance
)
```

Why couldn't we instead have:

```r
variance_forest <- regression_forest(X.orig,
  (W.orig - W.hat)^2,
  clusters = clusters,
  num.trees = num.trees.for.variance
)
```

knowing that W.hat itself is estimated on the whole sample (not on the subsetted data)?
When estimating the ATE with clusters, we imagine a sampling model where we draw random clusters and so we need at least 2 clusters to get a variance estimate; see, e.g., (8) in https://arxiv.org/pdf/1902.07409.pdf where we divide by J - 1 to estimate variance, where J is the number of clusters. (The difference with the setting of the histogram you brought up is that, there, we just need point estimates without variance estimates, and so using data from just one cluster is OK.)
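To make the divide-by-(J - 1) point concrete, here is a minimal sketch of the between-cluster variance calculation, using made-up per-cluster point estimates (these numbers are purely illustrative, not grf internals):

```r
# Hypothetical per-cluster ATE point estimates for J = 5 clusters.
tau.hat.j <- c(0.8, 1.1, 0.9, 1.3, 1.0)
J <- length(tau.hat.j)

# Point estimate: average the cluster-level estimates.
ate <- mean(tau.hat.j)

# Variance of that average, treating clusters as the sampling unit:
# the sum of squared deviations is divided by J - 1 (and by J for the mean),
# which is undefined when J = 1 -- hence the need for at least 2 clusters.
var.ate <- sum((tau.hat.j - ate)^2) / (J * (J - 1))
se <- sqrt(var.ate)
```

With a single cluster, `tau.hat.j` has length 1 and `J - 1 = 0`, so the point estimate still exists but the standard error does not.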
The point you make in your last comment is a good one -- we should provide the user more flexibility in how they get weights. Ideally, the function average_partial_effect should take an optional argument debiasing.weights (either of length n, or of the same length as subset), and if this argument is non-null then we skip training the variance.forest. @erikcs could you please create an issue for this?
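For concreteness, the proposed interface might be used as in the sketch below. This is hypothetical: debiasing.weights is the argument suggested above and is not part of the released API, and the call signature is an assumption.

```r
# Hypothetical sketch of the proposed interface -- not the actual grf API.
# forest is a fitted causal forest; my.weights are user-supplied
# debiasing weights of length n (or the same length as subset).
ape <- average_partial_effect(forest,
  subset = my.subset,
  debiasing.weights = my.weights  # if non-NULL, skip training variance.forest
)
```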
So you do not report the average partial effect for a single cluster because it is not possible to compute variance estimates (even though point estimates can be calculated). Right? More generally, the implications of clustering data when using causal forests are not entirely clear to me. Having read the various threads on the topic, it seems to be an active area of research and I believe that more insights are yet to come. Many thanks for your answer!
Closing this issue as it seemed resolved. Note that after version 1.2.0, average_partial_effect is removed and replaced by a new unified interface: #723
I have clustered data with a continuous treatment and I would like to compute the average partial effect at the cluster level. This is not possible with the current implementation of the grf software because "with clustering enabled you treat each cluster as a distinct unit, which here would be the same as asking for the average partial effect for a single observation" (see #628). In that previous issue I suggested modifying the computation of V.hat in the script average_partial_effect.R. Having received no reply to this suggestion, I presume it is not a valid way to proceed; however, I would be very interested to understand why. More generally, do you have any suggestions on how to calculate the per-cluster average partial effect?