Open ludovico-lanni opened 6 months ago
For most of our estimators, estimation is centered on the CATE, and the mechanics of the estimation process do not automatically produce an ATE for the training population at the same time. We provide the `ate`, `ate_interval`, and `ate_inference` methods as a convenience to compute the ATE over any population by taking the CATE estimates for that population and averaging them; `ate_interval` provides confidence intervals, and `ate_inference` provides confidence intervals along with p-values and other statistics. So you can get what you're after by passing the same set of Xs to `ate_inference` as you used when training the estimator, although this will not necessarily be a very precise estimate of the ATE.
One exception to this general rule is `CausalForestDML`, which does compute a doubly-robust estimate of the ATE as part of the estimation process (if `drate=True`, which it is by default). To access it, use the `ate_` attribute or the `ate__inference` method (with an extra underscore compared to the standard method). This should give a more precise estimate with tighter confidence intervals than the approach that averages CATEs.
Hello!
I have been using the EconML library for some time now, and I am not sure how to use a DML object to make inference about the ATE without conditioning the results on a given set of features X.
All the methods that I've seen in the docs, like `ate()` and `ate_interval()`, require an X whenever X was used in the fitting process. What they return is the CATE, conditioned on X. However, imagine I want to use DML methods to reduce the variance of a causal estimator on experimental data (where the treatment is randomised). Even though I surely want to add interactions with a set of controls X and non-linearities in the model (that's why I'm using lasso or non-parametric DML), I am interested in just one summarising number (the ATE) and its confidence interval. I guess that I can get the ATE by taking the mean of the CATE calculated on my sample X. But what about the confidence interval of the ATE?