py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

Variance covariance matrix of marginal effect for multiple continuous treatment cases #721

Open tmieno2 opened 1 year ago

tmieno2 commented 1 year ago

Hi,

I have a question/request regarding the variance covariance matrix of marginal effect for multiple continuous treatment cases.

I think it is easiest to explain what I mean using an example. Section 3 (Example Usage with Multiple Continuous Treatment Synthetic Data) of the notebook titled "Double Machine Learning: Use Cases and Examples" contains an example with multiple continuous treatments ($T$ and $T^2$, to capture the non-linearity of the impact of $T$). In this example, the main equation to estimate is

$Y = \theta_1(X)\cdot T + \theta_2(X)\cdot T^2 + \dots$

In the code, the standard errors of the marginal effects of the individual treatments are obtained using the const_marginal_effect_interval method after the model is fit. What I would like to know is whether there is any way of accessing the variance covariance matrix of $\theta_1(X)$ and $\theta_2(X)$, instead of getting each se individually. The variance covariance matrix is necessary if your ultimate interest lies in a quantity that depends on both $\theta_1(X)$ and $\theta_2(X)$. For example, you may be interested in the marginal impact of $T$ when $T = 4$, conditional on $X$ and $W$. Then the quantity of interest is

$\frac{\partial Y}{\partial T} = \theta_1(X) + 2 \times \theta_2(X)\times 4$

In order to get the standard error of this, you need the variance covariance matrix of $\theta_1(X)$ and $\theta_2(X)$, not just se for each of them.
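To make the point concrete, here is a minimal numerical sketch of why the covariance term matters for this linear combination. The point estimates and the covariance matrix below are made-up numbers for a single value of $X$, and `linear_combo_se` is a hypothetical helper written for this illustration; none of it comes from EconML.

```python
import numpy as np

# Hypothetical estimates of (theta_1(x), theta_2(x)) at one x, and a
# hypothetical variance covariance matrix for them (illustrative numbers only).
theta = np.array([2.0, -0.25])
cov = np.array([[0.04, -0.006],
                [-0.006, 0.0016]])


def linear_combo_se(weights, cov):
    """Standard error of weights @ theta, given the covariance matrix of theta."""
    weights = np.asarray(weights, dtype=float)
    return float(np.sqrt(weights @ cov @ weights))


t = 4.0
# dY/dT = theta_1 + 2 * theta_2 * T, so the weights on (theta_1, theta_2) are (1, 2T).
weights = np.array([1.0, 2.0 * t])

effect = float(weights @ theta)
se_with_cov = linear_combo_se(weights, cov)
# Dropping the off-diagonal term is what you implicitly do with only the two se values.
se_ignoring_cov = linear_combo_se(weights, np.diag(np.diag(cov)))

print(effect, se_with_cov, se_ignoring_cov)
```

With a negative covariance between the two coefficients, as assumed here, the standard error computed from the individual se values alone overstates the uncertainty; with a positive covariance it would understate it.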

I cannot seem to find any method that returns a variance covariance matrix like this at the moment. If I am simply missing this functionality, could you point me to the right method? If it is not currently available, would it be possible to make it available?

Thanks

kbattocchi commented 1 year ago

For the specific marginal effect computation you're interested in, that should be what's provided by marginal_effect_interval (when there is a treatment featurizer, marginal_effect no longer exactly coincides with const_marginal_effect precisely because marginal effect takes the Jacobian into account as desired here).
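Spelled out for the quadratic case in this thread (a sketch in the notation of the question above, where $\phi$ is an assumed name for the treatment featurizer):

$$\phi(T) = (T,\ T^2), \qquad \frac{\partial Y}{\partial T} = \theta(X)^\top \frac{\partial \phi(T)}{\partial T} = \theta_1(X) + 2\,\theta_2(X)\,T$$

At $T = 4$ this gives $\theta_1(X) + 8\,\theta_2(X)$, which is the quantity asked about.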

Offhand, I don't believe that we provide direct access to the raw covariance matrix itself, but that is something that we can consider adding to a future version of the library.

tmieno2 commented 1 year ago

Ah, yes. I should not have used that as an example to show why the var-cov matrix would be nice to have. Let's say you are interested in the level of $T$ that maximizes $Y$. In that case, the quantity of interest is a non-linear function of $\theta_1(X)$ and $\theta_2(X)$, and you may want to use the delta method to estimate its se, which requires the var-cov matrix.
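As a rough sketch of that delta-method calculation: for the quadratic specification, the maximizer is $T^* = -\theta_1(X) / (2\,\theta_2(X))$ (assuming $\theta_2(X) < 0$), and its approximate variance is $g^\top \Sigma g$, where $g$ is the gradient of $T^*$ with respect to $(\theta_1, \theta_2)$ and $\Sigma$ is their var-cov matrix. The function name and all numbers below are hypothetical and only illustrate the arithmetic; they are not part of EconML.

```python
import numpy as np


def argmax_t_delta_method(theta, cov):
    """Delta-method se for T* = -theta_1 / (2 * theta_2).

    theta: point estimates (theta_1, theta_2) at a single x; theta_2 must be
           negative for T* to be a maximum.
    cov:   2x2 variance covariance matrix of (theta_1, theta_2).
    """
    theta1, theta2 = theta
    t_star = -theta1 / (2.0 * theta2)
    # Gradient of T* with respect to (theta_1, theta_2).
    grad = np.array([-1.0 / (2.0 * theta2), theta1 / (2.0 * theta2 ** 2)])
    se = float(np.sqrt(grad @ cov @ grad))
    return float(t_star), se


# Hypothetical estimates and covariance matrix (illustrative numbers only).
theta = np.array([2.0, -0.25])
cov = np.array([[0.04, -0.006],
                [-0.006, 0.0016]])

t_star, se_t_star = argmax_t_delta_method(theta, cov)
print(t_star, se_t_star)
```

The off-diagonal entry of `cov` enters through the cross term of `grad @ cov @ grad`, so this se cannot be recovered from the two individual standard errors alone.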

In any case, it would not hurt for users to be able to get the var-cov matrix, and I am guessing such a method would not be very hard for you to implement.

> Offhand, I don't believe that we provide direct access to the raw covariance matrix itself, but that is something that we can consider adding to a future version of the library.

That would be really appreciated.