hassan-obeid / tr_b_causal_2020

Causal inference conference - June 2020
MIT License

Selection-on-observables Simulation Goals Revisited #57

Closed timothyb0912 closed 4 years ago

timothyb0912 commented 4 years ago

Hey @bouzaghrane and @hassanobeid1994

I'm going to just jump straight to the topic at hand.

The issue

As I mentioned via text message, I think that my original idea—the idea driving the work that @bouzaghrane is doing—was wrong.

To recap, I thought that by manipulating the causal graph that generates one’s explanatory variables, we could bias the parameter estimates of the outcome model.

Hassan countered that, no, as long as one’s outcome model is correctly specified, its parameters will be estimated without bias, regardless of the model that generates X.

Though I originally disagreed, I now think Hassan was completely right and that my thought above was wrong.

My rationale

Here is why I think Hassan is right and my earlier views in this issue are wrong.

The entire concept of exogeneity as it relates to causal inference is built around being able to infer the outcome model without needing to make inferences on the assignment model.

As explained by Zhang, Zhang, and Scholkopf (2015), “the concept of exogeneity formalizes the idea that the mechanism generating the exogenous variable X does not contain any relevant information about the parameter set, phi, for the conditional model P(Y |X)”.

In other words, you can estimate the parameters of the outcome model without using any information from the model that generates X.

As shown in Figure 1a of Zhang, Zhang, and Scholkopf (2015), this is precisely the unconfounded scenario.
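To make that concrete, here is a minimal sketch of the factorization I have in mind, where phi is the conditional-model parameter set from the quote above and theta is my placeholder symbol for the parameters of the X-generating model:

```latex
% Under exogeneity, the joint likelihood factorizes into an X-generating part
% and an outcome part with variation-independent parameter sets \theta and \phi:
\[
  P(X, Y \mid \theta, \phi) \;=\; P(X \mid \theta)\, P(Y \mid X, \phi)
\]
% Maximizing the joint likelihood over \phi is then equivalent to maximizing
% P(Y \mid X, \phi) alone, so the mechanism generating X is irrelevant for
% inference about \phi.
```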

My Analysis

I think we ended up in this situation (with me holding erroneous ideas and @bouzaghrane perhaps doing some unnecessary work) for two reasons.

Reason one is that I never formally / explicitly justified why I thought my earlier views were correct.

Doing so would more likely have exposed the flaws in my thinking. In particular, my original logic was as follows.

Let A = manipulating one’s causal graph, B = multicollinearity, and C = “serious problems” during outcome-model estimation of GLMs.

Then since I know A can cause B and B can cause C, I concluded A can cause C.

The problem with this logic is that I didn’t explicitly define what these “serious problems” were. For some reason, I thought the “serious problem” was the introduction of parameter bias during estimation.

This is incorrect.

Multicollinearity inflates parameters’ standard errors and posterior dispersions, leading to parameter inferences that stay close to the prior. In other words, multicollinearity does not lead to parameter bias, only statistical inefficiency.

As a result, we should not expect changing one’s causal graph to lead to parameter bias because multicollinearity does not lead to parameter bias.
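As a sanity check on that claim, here is a minimal simulation sketch (the variable names, coefficients, and linear outcome model are mine, purely for illustration): generating one explanatory variable from another induces multicollinearity, and the outcome-model coefficient estimates stay centered on the true values while their standard errors grow.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
true_betas = np.array([1.0, 0.5, -2.0])  # intercept, beta_x1, beta_x2

def simulate_once(correlated):
    n = 1000
    x1 = rng.normal(size=n)
    if correlated:
        # x2 is caused by x1, so x1 and x2 are highly collinear
        x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)
    else:
        # x1 and x2 are generated independently
        x2 = rng.normal(size=n)
    y = true_betas[0] + true_betas[1] * x1 + true_betas[2] * x2 + rng.normal(size=n)
    fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
    return fit.params, fit.bse

for correlated in (False, True):
    estimates, std_errs = zip(*[simulate_once(correlated) for _ in range(200)])
    print(
        "correlated X" if correlated else "independent X",
        "\n  mean estimates :", np.round(np.mean(estimates, axis=0), 3),
        "\n  mean std errors:", np.round(np.mean(std_errs, axis=0), 3),
    )
```

In both cases the mean estimates sit near the true coefficients; only the standard errors differ.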

Reason two is that I misunderstood the difference between a biased estimate of a causal effect and a biased estimate of a model parameter. Based on Keele, Stevenson, and Elwert (2020) and Pearl (2007), I knew that you cannot interpret or use a regression coefficient as a causal effect estimate unless that causal effect is identified in one’s causal graph. Where I erred was in concluding that the estimated regression coefficient was therefore not a “good” (i.e. unbiased / consistent) estimate of the data-generating parameter, simply because it was not a “good” (i.e. unbiased / consistent) estimate of a causal effect, or because it could not be used directly to compute a causal effect estimate without simulating one’s explanatory variables.

Keele, L., Stevenson, R. T., & Elwert, F. (2020). The causal interpretation of estimated associations in regression models. Political Science Research and Methods, 8(1), 1-13.

Pearl, J. (2007). On the foundation of structural equation models, or when can we give causal interpretation to structural coefficients?

My proposal

Two things.

One, going forward, each of us should formally document why we believe the things we believe, if these beliefs are going to be the basis of multiple hours of our time / work.

Two, I think we should consider switching the focus of @bouzaghrane's computational work going forward.

Right now, we have no successful application of causal inference techniques to a travel demand application to show.

It’s of course okay to speak only of difficulties in our presentation, but I think we’ll all feel much better if we can speak of some successes. To that end, we need to successfully apply some causal inference technique to our dataset and problem at hand.

One fact of our work so far is that we haven’t come up with a causal graph that we have any faith in or would feel comfortable defending.

To remedy this and gain a win for our presentation, I propose that @bouzaghrane use basic principles and simple techniques from the causal discovery literature to build a causal graph whose testable implications are not forcefully refuted by our actual data. Think the use of conditional independence tests as shown in #37 and analogous marginal independence tests.

This would simply go in the third part of the presentation on how to check your causal graph.

The idea is that building a reasonable causal graph can be an iterative process of positing a given causal graph, checking / falsifying that graph, and revising accordingly.

Let me know what you all think of the following plan.

First, I propose @bouzaghrane finish delivering on the Vision and high level simulation goals by showing what happens to a causal effect estimate when using an independent data generating process vs using a causal graph over some explanatory variables. I believe this can be prototype-complete in a week; it's essentially the same as the CE264 forecasting homework. Specifically, this task compares average causal effect estimates computed without simulating the "downstream" effects of a change in travel distance against estimates computed with simulation of those downstream effects (i.e. on travel cost and travel time). This will demonstrate that a well-specified outcome model is not, by itself, sufficient for estimating causal effects. A sketch of the comparison follows.
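For concreteness, here is a minimal, self-contained sketch of the comparison I have in mind. Everything in it is hypothetical: the structural equations, coefficients, and variable names are placeholders for illustration, not our actual fitted models or data.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Hypothetical structural equations (coefficients are made up for illustration):
# distance -> cost, distance -> time, and (distance, cost, time) -> outcome
distance = rng.lognormal(mean=1.0, sigma=0.5, size=n)
cost = 0.8 * distance + rng.normal(scale=0.2, size=n)
time = 1.5 * distance + rng.normal(scale=0.5, size=n)
outcome = -0.3 * distance - 0.5 * cost - 0.2 * time + rng.normal(scale=0.1, size=n)

def outcome_model(d, c, t):
    # Pretend this is our correctly specified outcome model; here it simply
    # uses the true coefficients from the structural equation above.
    return -0.3 * d - 0.5 * c - 0.2 * t

delta = 1.0  # intervention: increase everyone's travel distance by one unit

# (a) Naive: change distance but keep cost and time at their observed values
naive_ace = np.mean(outcome_model(distance + delta, cost, time)
                    - outcome_model(distance, cost, time))

# (b) Graph-aware: re-simulate cost and time from their structural equations
new_cost = 0.8 * (distance + delta) + rng.normal(scale=0.2, size=n)
new_time = 1.5 * (distance + delta) + rng.normal(scale=0.5, size=n)
graph_ace = np.mean(outcome_model(distance + delta, new_cost, new_time)
                    - outcome_model(distance, cost, time))

print("ACE ignoring downstream effects  :", round(naive_ace, 3))  # ~ -0.3
print("ACE simulating downstream effects:", round(graph_ace, 3))  # ~ -1.0
```

Even with the true outcome model, ignoring the downstream changes in cost and time gives a very different (and wrong) average causal effect, which is exactly the point of this task.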

Second, I propose @bouzaghrane use basic causal discovery techniques (I'll specify more details and rationale on a different issue) to build a causal graph whose testable implications are not badly refuted by our observed data. We can then use this causal graph to re-run the simulation above, and we can use the examples from the graph-building process to demonstrate how one falsifies causal graphs in a third part of the presentation.

@bouzaghrane, do you want to do this or something like it, and do you think it's feasible in 2 weeks (with my and Hassan's help, of course)?

bouzaghrane commented 4 years ago

I'm up for it for sure, @timothyb0912.

I need to read again and think of issues I might run into. But I will put in the time to get it done.

timothyb0912 commented 4 years ago

@bouzaghrane and @hassanobeid1994 forget about my recommendation above of

Second, I propose @bouzaghrane use basic causal discovery techniques (I'll specify more details and rationale on a different issue) to build a causal graph whose testable implications are not badly refuted by our observed data. We can then use this causal graph to re-run the simulation above, and we can use the examples from the graph-building process to demonstrate how one falsifies causal graphs in a third part of the presentation.

@bouzaghrane, do you want to do this or something like it, and do you think it's feasible in 2 weeks (with my and Hassan's help, of course)?

There is definitely no time for all of that.

What we can do though is show how to falsify one's causal graph.

Building a causal graph can be seen as an iterative process: propose a causal graph, attempt to falsify it, and repeat until one arrives at one or more causal graphs that are not horrendously refuted by one's observed data.

We have code showing how to do some basic conditional independence tests (#37 ) and marginal independence tests (#58 ). These are directly useful only with observed variables.
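For reference, here is a minimal, self-contained sketch of the style of test I mean. This is not the actual code in #37 or #58; the residual-based conditional independence check below is my own illustration and is only appropriate for roughly linear, continuous relationships.

```python
import numpy as np
from scipy import stats

def marginal_independence_pvalue(x, y):
    # Test whether X and Y are marginally independent via Pearson correlation
    return stats.pearsonr(x, y)[1]

def conditional_independence_pvalue(x, y, z):
    # Test whether X and Y are independent given Z by correlating the
    # residuals of the linear regressions X ~ Z and Y ~ Z
    z_design = np.column_stack([np.ones_like(z), z])
    resid_x = x - z_design @ np.linalg.lstsq(z_design, x, rcond=None)[0]
    resid_y = y - z_design @ np.linalg.lstsq(z_design, y, rcond=None)[0]
    return stats.pearsonr(resid_x, resid_y)[1]

# Toy example: the fork X <- Z -> Y implies X and Y are conditionally
# independent given Z, but marginally dependent.
rng = np.random.default_rng(0)
z = rng.normal(size=2000)
x = 2.0 * z + rng.normal(size=2000)
y = -1.0 * z + rng.normal(size=2000)
print("marginal p-value   :", marginal_independence_pvalue(x, y))        # small -> dependent
print("conditional p-value:", conditional_independence_pvalue(x, y, z))  # large -> cond. independent
```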

A final missing piece is how to falsify a generic causal graph that contains latent variables. I have a work-in-progress pull request showing how to do this in #67 . The idea is to have a prototype of this completed by April 4th.

I'll likely need some help on the pull-request from @hassanobeid1994 for generating prior predictive samples from the probabilistic PCA version of the de-confounder.
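For context, here is a rough sketch of what I mean by prior predictive sampling from a probabilistic PCA factor model. The priors, dimensions, and names below are placeholders of my own choosing, not the ones in the #67 pull request.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_vars, n_latent = 100, 6, 2  # placeholder dimensions

def sample_prior_predictive(n_draws):
    """Draw datasets X from the prior predictive of a probabilistic PCA model:
    W ~ N(0, 1) elementwise, sigma ~ HalfNormal(1), z_i ~ N(0, I),
    x_i ~ N(W z_i, sigma^2 I)."""
    draws = []
    for _ in range(n_draws):
        weights = rng.normal(size=(n_vars, n_latent))    # factor loadings
        sigma = abs(rng.normal())                        # observation noise scale
        latents = rng.normal(size=(n_obs, n_latent))     # per-observation confounder scores
        x = latents @ weights.T + sigma * rng.normal(size=(n_obs, n_vars))
        draws.append(x)
    return np.stack(draws)

# Compare summaries of the prior predictive draws against the observed data to
# check whether the priors can even produce data that looks roughly like ours.
prior_datasets = sample_prior_predictive(n_draws=500)
print("prior predictive std of each variable:",
      np.round(prior_datasets.std(axis=(0, 1)), 2))
```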

Let me know what you all think of this revised plan.