py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
https://www.pywhy.org/dowhy
MIT License

Clarification on how to use gcm properly for confounders adjustment #1193

Closed · chelsealee14 closed this issue 1 week ago

chelsealee14 commented 1 month ago

Hi

I'm learning causal inference and would appreciate any insights on this, thank you!

I understand that a causal graph encodes relationships and tells us which variables to adjust for to close backdoor paths (when they can be closed). I'm working on a project where we'd like to use the gcm package to simulate interventions. We currently have a graph with a known outcome, but we do not know our 'treatments', so we'd like to use the graph structure to identify possible upstream 'treatments' and run an intervention analysis for each of them.

gcm seems to use a structural causal model and parametrizes the graph so that it best fits our data.

  1. How does it control for confounding?
  2. Do we need to 'cut' the graph so that it only contains the relevant variables (treatment, outcome, and the confounders in the adjustment set), or does gcm take care of confounding on its own? If the latter, how does it know which variables are colliders/mediators vs. forks, and that it should control for the forks but not the others?

We're following the functions here: https://www.pywhy.org/dowhy/v0.10.1/user_guide/causal_tasks/estimating_causal_effects/effect_estimation_with_gcm.html
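
For context, here is roughly the workflow we are following from that guide (the graph, column names, and data below are made up just for illustration):

```python
import networkx as nx
import numpy as np
import pandas as pd
from dowhy import gcm

# Hypothetical graph and toy data purely for illustration; replace with your own.
causal_graph = nx.DiGraph([("X", "T"), ("T", "Y"), ("X", "Y")])
rng = np.random.default_rng(0)
X = rng.normal(size=1000)
T = X + rng.normal(size=1000)
Y = 2 * T + X + rng.normal(size=1000)
data = pd.DataFrame({"X": X, "T": T, "Y": Y})

# Fit one causal mechanism per node (this is the SCM parametrization).
causal_model = gcm.StructuralCausalModel(causal_graph)
gcm.auto.assign_causal_mechanisms(causal_model, data)
gcm.fit(causal_model, data)

# Simulate do(T := 1) vs. do(T := 0) and compare the outcome node.
effect = gcm.average_causal_effect(
    causal_model,
    "Y",
    interventions_alternative={"T": lambda t: 1},
    interventions_reference={"T": lambda t: 0},
    num_samples_to_draw=10_000,
)
print(effect)  # for this toy data, close to the true effect of 2
```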

bloebp commented 1 month ago

Hi,

When you use a GCM, it assumes that there are no hidden confounders (i.e., confounders that exist but are not part of the graph). In other words, we assume causal sufficiency. If the graph does not contain hidden confounders, then the way the interventions are calculated implicitly takes care of confounders, adjustments, etc. Therefore, there is no need to further prepare the graph structure.

That being said, it has some slight robustness to hidden confounders as well if `observed_data` is given (e.g., if root nodes are confounded, then this can help).
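
A minimal sketch of what that could look like (reusing the hypothetical `causal_model` and `data` from the snippet above; the intervened node `T` is just a placeholder):

```python
# Draw samples from the interventional distribution under do(T := 1),
# propagating from the observed rows rather than from freshly
# generated root-node samples.
samples = gcm.interventional_samples(
    causal_model,
    {"T": lambda t: 1},
    observed_data=data,
)
print(samples["Y"].mean())
```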

chelsealee14 commented 1 month ago

Gotcha, thanks for pointing out the no-hidden-confounders assumption. That's quite a big one.

bloebp commented 1 month ago

> On the topic of establishing causal sufficiency, what tips do you have for adequately checking whether there is residual confounding, besides the sensitivity analysis checks provided by DoWhy (the placebo treatment refuter, for example)?

Maybe the causal discovery literature can help here. E.g., the FCI algorithm uses conditional independence tests to infer whether certain variables share hidden confounders. There are also newer approaches, like CAM-UV (https://lingam.readthedocs.io/en/latest/tutorial/camuv.html); causal-learn has an implementation as well: https://causal-learn.readthedocs.io/en/latest/search_methods_index/Causal%20discovery%20methods%20based%20on%20constrained%20functional%20causal%20models/lingam.html#cam-uv

That being said, these algorithms aim at discovering causal structures. In your case, you can maybe use some of the ideas there to check whether your structure has hidden confounders.
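
As a rough, untested sketch (placeholder data; the exact causal-learn API may differ slightly between versions), an FCI run could look like this, where bidirected edges in the resulting PAG are the ones hinting at hidden confounders:

```python
import numpy as np
from causallearn.search.ConstraintBased.FCI import fci

# Placeholder data matrix; in practice, use your observed variables.
data_matrix = np.random.randn(1000, 4)

# FCI returns a PAG; edges of the form X <-> Y indicate that a hidden
# confounder between X and Y cannot be ruled out.
pag, edges = fci(data_matrix, alpha=0.05)
for edge in edges:
    print(edge)
```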

> It's not too clear how the interventions implicitly take care of confounders, could you explain this part or provide any links? I currently interpret this method as parametrizing the graph and propagating down an interventional value, but I'm not seeing the connection between this and backdoor adjustment, especially when there could be colliders/mediators (which shouldn't be adjusted for).

The propagation of interventions allows you to sample from the interventional distribution. Averaging these samples for a target node is then equivalent to estimating the average causal effect directly using a single model with adjustment sets (assuming the graph is correct, of course). Adjustment sets are important if you only have one model of the effect (e.g., you only model the causal effect on a target node). However, in a structural causal model (SCM), you model the whole data generation process (i.e., one model for each node), which allows you to sample from interventional (and counterfactual) distributions.

For example, in the graph $X \rightarrow Y \rightarrow Z \leftarrow W$, you have two functional causal models: $Y := f_Y(X, N_Y)$ and $Z := f_Z(Y, W, N_Z)$. Now, if you want to estimate the impact of $do(Y := y)$, you can simply propagate it through the FCMs. The crucial part is that the graph implies that $Y$ and $W$ are independent. To evaluate $E[Z \mid do(Y:=y)] = E_{N_Z, W}[f_Z(y, W, N_Z)]$, you would take independent samples from the marginals of $W$ and $N_Z$, which avoids introducing spurious relationships. You can also think of another example with an additional edge $Y \rightarrow W$. In that case, you have an additional FCM $W := f_W(Y, N_W)$, and instead of sampling from the marginal of $W$, you sample from the interventional distribution $p(W \mid do(Y:=y))$, which can be simulated by evaluating $f_W(y, N_W)$ with randomly sampled $N_W$. We then have $E[Z \mid do(Y:=y)] = E_{N_Z, N_W}[f_Z(y, f_W(y, N_W), N_Z)]$ (replacing the $W$ in the FCM of $Z$ with $f_W$).
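
To make the first example concrete, here is a small numerical sketch with arbitrary, hand-picked linear FCMs for $X \rightarrow Y \rightarrow Z \leftarrow W$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Arbitrary linear FCMs for the graph X -> Y -> Z <- W.
X = rng.normal(size=n)
W = rng.normal(size=n)
Y = 2.0 * X + rng.normal(size=n)            # Y := f_Y(X, N_Y)
Z = 3.0 * Y + 1.5 * W + rng.normal(size=n)  # Z := f_Z(Y, W, N_Z)

# E[Z | do(Y := y)]: fix Y at y and propagate it through f_Z, sampling W and
# N_Z independently from their marginals (the graph implies Y and W are
# independent). This is exactly E_{N_Z, W}[f_Z(y, W, N_Z)].
y = 1.0
Z_do = 3.0 * y + 1.5 * rng.normal(size=n) + rng.normal(size=n)
print(Z_do.mean())  # close to 3.0 * y, with no spurious contribution from X or W
```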

Section 6.3 of the book "Elements of Causal Inference" (https://mitp-content-server.mit.edu/books/content/sectbyfn?collid=books_pres_0&id=11283&fn=11283.pdf) can maybe give more insight. The book also contains a proof that this propagation yields the correct interventional (and counterfactual) distributions.

github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 14 days with no activity.

github-actions[bot] commented 1 week ago

This issue was closed because it has been inactive for 7 days since being marked as stale.