py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
https://www.pywhy.org/dowhy
MIT License

Identify effect not showing backdoor variable #1048

Closed asha24choudhary closed 7 months ago

asha24choudhary commented 9 months ago

Hi there. I was referring to this issue. I have a dataset with both an observed and an unobserved confounder, as described below.

Create the graph describing the causal structure

graph = """graph[directed 1 node[id "U" label "U"] node[id "X" label "X"] node[id "Y" label "Y"] node[id "Z" label "Z"] edge[source "U" target "X"] edge[source "X" target "Y"] edge[source "U" target "Y"] edge[source "Z" target "X"] edge[source "Z" target "Y"]]""".replace('\n', '')

Generate the data

U = np.random.randn(N_SAMPLES)
Z = np.random.randn(N_SAMPLES)
X = np.random.randn(N_SAMPLES) + 0.3*U + 0.2*Z
Y = 0.65*X + 0.2*U + 0.3*Z

df = pd.DataFrame(np.vstack([Z, X, Y]).T, columns=['Z', 'X', 'Y'])
print(df.head(10))

Create a model

model = CausalModel(
    data=df,
    treatment=['X'],
    outcome=['Y'],
    common_causes=['Z'],
    graph=graph
)
model.view_model()
plt.show()

I expected to get a backdoor variable, but

image

as you can see, the output says 'Backdoor identification failed'. I don't know what is wrong or how to resolve it.

Could you please help me?

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 14 days with no activity.

amit-sharma commented 8 months ago

The error you are seeing is unrelated to the linked issue. In your case, the only valid backdoor set is $[U, Z]$, but since U is unobserved, the identify_effect method reports that backdoor identification is not possible.
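To make the backdoor argument concrete, here is a minimal hand-rolled sketch that enumerates the backdoor paths in the graph from the question and checks which adjustment sets block them. It is not DoWhy's implementation; the helper names (`descendants`, `backdoor_paths`, `is_blocked`, `satisfies_backdoor`) are hypothetical and exist only for illustration.

```python
# Edges of the graph from the question: U and Z both point into X and Y.
EDGES = [("U", "X"), ("X", "Y"), ("U", "Y"), ("Z", "X"), ("Z", "Y")]


def descendants(node, edges):
    """All nodes reachable from `node` by following edge directions."""
    found, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for src, dst in edges:
            if src == cur and dst not in found:
                found.add(dst)
                stack.append(dst)
    return found


def backdoor_paths(treatment, outcome, edges):
    """Simple paths (direction ignored) from treatment to outcome
    whose first edge points INTO the treatment."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    paths = []

    def walk(path):
        if path[-1] == outcome:
            paths.append(path)
            return
        for nxt in sorted(neighbours[path[-1]]):
            if nxt not in path:
                walk(path + [nxt])

    for parent in sorted(src for src, dst in edges if dst == treatment):
        walk([treatment, parent])
    return paths


def is_blocked(path, adjust, edges):
    """d-separation rule applied to a single path."""
    edge_set = set(edges)
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edge_set and (nxt, mid) in edge_set
        if collider:
            # A collider blocks unless it or one of its descendants is adjusted for.
            if mid not in adjust and not (descendants(mid, edges) & adjust):
                return True
        elif mid in adjust:
            # A chain or fork node blocks the path when it is adjusted for.
            return True
    return False


def satisfies_backdoor(adjust, treatment="X", outcome="Y", edges=EDGES):
    adjust = set(adjust)
    if adjust & descendants(treatment, edges):
        return False  # adjustment set may not contain descendants of the treatment
    return all(is_blocked(p, adjust, edges)
               for p in backdoor_paths(treatment, outcome, edges))


print(backdoor_paths("X", "Y", EDGES))
for candidate in [set(), {"Z"}, {"U"}, {"U", "Z"}]:
    print(sorted(candidate), satisfies_backdoor(candidate))
```

The two backdoor paths are X ← U → Y and X ← Z → Y, so adjusting for Z alone leaves the path through U open. Only a set containing both U and Z passes the check, and that set cannot be used when U is not in the data.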

Note that the graph argument takes precedence in CausalModel. So if you only want to condition on Z, you can do so by initializing CausalModel directly, without the graph.

model = CausalModel(
    data=df,
    treatment=['X'],
    outcome=['Y'],
    common_causes=['Z']
)
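As a rough illustration of what conditioning only on Z means for the estimate, the sketch below regenerates data with the coefficients from the question and compares a linear adjustment for Z alone against one that also uses U. It uses only the standard library; the sample size, the seed, and the helpers `ols` and `simulate` are assumptions made for this example and are not part of DoWhy.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (no intercept),
    solved through the normal equations with Gauss-Jordan elimination."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))]
         for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [v - factor * w for v, w in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=20000, seed=0):
    """Same structural equations as in the question (true effect of X on Y is 0.65)."""
    rng = random.Random(seed)
    U = [rng.gauss(0, 1) for _ in range(n)]
    Z = [rng.gauss(0, 1) for _ in range(n)]
    X = [rng.gauss(0, 1) + 0.3 * u + 0.2 * z for u, z in zip(U, Z)]
    Y = [0.65 * x + 0.2 * u + 0.3 * z for x, u, z in zip(X, U, Z)]
    coef_z_only = ols([X, Z], Y)[0]      # U left out, as when it is unobserved
    coef_z_and_u = ols([X, Z, U], Y)[0]  # only possible if U were measured
    return coef_z_only, coef_z_and_u


coef_z_only, coef_z_and_u = simulate()
print(f"adjusting for Z only : {coef_z_only:.3f}")
print(f"adjusting for Z and U: {coef_z_and_u:.3f}")
```

With U included, the regression recovers 0.65 because Y is an exact linear function of X, U and Z. With Z alone, the coefficient on X drifts upward (about 0.65 + 0.06/1.09 in expectation under these equations), which is the bias carried by the open path through U.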

asha24choudhary commented 8 months ago

Thank you for your reply @amit-sharma. The reason I linked the previous issue is that I wanted to include an unobserved confounder. But shouldn't I include the graph, which contains the information about the unobserved confounder 'U', as was also done in the issue I linked?

I was assuming that, to model an unobserved confounder, I should include it in the graph used to create the model and exclude it from the dataset.

Yes, if I exclude the graph when creating the model, then the valid backdoor set includes Z. However, my question now is whether I should leave out the graph, and why. If I do, don't I lose the information about the unobserved confounder in the model, even though its effect is of course still present in the data?

It would be really helpful if you could explain in a bit more detail.

github-actions[bot] commented 7 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.