DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
We want to assess the impact of two continuous treatments, T1 and T2, on an outcome Y. The common causes X1, X2, and X3 are all continuous, and so is the target Y. We need to calculate the ATE for custom control and treatment values.
Questions we want to address:
1. Impact of T1 on Y
2. Impact of T2 on Y
3. Impact of T1 and T2 together on Y (as T1 and T2 might have some influence on each other)
What causal graphs can answer these questions?
For questions 1 and 2, I am assuming the graphs below can answer them.
For question 3, I started with the following graph.
I am looking to try the causal methods below.
backdoor.linear_regression
backdoor.econml.dml.DML
backdoor.econml.dml.LinearDML
backdoor.econml.dml.KernelDML
I have got some results using backdoor.linear_regression, but the results from the double ML models (LinearDML, DML) do not make sense: the outputs are unrealistic. I also get the warning below when running the double ML models. I am not sure whether I am specifying control_value and treatment_value correctly.
A scalar was specified but there are multiple treatments; the same value will be used for each treatment. Consider specifying all treatments, or using the const_marginal_effect method.
Below is the code I tried for the above causal structure to answer question 3. control_value_list and treatment_value_list contain the values for treatments T1 and T2, in the same order in which the treatments were supplied when creating the causal model object. For example, with control_value_list=[7,9] and treatment_value_list=[10,5], we want the ATE for T1 with control value 7 and treatment value 10, and for T2 with control value 9 and treatment value 5.
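from dowhy import CausalModel

# data is the pandas DataFrame holding the columns T1, T2, Y, X1, X2 and X3
model = CausalModel(
    data=data,
    treatment=['T1', 'T2'],
    outcome='Y',
    common_causes=['X1', 'X2', 'X3'],
)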
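To make the quantity I am after concrete, here is a minimal NumPy sketch on synthetic data. It is not DoWhy's implementation. The data-generating coefficients and the helper names (fit_treatment_coefficients, joint_effect) are made up for illustration. It assumes a linear outcome model with no T1*T2 interaction, in which case the effect of moving both treatments from the control values to the treatment values is a single number: the coefficient vector times the difference between the two value lists.

```python
import numpy as np


def fit_treatment_coefficients(T, X, y):
    """OLS of y on [1, T, X]; returns only the coefficients on the treatment columns."""
    design = np.column_stack([np.ones(len(y)), T, X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:1 + T.shape[1]]


def joint_effect(beta, control, treatment):
    """Effect of moving every treatment from its control to its treatment value."""
    delta = np.asarray(treatment, dtype=float) - np.asarray(control, dtype=float)
    return float(beta @ delta)


# Synthetic data (hypothetical coefficients, chosen only for this sketch).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))                                  # X1, X2, X3
T1 = 8.0 + X @ np.array([0.5, 0.2, 0.0]) + rng.normal(size=n)
T2 = 7.0 + X @ np.array([0.0, 0.3, 0.4]) + rng.normal(size=n)
T = np.column_stack([T1, T2])
y = 2.0 * T1 - 1.5 * T2 + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n)

beta = fit_treatment_coefficients(T, X, y)

# Same ordering as the treatment list: [T1, T2].
control_value_list = [7, 9]
treatment_value_list = [10, 5]
effect = joint_effect(beta, control_value_list, treatment_value_list)

# Per-treatment contributions; under this linear model they sum to the joint effect.
contributions = beta * (np.array(treatment_value_list) - np.array(control_value_list))
print("treatment coefficients:", beta)
print("joint effect:", effect)
```

With the true coefficients used here (2.0 for T1 and -1.5 for T2), the target value is 2.0*(10-7) + (-1.5)*(5-9) = 12, which is the kind of single number I would expect for question 3.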
I also get a single ATE value from backdoor.linear_regression, but the output from backdoor.econml.dml.LinearDML is two separate values. Is double ML computing the ATE for the two treatments separately? I also observed that the code throws an error when confidence_intervals is set to True. Is there anything that explains this?
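For reference, this is a from-scratch sketch of the partialling-out idea behind double ML as I understand it, with plain linear nuisance models and no cross-fitting. It is not EconML's code, and it is only my assumption that the two values I see correspond to per-treatment coefficients like the ones below. The helper names and the synthetic data are hypothetical.

```python
import numpy as np


def residualize(target, X):
    """Subtract from target its OLS projection on [1, X] (works for 1-D or 2-D targets)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partialling_out(y, T, X):
    """Regress the outcome residuals on the treatment residuals.

    Returns one coefficient per treatment column.
    """
    y_res = residualize(y, X)
    T_res = residualize(T, X)
    theta, *_ = np.linalg.lstsq(T_res, y_res, rcond=None)
    return theta


# Synthetic data (hypothetical coefficients, chosen only for this sketch).
rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=(n, 3))
T1 = 8.0 + X @ np.array([0.5, 0.2, 0.0]) + rng.normal(size=n)
T2 = 7.0 + X @ np.array([0.0, 0.3, 0.4]) + rng.normal(size=n)
T = np.column_stack([T1, T2])
y = 2.0 * T1 - 1.5 * T2 + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n)

theta = partialling_out(y, T, X)           # two values, one per treatment
delta = np.array([10.0, 5.0]) - np.array([7.0, 9.0])
single_contrast = float(theta @ delta)     # one value for the joint change
print("per-treatment coefficients:", theta)
print("joint contrast:", single_contrast)
```

In this sketch the two coefficients are per-unit effects of T1 and T2, and combining them with the control and treatment values gives one number that is comparable to a single ATE.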
Would the following causal structures answer question 3 better, using one of the treatments as a common cause along with the other factors?
If we get ATEs from those two graphs, can we add them and say that the sum addresses question 3? Or are they not additive?
Are there any other recommendations to address question 3?
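On the additivity question, a small arithmetic sketch of why I am unsure. The outcome function below is purely hypothetical (confounders are left out for brevity). With an interaction term c*T1*T2, the sum of the two one-at-a-time effects differs from the joint effect by c*(10-7)*(5-9); with c=0 the two agree.

```python
def outcome(t1, t2, a=2.0, b=-1.5, c=0.1):
    """Hypothetical structural outcome with a T1*T2 interaction of strength c."""
    return a * t1 + b * t2 + c * t1 * t2


def additivity_gap(c, control=(7.0, 9.0), treatment=(10.0, 5.0)):
    """Joint effect minus the sum of the two one-at-a-time effects."""
    base = outcome(*control, c=c)
    joint = outcome(*treatment, c=c) - base
    # Change T1 only, holding T2 at its control value.
    only_t1 = outcome(treatment[0], control[1], c=c) - base
    # Change T2 only, holding T1 at its control value.
    only_t2 = outcome(control[0], treatment[1], c=c) - base
    return joint - (only_t1 + only_t2)


print("gap without interaction:", additivity_gap(0.0))
print("gap with interaction c=0.1:", additivity_gap(0.1))
```

So whether the two separate ATEs can simply be added seems to depend on whether the outcome model contains an interaction between T1 and T2, and on which value the other treatment is held at.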