py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
https://www.pywhy.org/dowhy
MIT License

Refutation Test with Treatment Classes #1009

Closed. titubs closed this issue 12 months ago.

titubs commented 1 year ago

@amit-sharma

Hi Amit, I wanted to ask whether the supported refutation methods in your library handle treatments that are classes, for example: 1 = treatment A, 2 = treatment B.

I used it in the following context:

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LassoCV
from sklearn.ensemble import GradientBoostingRegressor

dml_estimate = model.estimate_effect(
    identified_estimand,
    method_name="backdoor.econml.dml.DML",
    # control_value=0,
    # treatment_value=1,
    target_units=lambda df: df["X0"] > 1,  # condition used for CATE
    # confidence_intervals=False,
    method_params={
        "init_params": {
            "model_y": GradientBoostingRegressor(),
            "model_t": GradientBoostingRegressor(),
            "model_final": LassoCV(fit_intercept=False),
            "featurizer": PolynomialFeatures(degree=1, include_bias=False),
        },
        "fit_params": {},
    },
)

res_random = model.refute_estimate(
    identified_estimand,
    dml_estimate,
    method_name="placebo_treatment_refuter",
    placebo_type="permute",
)
print(res_random)

And the result I get is this:

Refute: Use a Placebo Treatment
Estimated effect: 0.0024757317973681447
New effect: -1.9585700110526786e-05
p value: 0.91

Am I interpreting this correctly that the test failed because, after replacing the treatment with a placebo, the causal estimate is significantly different from the true estimate, which is not expected? I wonder whether this interpretation is correct given that my treatments are classes. Can you clarify?

PS: I ran the same dataset through:

Bootstrap Validation
Data Subsets Validation
Add Random Common Cause

and I am getting p-values > 0.05, which tells me the model passed. Again, given that my treatments are classes, could this actually be an incorrect interpretation?
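For context, the calls for those three refuters look roughly like this (a sketch; the method names are DoWhy's refuter identifiers, and optional arguments such as subset_fraction may vary by version):

res_bootstrap = model.refute_estimate(identified_estimand, dml_estimate,
                                      method_name="bootstrap_refuter")
res_subset = model.refute_estimate(identified_estimand, dml_estimate,
                                   method_name="data_subset_refuter",
                                   subset_fraction=0.8)
res_common_cause = model.refute_estimate(identified_estimand, dml_estimate,
                                         method_name="random_common_cause")
print(res_bootstrap, res_subset, res_common_cause, sep="\n")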

amit-sharma commented 1 year ago

Hey @titubs. Under the placebo treatment refuter, the treatment values in the dataset are randomized, so the expected effect from a good estimator is 0, and your method returns exactly that. The test has therefore failed to refute your estimate (also evident from the p-value > 0.05); in other words, the test is unable to invalidate your method. To interpret the refuters, you can refer to the section on refutations in DoWhy's user guide.
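A rough conceptual sketch of what the placebo refuter does (simplified, not the library's actual implementation; estimate_effect_fn stands in for whatever re-runs your estimator on a dataframe):

import numpy as np

def placebo_check(df, estimate_effect_fn, treatment_col, n_runs=100, seed=0):
    """Shuffle the treatment column and re-estimate the effect each time.
    With the treatment randomized, any real causal link is broken, so a
    well-behaved estimator should return effects close to zero."""
    rng = np.random.default_rng(seed)
    effects = []
    for _ in range(n_runs):
        df_placebo = df.copy()
        df_placebo[treatment_col] = rng.permutation(df_placebo[treatment_col].values)
        effects.append(estimate_effect_fn(df_placebo))
    return float(np.mean(effects))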

Two treatment variables: from your code, it is not clear which causal effect you are estimating. Do you have the values of treatment A and treatment B in the same column, or in different columns? Sharing a reproducible example would help.
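For example, a minimal reproducible sketch with both classes in a single treatment column could look like this (all variable names and coefficients here are made up for illustration):

import numpy as np
import pandas as pd
from dowhy import CausalModel

rng = np.random.default_rng(0)
n = 1000
w0 = rng.normal(size=n)                 # common cause
x0 = rng.normal(size=n)                 # effect modifier
t = rng.integers(0, 3, size=n)          # 0 = control, 1 = treatment A, 2 = treatment B (random here for simplicity)
y = 2.0 * (t == 1) + 3.5 * (t == 2) + w0 + rng.normal(size=n)
df = pd.DataFrame({"Y": y, "T": t, "W0": w0, "X0": x0})

model = CausalModel(data=df, treatment="T", outcome="Y",
                    common_causes=["W0"], effect_modifiers=["X0"])
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
# estimate_effect would then be called as in your snippet, with control_value=0
# and treatment_value=1 (or 2) selecting which class is compared to the control.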

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 14 days with no activity.

github-actions[bot] commented 12 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.