py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
https://www.pywhy.org/dowhy
MIT License

Mediation analysis and (likely) IV bug #946

Open jcreinhold opened 1 year ago

jcreinhold commented 1 year ago

Describe the bug

When no second-stage regression model is provided and the estimator creates one itself, a piece of code appears to overwrite the backdoor method setting and the backdoor variables that are set directly above it. I can't imagine this is anything other than a typo, but please correct me if I'm mistaken. I'm happy to provide more details as needed.

Steps to reproduce the behavior

  1. Create a causal model with nonparametric-nie as the estimand type and mediation.two_stage_regression as the method.
  2. Run estimate_effect.
  3. See error in second-stage fitting. (See Additional context for the error message.)

Expected behavior

The estimation completes without errors.

Version information:

Additional context

Error that's produced:

.../dowhy/causal_model.py in estimate_effect(self, identified_estimand, method_name, control_value, treatment_value, test_significance, evaluate_effect_strength, confidence_intervals, target_units, effect_modifiers, fit_estimator, method_params)
    316                 self._estimator_cache[method_name] = causal_estimator
    317 
--> 318         return estimate_effect(
    319             self._data,
    320             self._treatment,
.../dowhy/causal_estimator.py in estimate_effect(data, treatment, outcome, identifier_name, estimator, control_value, treatment_value, target_units, effect_modifiers, fit_estimator, method_params)
    709 
    710     if fit_estimator:
--> 711         estimator.fit(
    712             data=data,
    713             treatment_name=treatment,
.../dowhy/causal_estimators/two_stage_regression_estimator.py in fit(self, data, treatment_name, outcome_name, effect_modifier_names, **_)
    225             self._second_stage_model._target_estimand.treatment_variable = parse_state(self._mediators_names)
    226 
--> 227         self._second_stage_model.fit(
    228             data,
    229             parse_state(self._second_stage_model._target_estimand.treatment_variable),
.../dowhy/causal_estimators/linear_regression_estimator.py in fit(self, data, treatment_name, outcome_name, effect_modifier_names)
     87                     methods support this currently.
     88         """
---> 89         return super().fit(data, treatment_name, outcome_name, effect_modifier_names=effect_modifier_names)
     90 
     91     def construct_symbolic_estimator(self, estimand):
.../dowhy/causal_estimators/regression_estimator.py in fit(self, data, treatment_name, outcome_name, effect_modifier_names)
     90         self._set_effect_modifiers(effect_modifier_names)
     91 
---> 92         self.logger.debug("Back-door variables used:" + ",".join(self._target_estimand.get_backdoor_variables()))
     93         self._observed_common_causes_names = self._target_estimand.get_backdoor_variables()
     94         if len(self._observed_common_causes_names) > 0:
.../dowhy/causal_identifier/identified_estimand.py in get_backdoor_variables(self, key)
     57                 return self.backdoor_variables[self.identifier_method]
     58             elif self.backdoor_variables is not None and len(self.backdoor_variables) > 0:
---> 59                 return self.backdoor_variables[self.default_backdoor_id]
     60             else:
     61                 return []
KeyError: None
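
The last frame suggests the lookup fails because default_backdoor_id is None while backdoor_variables is a non-empty dict. A minimal stand-in illustrates this; it is not DoWhy's actual class. The name EstimandStandIn and the condition on the first branch (which the traceback does not show) are assumptions made for illustration only.

```python
class EstimandStandIn:
    """Hypothetical stand-in mirroring the branches visible in the traceback."""

    def __init__(self, backdoor_variables, default_backdoor_id=None, identifier_method=None):
        self.backdoor_variables = backdoor_variables
        self.default_backdoor_id = default_backdoor_id
        self.identifier_method = identifier_method

    def get_backdoor_variables(self):
        # Assumed condition: the real first branch is not shown in the traceback.
        if self.identifier_method is not None and self.identifier_method.startswith("backdoor"):
            return self.backdoor_variables[self.identifier_method]
        elif self.backdoor_variables is not None and len(self.backdoor_variables) > 0:
            # With default_backdoor_id left as None, this indexes the dict with None.
            return self.backdoor_variables[self.default_backdoor_id]
        else:
            return []


# Backdoor variables exist, but no default backdoor id has been set.
estimand = EstimandStandIn(backdoor_variables={"backdoor1": ["W0", "W1"]})
try:
    estimand.get_backdoor_variables()
    error_repr = None
except KeyError as exc:
    error_repr = repr(exc)

print(error_repr)  # KeyError(None)
```

If this reading is right, anything that clears or never sets the default backdoor id on the second-stage estimand would trigger the same error.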
jcreinhold commented 1 year ago

As a workaround for anyone facing this issue, setting method_params (for estimate_effect) to the following seems to work and to replicate the originally intended functionality:

from dowhy.causal_estimators.linear_regression_estimator import LinearRegressionEstimator

method_params = {"init_params": {"second_stage_model": LinearRegressionEstimator}}
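
For completeness, here is a sketch of how the workaround might be passed along. The helper name estimate_with_workaround is hypothetical, and the model and identified estimand are assumed to have been created as in the reproduction steps. The estimator class is taken as an argument so the helper itself does not import DoWhy.

```python
def estimate_with_workaround(model, identified_estimand, second_stage_cls):
    """Hypothetical helper: supply the second-stage estimator class explicitly."""
    # Same dict as in the workaround above, built from the class passed in.
    method_params = {"init_params": {"second_stage_model": second_stage_cls}}
    return model.estimate_effect(
        identified_estimand,
        method_name="mediation.two_stage_regression",
        method_params=method_params,
    )
```

It would be called as estimate_with_workaround(model, identified_estimand, LinearRegressionEstimator), with LinearRegressionEstimator imported as shown above.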