py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
https://www.pywhy.org/dowhy
MIT License
7.12k stars, 934 forks

Are there alternatives to backdoor.linear_regression for multiple continuous treatment estimation? #517

Open AlexRossiATF opened 2 years ago

AlexRossiATF commented 2 years ago

Hi everyone, I am trying to compute effect estimates for multiple continuous treatments, something like this:

estimate_t1 = model.estimate_effect(identified_estimand, method_name="backdoor.linear_regression", control_value=(0,0,0,0), treatment_value = (1,0,0,0))

estimate_t2 = model.estimate_effect(identified_estimand, method_name="backdoor.linear_regression", control_value=(0,0,0,0), treatment_value = (0,1,0,0))

So far I have only managed to use linear regression, because every other method I tried gives an error. Are there other estimators that work for this type of task?
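For readers following along: with a linear outcome model, the effect of moving the treatments from control_value to treatment_value is the vector of fitted treatment coefficients dotted with the difference of the two tuples. The sketch below illustrates that idea with plain NumPy on synthetic data. It is not DoWhy's implementation; the data, the confounder W, and the helper linear_effect are all made up for illustration.

```python
import numpy as np

# Synthetic data (assumption): one observed confounder W and four continuous treatments.
rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=(n, 1))
T = 0.5 * W + rng.normal(size=(n, 4))          # every treatment depends on W
true_beta = np.array([2.0, -1.0, 0.5, 0.0])    # true per-unit treatment effects
Y = T @ true_beta + 3.0 * W[:, 0] + rng.normal(scale=0.1, size=n)

# Backdoor adjustment with a linear model: regress Y on [intercept, T, W].
design = np.column_stack([np.ones(n), T, W])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
beta_hat = coef[1:5]                           # coefficients of the four treatments


def linear_effect(beta, control_value, treatment_value):
    """Effect of moving all treatments from control_value to treatment_value (hypothetical helper)."""
    delta = np.asarray(treatment_value, dtype=float) - np.asarray(control_value, dtype=float)
    return float(beta @ delta)


effect_t1 = linear_effect(beta_hat, (0, 0, 0, 0), (1, 0, 0, 0))
effect_t2 = linear_effect(beta_hat, (0, 0, 0, 0), (0, 1, 0, 0))
print(effect_t1, effect_t2)
```

Under this linear assumption each unit-vector treatment_value simply picks out one coefficient, which is why the two calls above differ only in which position holds the 1.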

amit-sharma commented 2 years ago

You can use the methods from EconML. I would recommend double/debiased ML (DML), which works with continuous treatments. You can look at an example here: https://py-why.github.io/dowhy/example_notebooks/tutorial-causalinference-machinelearning-using-dowhy-econml.html
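As a rough picture of what DML does with several continuous treatments: it first predicts the outcome and each treatment from the confounders on held-out folds, then regresses the outcome residuals on the treatment residuals. The sketch below is a hand-rolled NumPy version with linear nuisance models and 2-fold cross-fitting on synthetic data. It is only an illustration of the partialling-out idea, not EconML's implementation, and the helpers ols_fit, ols_predict, and dml_linear are hypothetical names.

```python
import numpy as np


def ols_fit(X, y):
    """Least-squares fit with an intercept; y may be 1-D or 2-D."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef


def ols_predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def dml_linear(Y, T, W, n_folds=2, seed=0):
    """Partialling-out DML with linear nuisance models and cross-fitting (illustrative only)."""
    n = len(Y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Y_res = np.empty_like(Y, dtype=float)
    T_res = np.empty_like(T, dtype=float)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance models are fit on the other folds and evaluated on this one.
        Y_res[test_idx] = Y[test_idx] - ols_predict(ols_fit(W[train], Y[train]), W[test_idx])
        T_res[test_idx] = T[test_idx] - ols_predict(ols_fit(W[train], T[train]), W[test_idx])
    # Final stage: regress outcome residuals on the treatment residuals.
    theta, *_ = np.linalg.lstsq(T_res, Y_res, rcond=None)
    return theta


# Synthetic data (assumption): two confounders, four continuous treatments.
rng = np.random.default_rng(1)
n = 4000
W = rng.normal(size=(n, 2))
T = W @ rng.normal(size=(2, 4)) + rng.normal(size=(n, 4))
true_theta = np.array([1.5, -0.5, 0.0, 2.0])
Y = T @ true_theta + W @ np.array([1.0, -2.0]) + rng.normal(scale=0.5, size=n)

theta_hat = dml_linear(Y, T, W)
print(theta_hat)
```

Note that the treatment nuisance model here predicts all four treatment columns at once, so whatever model fills that role has to accept a 2-D target.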

AlexRossiATF commented 2 years ago

I tried dml from the example:

estimate = model.estimate_effect(identified_estimand, method_name="backdoor.econml.dml.DML", method_params={ 'init_params': {'model_y':GradientBoostingRegressor(), 'model_t': GradientBoostingRegressor(), 'model_final':LassoCV(fit_intercept=False), }, 'fit_params': {} })

and I get the error below. It seems that it doesn't support multiple treatments:

[image: screenshot of the error message]

amit-sharma commented 2 years ago

You can refer to this table: https://econml.azurewebsites.net/spec/comparison.html

It should support multiple treatments. If you still face an error, can you post a minimal working example so we can debug it?

AlexRossiATF commented 2 years ago

Thanks for your help. I tried these methods but I always get errors. Here are two minimal examples, one with DML and one with CausalForestDML: the first gives a dimension error and the second an error about X=None. Graph: [image]

treatment = ['Lockdown', 'COVID_Stringency_Index_14', 'Weekly', 'Holidays']

model = CausalModel(data = df, treatment = treatment, outcome = 'Offline_Sales', graph = "DAG_temp.dot", target_units = 'ate', evaluate_effect_strength = True)

identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)

1st Example: DML

estimate = model.estimate_effect(identified_estimand, method_name="backdoor.econml.dml.DML", control_value = (0,0,0,0), treatment_value = (1,0,0,0), method_params={"init_params":{'model_y':GradientBoostingRegressor(), 'model_t': GradientBoostingRegressor(), "model_final":LassoCV(fit_intercept=True), 'featurizer':PolynomialFeatures(degree=2, include_bias=True) }, "fit_params":{}})

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_17291/2072210037.py in <module>
     17 'featurizer':PolynomialFeatures(degree=2, include_bias=True)
     18 },
---> 19 "fit_params":{}})

~/pywhy/dowhy/dowhy/causal_model.py in estimate_effect(self, identified_estimand, method_name, control_value, treatment_value, test_significance, evaluate_effect_strength, confidence_intervals, target_units, effect_modifiers, fit_estimator, method_params)
    314 target_units)
    315
--> 316 estimate = self.causal_estimator.estimate_effect()
    317 # Store parameters inside estimate object for refutation methods
    318 # TODO: This add_params needs to move to the estimator class

~/pywhy/dowhy/dowhy/causal_estimator.py in estimate_effect(self)
    189 :returns: A CausalEstimate instance that contains point estimates of average and conditional effects. Based on the parameters provided, it optionally includes confidence intervals, standard errors,statistical significance and other statistical parameters.
    190 """
--> 191 est = self._estimate_effect()
    192 est.add_estimator(self)
    193

~/pywhy/dowhy/dowhy/causal_estimators/econml.py in _estimate_effect(self)
    115 if self.method_params["fit_params"] is not False:
    116 self.estimator.fit(**estimator_data_args,
--> 117 **self.method_params["fit_params"])
    118
    119 X_test = X

~/notebooks/my-env/lib/python3.7/site-packages/econml/dml/dml.py in fit(self, Y, T, X, W, sample_weight, freq_weight, sample_var, groups, cache_values, inference)
    504 sample_var=sample_var, groups=groups,
    505 cache_values=cache_values,
--> 506 inference=inference)
    507
    508 def refit_final(self, *, inference='auto'):

~/notebooks/my-env/lib/python3.7/site-packages/econml/dml/_rlearner.py in fit(self, Y, T, X, W, sample_weight, freq_weight, sample_var, groups, cache_values, inference)
    369 sample_weight=sample_weight, freq_weight=freq_weight, sample_var=sample_var, groups=groups,
    370 cache_values=cache_values,
--> 371 inference=inference)
    372
    373 def score(self, Y, T, X=None, W=None, sample_weight=None):

~/notebooks/my-env/lib/python3.7/site-packages/econml/_cate_estimator.py in call(self, Y, T, inference, *args, **kwargs)
    128 inference.prefit(self, Y, T, *args, **kwargs)
    129 # call the wrapped fit method
--> 130 m(self, Y, T, *args, **kwargs)
    131 self._postfit(Y, T, *args, **kwargs)
    132 if inference is not None:

~/notebooks/my-env/lib/python3.7/site-packages/econml/_ortho_learner.py in fit(self, Y, T, X, W, Z, sample_weight, freq_weight, sample_var, groups, cache_values, inference, only_final, check_input)
    635 for idx in range(self.mc_iters or 1):
    636 nuisances, fitted_models, new_inds, scores = self._fit_nuisances(
--> 637 Y, T, X, W, Z, sample_weight=sample_weight_nuisances, groups=groups)
    638 all_nuisances.append(nuisances)
    639 self._models_nuisance.append(fitted_models)

~/notebooks/my-env/lib/python3.7/site-packages/econml/_ortho_learner.py in _fit_nuisances(self, Y, T, X, W, Z, sample_weight, groups)
    766 nuisances, fitted_models, fitted_inds, scores = _crossfit(self._ortho_learner_model_nuisance, folds,
    767 Y, T, X=X, W=W, Z=Z,
--> 768 sample_weight=sample_weight, groups=groups)
    769 return nuisances, fitted_models, fitted_inds, scores
    770

~/notebooks/my-env/lib/python3.7/site-packages/econml/_ortho_learner.py in _crossfit(model, folds, *args, **kwargs)
    166 kwargs_test = {key: var[test_idxs] for key, var in kwargs.items()}
    167
--> 168 model_list[idx].fit(*args_train, **kwargs_train)
    169
    170 nuisance_temp = model_list[idx].predict(*args_test, **kwargs_test)

~/notebooks/my-env/lib/python3.7/site-packages/econml/dml/_rlearner.py in fit(self, Y, T, X, W, Z, sample_weight, groups)
     49 def fit(self, Y, T, X=None, W=None, Z=None, sample_weight=None, groups=None):
     50 assert Z is None, "Cannot accept instrument!"
---> 51 self._model_t.fit(X, W, T, **filter_none_kwargs(sample_weight=sample_weight, groups=groups))
     52 self._model_y.fit(X, W, Y, **filter_none_kwargs(sample_weight=sample_weight, groups=groups))
     53 return self

~/notebooks/my-env/lib/python3.7/site-packages/econml/dml/dml.py in fit(self, X, W, Target, sample_weight, groups)
     73 sample_weight=sample_weight)
     74 else:
---> 75 fit_with_groups(self._model, self._combine(X, W, Target.shape[0]), Target, groups=groups)
     76 return self
     77

~/notebooks/my-env/lib/python3.7/site-packages/econml/utilities.py in fit_with_groups(model, X, y, groups, **kwargs)
    897 model.cv = old_cv
    898
--> 899 return model.fit(X, y, **kwargs)
    900
    901

~/notebooks/my-env/lib/python3.7/site-packages/sklearn/ensemble/_gb.py in fit(self, X, y, sample_weight, monitor)
    418 sample_weight = _check_sample_weight(sample_weight, X)
    419
--> 420 y = column_or_1d(y, warn=True)
    421
    422 if is_classifier(self):

~/notebooks/my-env/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61 extra_args = len(args) - len(all_args)
     62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
     64
     65 # extra_args > 0

~/notebooks/my-env/lib/python3.7/site-packages/sklearn/utils/validation.py in column_or_1d(y, warn)
    921 raise ValueError(
    922 "y should be a 1d array, "
--> 923 "got an array of shape {} instead.".format(shape))
    924
    925

ValueError: y should be a 1d array, got an array of shape (208, 4) instead.

2nd Example: CausalForestDML

estimate = model.estimate_effect(identified_estimand, method_name="backdoor.econml.dml.CausalForestDML", control_value = (0,0,0,0), treatment_value = (1,0,0,0), method_params={"init_params":{}, "fit_params":{}})


ValueError                                Traceback (most recent call last)
/tmp/ipykernel_17291/431474604.py in <module>
     17 #'featurizer':PolynomialFeatures(degree=2, include_bias=True)
     18 },
---> 19 "fit_params":{}})

~/pywhy/dowhy/dowhy/causal_model.py in estimate_effect(self, identified_estimand, method_name, control_value, treatment_value, test_significance, evaluate_effect_strength, confidence_intervals, target_units, effect_modifiers, fit_estimator, method_params)
    314 target_units)
    315
--> 316 estimate = self.causal_estimator.estimate_effect()
    317 # Store parameters inside estimate object for refutation methods
    318 # TODO: This add_params needs to move to the estimator class

~/pywhy/dowhy/dowhy/causal_estimator.py in estimate_effect(self)
    189 :returns: A CausalEstimate instance that contains point estimates of average and conditional effects. Based on the parameters provided, it optionally includes confidence intervals, standard errors,statistical significance and other statistical parameters.
    190 """
--> 191 est = self._estimate_effect()
    192 est.add_estimator(self)
    193

~/pywhy/dowhy/dowhy/causal_estimators/econml.py in _estimate_effect(self)
    115 if self.method_params["fit_params"] is not False:
    116 self.estimator.fit(**estimator_data_args,
--> 117 **self.method_params["fit_params"])
    118
    119 X_test = X

~/notebooks/my-env/lib/python3.7/site-packages/econml/dml/causal_forest.py in fit(self, Y, T, X, W, sample_weight, groups, cache_values, inference)
    738 """
    739 if X is None:
--> 740 raise ValueError("This estimator does not support X=None!")
    741 return super().fit(Y, T, X=X, W=W,
    742 sample_weight=sample_weight, groups=groups,

ValueError: This estimator does not support X=None!
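A note on the first error for anyone landing here: the traceback ends inside scikit-learn's GradientBoostingRegressor, which only accepts a 1-D target, while the four treatments form a (208, 4) array. One common workaround in scikit-learn is to wrap the single-output model in MultiOutputRegressor, which fits one copy per target column. The sketch below reproduces the shape problem on synthetic data and shows the wrapper. Whether passing such a wrapper as 'model_t' through DoWhy's init_params resolves the issue in this exact setup is an assumption that has not been verified in this thread.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-ins (assumption): 208 rows, 3 confounders, 4 continuous treatments.
rng = np.random.default_rng(0)
W = rng.normal(size=(208, 3))
T = W @ rng.normal(size=(3, 4)) + rng.normal(size=(208, 4))

# A plain GradientBoostingRegressor rejects a 2-D target.
try:
    GradientBoostingRegressor().fit(W, T)
    single_model_error = None
except ValueError as exc:
    single_model_error = str(exc)
print("single-output model:", single_model_error)

# MultiOutputRegressor fits one regressor per treatment column.
multi_model_t = MultiOutputRegressor(
    GradientBoostingRegressor(n_estimators=20, random_state=0)
)
multi_model_t.fit(W, T)
pred_shape = multi_model_t.predict(W).shape
print("multi-output predictions:", pred_shape)
```

The second error is a separate matter: CausalForestDML refuses to fit when X is None, that is, when no effect modifiers are supplied. The estimate_effect signature shown in the traceback has an effect_modifiers argument, so specifying effect modifiers is the likely place to look, though that is also untested here.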

SaiAditya2595 commented 2 years ago

> you can use the methods from econml. I would recommend double/debiased ML (DML) that works with continuous treatments. You can look at an example here: https://py-why.github.io/dowhy/example_notebooks/tutorial-causalinference-machinelearning-using-dowhy-econml.html

I am not able to find this page. Can you please check and update the link? Thanks, Sai Aditya

amit-sharma commented 2 years ago

The updated link is https://py-why.github.io/dowhy/v0.8/example_notebooks/tutorial-causalinference-machinelearning-using-dowhy-econml.html