py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
https://www.pywhy.org/dowhy
MIT License

Clarification on implementation of double ML #68

Closed · j-chou closed this issue 3 years ago

j-chou commented 5 years ago

Hi, I have some questions regarding the project for implementing other causal estimation methods. https://github.com/microsoft/dowhy/projects/3#card-24479485

  1. For double ML, did you have in mind a particular model, like a random forest, for modeling the confounders, or did you want to allow the user to choose models and hyperparameters? Is cross-validation for these models important?
  2. Methods in econml seem to be aimed at estimating heterogeneous effects in the presence of high-dimensional complex confounders. Could you clarify what types of refutation/sensitivity methods you'd like to have for this particular case? I saw that there's a card about implementing methods for evaluating matching, but I'm not sure whether you had something else in mind for sensitivity analysis for methods like orthogonal random forest or double ML in the econml project.

Thanks!

amit-sharma commented 5 years ago

Thanks for starting this discussion, @j-chou. These are important questions; I share my thoughts below.

  1. For double ML, ideally we would like to allow the user to choose models and hyperparameters. However, given that econml already implements double ML, it might make the most sense to integrate with econml and call their implementation from within DoWhy. On cross-validation, that's a great point. The standard double ML method does not discuss cross-validation for the nuisance ML models, and I think that's a necessary step for choosing suitable predictors (although it does use cross-fitting for the final estimate to keep it unbiased).
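As a rough illustration of that plan (this is not DoWhy code; the data is synthetic and the `LinearDML` API is assumed from econml's documentation), double ML with user-chosen nuisance models and cross-fitting could look like:

```python
# Sketch: double ML via econml's LinearDML with user-chosen nuisance models
# and cross-fitting (synthetic data; API assumed from econml's docs).
import numpy as np
from econml.dml import LinearDML
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=(n, 5))                          # observed confounders
T = (W[:, 0] + rng.normal(size=n) > 0).astype(int)   # binary treatment
Y = 2.0 * T + W[:, 0] + rng.normal(size=n)           # outcome; true ATE = 2

est = LinearDML(
    model_y=RandomForestRegressor(min_samples_leaf=20),   # user's outcome model
    model_t=RandomForestClassifier(min_samples_leaf=20),  # user's treatment model
    discrete_treatment=True,
    cv=5,                 # cross-fitting folds for the nuisance models
    random_state=0,
)
est.fit(Y, T, X=None, W=W)
print(est.ate())          # should be close to the true ATE of 2
```

On the cross-validation point, one could also pass `GridSearchCV`-wrapped estimators as `model_y`/`model_t`, so that nuisance-model hyperparameters are tuned within each fold.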

  2. I can think of a few refutations that are relevant whenever a causal inference (CI) method conditions on high-dimensional confounders:

a) Identification: There is a tendency to include all known variables as confounders. What if one of the "confounders" is actually an instrument? This could be implemented by choosing a confounder variable, moving it in the causal graph to be an instrument, and then rerunning the CI method. It can be especially useful if the user has a good guess about which variables might be candidates for being an instrument. Interestingly, the opposite operation, simply removing a confounder rather than adding another one, could be useful even for refuting the average treatment effect.

b) Estimation: Many of these CI methods themselves depend on complex ML models. So perturbing the hyperparameters of these models, or even simply resetting the random seed, can be a useful way to check the sensitivity of the estimates. Of course, we might want to find an efficient way of rerunning these CI methods, because many of them can take a long time to execute.
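A minimal sketch of such a seed-perturbation check (`fit_and_estimate` is a hypothetical callable that wraps whatever CI method is being refuted):

```python
# Sketch of a seed-perturbation refuter: re-run the CI method with different
# random seeds and report the spread of the resulting effect estimates.
import numpy as np

def seed_sensitivity(fit_and_estimate, seeds=range(10)):
    """`fit_and_estimate(seed)` is a hypothetical callable that fits the
    CI method with the given random seed and returns an effect estimate."""
    estimates = np.array([fit_and_estimate(seed) for seed in seeds])
    return estimates.mean(), estimates.std()

# Usage (hypothetical estimator):
# mean, spread = seed_sensitivity(lambda s: my_cate_method(data, random_state=s))
```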

c) Another refutation could be through an independent average treatment effect estimator. Given a set of disjoint subsets on which heterogeneous effects are estimated, their weighted combination can be used to derive an estimate for the ATE. This estimate should ideally match the one from another ATE method.
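A minimal sketch of this consistency check, assuming CATE estimates are already available on disjoint subgroups (all numbers below are illustrative):

```python
# Sketch: combine subgroup CATE estimates, weighted by subgroup size, into an
# implied ATE, then compare it to an independently estimated ATE.
import numpy as np

def ate_from_cates(cates, group_sizes):
    """cates: CATE estimate per disjoint subgroup; group_sizes: units per subgroup."""
    weights = np.asarray(group_sizes, dtype=float) / np.sum(group_sizes)
    return float(np.dot(weights, cates))

implied_ate = ate_from_cates(cates=[1.5, 2.0, 2.5], group_sizes=[100, 300, 600])
# ...then check that implied_ate is close to, e.g., a regression or
# propensity-score ATE estimate on the full sample.
```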

d) In addition, for any given heterogeneous treatment effect, the refutation methods already in DoWhy still apply. For example, we could take the subgroup on which a conditional ATE is estimated, artificially make the treatment random for that subgroup (which makes the true conditional ATE zero by construction), and then rerun the CI method for the conditional ATE.
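A minimal sketch of this placebo operation (the dataset and column names are hypothetical):

```python
# Sketch: permute the treatment within the chosen subgroup so the true
# conditional ATE there is zero by construction; the re-estimated CATE
# for that subgroup should then be close to zero.
import numpy as np

def placebo_treatment_in_subgroup(df, subgroup_mask, treatment_col, seed=0):
    """Return a copy of df with treatment randomly permuted inside the subgroup."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    sub = out.loc[subgroup_mask, treatment_col].to_numpy()
    out.loc[subgroup_mask, treatment_col] = rng.permutation(sub)
    return out

# Usage (hypothetical data/columns):
# placebo_df = placebo_treatment_in_subgroup(data, data["age"] > 50, "treated")
# ...re-run the conditional-ATE estimator on placebo_df and expect ~0.
```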

These are some refutations that I'm thinking about; I'm sure there are others that could also be important. Would love to know if you have any ideas.

j-chou commented 5 years ago

Thanks for the response and sorry for the late reply!

  1. Sounds good. I'll look into calling econml's double ML implementation.

  2. I really like the general approach of perturbing the DAG for sensitivity analysis, as you suggest. Perhaps, for a start, we could implement a set of functions for the basic graph operations your refutations imply (a rough sketch follows below):

  * moving a variable from confounder to instrument
  * removing a confounder from the graph
  * adding a new confounder to the graph

Are there other graph operations you think would be good to include?
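For concreteness, here is roughly what the three operations above could look like if the causal graph were held as a `networkx.DiGraph` (DoWhy's internal graph representation may differ):

```python
# Sketch of the three graph operations on a networkx DiGraph; a DoWhy
# integration would operate on DoWhy's own causal-graph object instead.
import networkx as nx

def confounder_to_instrument(g, var, outcome):
    """Turn `var` from a confounder into an instrument by deleting its
    direct edge to the outcome (leaving only var -> treatment)."""
    g2 = g.copy()
    g2.remove_edge(var, outcome)
    return g2

def remove_confounder(g, var):
    """Drop a confounder and all its edges from the graph."""
    g2 = g.copy()
    g2.remove_node(var)
    return g2

def add_confounder(g, var, treatment, outcome):
    """Add a new common cause of treatment and outcome."""
    g2 = g.copy()
    g2.add_edges_from([(var, treatment), (var, outcome)])
    return g2

# Usage on a toy graph: Z and U are confounders of T -> Y.
g = nx.DiGraph([("Z", "T"), ("Z", "Y"), ("U", "T"), ("U", "Y"), ("T", "Y")])
g_iv = confounder_to_instrument(g, "Z", outcome="Y")   # Z becomes an instrument
g_rm = remove_confounder(g, "U")
g_add = add_confounder(g, "V", treatment="T", outcome="Y")
```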

amit-sharma commented 5 years ago

Thanks @j-chou. The three operations you suggest are great, and we should definitely try to add them to DoWhy. In addition, @emrekiciman and I have been discussing the broader issue of conditional effects (e.g., CATE) and how to incorporate them in the current DoWhy API. We would like to support different CATE methods in addition to double ML, while also ensuring a simple API that works for all such methods.

I am preparing a document that lays down the specs for a general (updated) API for DoWhy. I will include the three functions you suggest above, but also frame them within the general setup for CATE, ATE, and ATT estimation and how to refute them.

Would you like to help us arrive at the right specs for the API? One option is that I start a Wiki article on GitHub that we can all comment on.

j-chou commented 5 years ago

@amit-sharma Would love to help out with the API. A Wiki article sounds great!

amit-sharma commented 5 years ago

@j-chou thanks for your patience. I have added a wiki roadmap here: https://github.com/microsoft/dowhy/wiki/Roadmap

I would really appreciate your feedback on it.

nsalas24 commented 5 years ago

First, thanks for open-sourcing this package; I've learned a lot from it!

To add to the discussion regarding conditional effects: https://github.com/uber/causalml appears promising in terms of 1) implementing a variety of meta-learners for estimating heterogeneous treatment effects and 2) flexibility in model parameterization. Perhaps inspiration can be drawn from it as well as from EconML.

As mentioned, I think cross-validation is vital for any would-be user of the meta-learners, and the authors of the R-learner implement it (https://github.com/xnie/rlearner/tree/master) in a small R package.

Something @amit-sharma brought up that I think would be a great refutation to add for these methods is perturbing the meta-learner's hyperparameters to measure the change in the distribution of CATEs, the change in ATE, etc. I don't see this in the project roadmap; is this functionality worth adding?

amit-sharma commented 4 years ago

Thanks for your comment @nsalas24 and sorry for the super late reply--somehow missed responding here. I've just integrated the metalearners from econml into DoWhy so you can directly call a metalearner as shown here: https://github.com/microsoft/dowhy/blob/master/docs/source/example_notebooks/dowhy-conditional-treatment-effects.ipynb
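For anyone finding this later, a minimal sketch of the call pattern, roughly following the linked notebook (the dataset and column names below are made up for illustration):

```python
# Sketch: calling an econml metalearner through DoWhy's estimate_effect,
# per the dowhy-conditional-treatment-effects notebook (synthetic data).
import numpy as np
import pandas as pd
from dowhy import CausalModel
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data (column names are hypothetical).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"age": rng.normal(40, 10, n), "income": rng.normal(50, 15, n)})
df["treated"] = (df["age"] / 40 + rng.normal(size=n) > 1).astype(int)
df["revenue"] = 10 * df["treated"] + 0.5 * df["income"] + rng.normal(size=n)

model = CausalModel(
    data=df,
    treatment="treated",
    outcome="revenue",
    common_causes=["age", "income"],
    effect_modifiers=["age"],   # variables over which the CATE varies
)
identified = model.identify_effect(proceed_when_unidentifiable=True)
estimate = model.estimate_effect(
    identified,
    method_name="backdoor.econml.metalearners.TLearner",
    method_params={"init_params": {"models": GradientBoostingRegressor()},
                   "fit_params": {}},
)
print(estimate.value)
```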

It would be great to add refutations based on cross-validation or parameter perturbation. Would you be interested in contributing?

nsalas24 commented 4 years ago

Hey @amit-sharma,

On the surface, the integration with EconML looks great. It looks like they've introduced several new 'categories' of estimation methods that DoWhy can now call upon. So yes, I can start working on building a new refutation method to test the consistency of CATEs from some of these more complex estimation methods. I think it makes the most sense to start with a simple cross-validation or random-seed permutation, since some of these methods actually invoke several supervised models (e.g., XLearner), making a hyperparameter space search a bit prohibitive.

amit-sharma commented 4 years ago

Sounds great @nsalas24. Yes, it makes sense to start with simple cross-validation or permutation tests. Feel free to ping me if you have any questions as you work on the refuter.

amit-sharma commented 3 years ago

Closing due to inactivity. @nsalas24, if you'd still like to contribute refuters for CATE estimators, let me know.