py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

How to use known treatment probabilities in doubly robust learners #730

Open kyleco opened 1 year ago

kyleco commented 1 year ago

@kbattocchi

Hi Keith,

How would you recommend handling a case where we know the true treatment probabilities? I'd prefer to use them to avoid having to fit the model_propensity (in a doubly robust model, say ForestDRLearner).

A few options:

  1. Pass the (inverse) probabilities as sample_weight to fit. But then we need to choose something for model_propensity, perhaps just a dummy classifier?
  2. Create a trivial model_propensity that takes the probability as a feature and returns that same probability. But then we need some workaround to prevent the model_regression from using the probability as a feature (because DRLearner will always pass X, W to both model_propensity and model_regression). Maybe we can use an sklearn pipeline with a transformer for this.

Thanks! Kyle

kbattocchi commented 1 year ago

If the probabilities are the same for every instance, then I'd just use sklearn.dummy.DummyClassifier(), which uses the 'prior' strategy by default and thus will output the empirical probability as the result of predict_proba.
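For the constant-probability case, here is a quick illustration of that default behavior (the data below is made up purely for demonstration): with the default 'prior' strategy, predict_proba returns the empirical class frequencies for every row.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy data: 30 of 100 units treated
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
T = np.array([0] * 70 + [1] * 30)

clf = DummyClassifier()  # strategy='prior' is the default
clf.fit(X, T)
proba = clf.predict_proba(X)
# Every row of proba is [0.7, 0.3], the empirical class frequencies of T
```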

If the probabilities vary but are known, I think adding them as the last column of W and then using make_column_transformer(('passthrough', -1)) as your propensity model should be fine, even without doing anything to filter that column from the regression model's input (without giving it a ton of thought, I can't see how knowing the true probability should bias the regression). But if you really want to filter it, it should also be possible to use a column transformer that drops the column instead of passing it through, and to pipeline that transformer with your real model (note, though, that we concatenate X, W, and one-hotted-first-column-removed T as inputs to the regression model, so the column to drop is no longer the very last one).
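One way to realize the varying-probability suggestion is a tiny sklearn-style classifier that simply passes the stored probability through. This is a sketch, not part of EconML or sklearn: the class name is hypothetical, and it assumes a binary treatment whose known P(T=1) has been appended as the last column of W (and hence is the last column of the concatenated (X, W) array that DRLearner hands to model_propensity).

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class KnownPropensityClassifier(BaseEstimator, ClassifierMixin):
    """Return the known P(T=1) stored in the LAST column of the input.

    Hypothetical helper for a binary treatment: fit learns nothing, it
    only records the observed classes; predict_proba reads the known
    probability out of the last input column.
    """

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def predict_proba(self, X):
        p = np.asarray(X)[:, -1]
        return np.column_stack([1.0 - p, p])

    def predict(self, X):
        return (self.predict_proba(X)[:, 1] >= 0.5).astype(int)

# Hypothetical usage with EconML (not run here; names are illustrative):
# W_aug = np.column_stack([W, known_probs])
# est = ForestDRLearner(model_propensity=KnownPropensityClassifier(), ...)
# est.fit(y, T, X=X, W=W_aug)
```

The nuisance regression would still see the extra column unless you additionally drop it, as discussed above.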

kyleco commented 1 year ago

This is for varying probabilities. Using make_column_transformer and W is helpful (to avoid having the probabilities in the CATE model, which doesn't make sense).

nmwitzig commented 8 months ago

@kyleco @kbattocchi This sounds exactly like the problem I'm having. Do you have a code snippet available where you implemented the make_column_transformer(('passthrough', -1)) workaround? I want to include varying but known treatment probabilities in a ForestDRLearner. Thank you!