py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

Will DRPolicyForest control for confounders when treatments occurred under different seasons? #810

Open titubs opened 1 year ago

titubs commented 1 year ago

@kbattocchi Hi Keith, I wanted to understand how the DRPolicyForest would behave if I had two treatments, where one treatment's randomized data was collected during December and the other's during May.

If I were to specify confounders such as seasonal factors, would the DRPolicyForest be able to control for differences in seasons between the two treatments and make them more comparable?

For example:

We cannot compare $5 with $8 directly due to seasonal confounders.

Is there a way with the DRPolicyForest to show what the adjusted figures for revenue per user would look like? I.e., User A's observed Treatment 1 generated $4 in revenue, but User A's de-biased revenue under Treatment 1 would be $3 (after controlling for confounders)?

Is that possible? I am only able to get the counterfactual effect values (model.predict_value(X_test)) for all treatments except the observed one. Or is there some way to use the counterfactual values to showcase this point?
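
For concreteness, here is a minimal simulated sketch of the setup I have in mind (variable names and data are made up; the DRPolicyForest calls follow the econml policy-learner interface):

```python
# Hypothetical simulation: two treatment arms, each randomized in a different
# season, pooled into one training set. Note that arms 1 and 2 never overlap
# within a season, which is the crux of the question.
import numpy as np
from econml.policy import DRPolicyForest

rng = np.random.default_rng(0)
n = 2000

season = rng.integers(0, 2, size=n)                     # 0 = May, 1 = December
temperature = np.where(season == 1,
                       rng.normal(40, 5, n),            # December temperatures
                       rng.normal(75, 5, n))            # May temperatures

# Randomized assignment, but each non-control arm only appears in one season.
T = np.where(season == 1,
             rng.choice([0, 1], size=n),                # December: control vs. treatment 1
             rng.choice([0, 2], size=n))                # May: control vs. treatment 2

# Simulated revenue per user, with a seasonal (temperature) component.
Y = 10 + 0.05 * temperature + 1.5 * (T == 1) + 2.0 * (T == 2) + rng.normal(0, 1, n)

X = np.column_stack([season, temperature])              # features available to the policy

est = DRPolicyForest()
est.fit(Y, T, X=X)

recommended = est.predict(X)     # recommended treatment per user
values = est.predict_value(X)    # counterfactual value estimates per candidate treatment
```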

kbattocchi commented 1 year ago

I'm afraid I'm not fully sure I understand what you're asking. It sounds like since you have (different) randomized treatments in each period, you can compute the treatment effect in each period, but that you worry that the treatment effect from one season can't be assumed to apply to the other. This seems like a very reasonable concern.

If you are willing to assume that "season" acts on the outcome via some other measured covariates (say, temperature and rainfall), then if you include those covariates in W or X (depending on whether you think the season affects only the outcome or possibly also the strength of the effect) and fit a model on the combined data then the model could take those factors into account. However, I would be concerned that there might not be enough in-season variation in these factors to extrapolate from one season to another, and also that "seasonal" factors might include a whole host of other things that you haven't measured (e.g. maybe there are more tourists in town at one time than another). So I wouldn't take any such estimates very seriously unless you're confident that you understand the mechanism by which the season affects the outcomes, that you have measured all necessary factors, and that those factors affect the outcome in a way which an ML model will be able to generalize from one season to the other.
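
As a rough sketch of that W-versus-X distinction (reusing the hypothetical simulated variables from the snippet above; this is illustrative only, not a recommendation of either choice):

```python
# Reusing the simulated Y, T, season, temperature arrays from the earlier sketch.
import numpy as np
from econml.policy import DRPolicyForest

weather = temperature.reshape(-1, 1)       # measured seasonal factor(s)
policy_features = season.reshape(-1, 1)    # whatever the learned policy may split on

# Option 1: weather affects only the outcome level -> pass it in W.
# The doubly robust nuisance models adjust for it, but the learned policy
# itself does not condition on it.
est_w = DRPolicyForest()
est_w.fit(Y, T, X=policy_features, W=weather)

# Option 2: weather may also change the strength of the effect -> include it
# in X, so the forest can let the recommended treatment vary with weather.
est_x = DRPolicyForest()
est_x.fit(Y, T, X=np.column_stack([policy_features, weather]))
```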

titubs commented 1 year ago

@kbattocchi Thank you and yes, you understood this question correctly. To make sure I understood your reply, let me repeat it back:

Can you confirm/clarify the above?

Furthermore, the issue with the unobserved confounders (3rd bullet) is "we don't know what we don't know", right? Is there any test that you would recommend to check how "good" the adjusted treatment effect is (from the DR model), given that I have ground truth for both experiments (since those were randomized trials)? In other words, do you have any idea how we could leverage the ground truth (for each variant) to check how well the model controls for confounders? Is this possible? I want to get a sense of whether we can trust the model in cases where we train the policy on randomized data that came from different periods/seasons.

kbattocchi commented 1 year ago

For your second bullet, imagine, for example, that ice cream demand is high when temperatures are above 50 degrees Fahrenheit and low when temperatures are below 50 degrees. Then if summer temperatures are always above 50 and winter temperatures are always below 50 then any attempt to extrapolate from one season to the other will fail. So you either need to ensure that there is overlap (the same situations occur in both seasons, but maybe with different frequencies), or that the functional relationship varies smoothly in a way that you can safely extrapolate from one season to the other.
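
A toy simulation of that failure mode (the numbers and choice of regressor are purely illustrative): a model fit only on summer temperatures can have essentially zero in-sample error and still be badly wrong when extrapolated to winter.

```python
# Demand is a step function of temperature at 50F; summer data only contains
# temperatures above 50, so a flexible model fit on summer data looks perfect
# in-sample yet extrapolates incorrectly to winter temperatures.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

summer_temp = rng.uniform(55, 90, 500)             # no summer days below 50F
winter_temp = rng.uniform(20, 45, 500)             # no winter days above 50F


def true_demand(t):
    return np.where(t > 50, 100.0, 10.0)


model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(summer_temp.reshape(-1, 1), true_demand(summer_temp))

summer_err = np.abs(model.predict(summer_temp.reshape(-1, 1)) - true_demand(summer_temp)).mean()
winter_err = np.abs(model.predict(winter_temp.reshape(-1, 1)) - true_demand(winter_temp)).mean()
print(summer_err)   # ~0: zero in-sample error
print(winter_err)   # ~90: the model predicts ~100 where true winter demand is 10
```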

For your last question, I don't think it's possible, both because of the extrapolation issue (your Y(treatment, weather) model can have zero error but still not generalize correctly out-of-domain) and because, if the Y model does not fit perfectly, there's no way to tell whether this is due to harmless random noise or to a missing confounder that would actually lead you to compute a different treatment effect if accounted for.