Minyus / causallift

CausalLift: Python package for causality-based Uplift Modeling in real-world business
https://causallift.readthedocs.io/

CATE vs Propensity #6

Closed soliverc closed 4 years ago

soliverc commented 4 years ago

I am following the tutorial and have generated two columns: CATE and Propensity. The tutorial recommends selecting users with a high uplift score, which is the CATE.

Is the Propensity column of any use to us, or can I just disregard it? The propensity is always a positive number, while the CATE is sometimes negative. I'm not sure how to interpret the scores when this happens.

Minyus commented 4 years ago

The propensity score is used to compute the CATE (uplift score). The propensity score is the estimated probability of being treated, so it ranges from 0 to +1. The CATE is a difference of two probability values, so it ranges from -1 to +1.

You can find an explanation at: https://github.com/Minyus/causallift/blob/develop/README.md
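
To make the two ranges concrete, here is a minimal sketch, assuming the CATE is taken as the difference between a predicted outcome probability with treatment and one without. The function name `cate_from_probabilities` and the example users are hypothetical illustrations, not part of the CausalLift API.

```python
def cate_from_probabilities(p_if_treated, p_if_untreated):
    """Return the uplift as a difference of two outcome probabilities.

    Hypothetical helper for illustration only; not a CausalLift function.
    """
    for p in (p_if_treated, p_if_untreated):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_if_treated - p_if_untreated


# Made-up users: both are likely to be treated (same propensity),
# but the treatment helps one and hurts the other.
users = [
    {"name": "persuadable", "propensity": 0.8, "p_treated": 0.6, "p_untreated": 0.2},
    {"name": "sleeping_dog", "propensity": 0.8, "p_treated": 0.1, "p_untreated": 0.5},
]

for user in users:
    user["cate"] = cate_from_probabilities(user["p_treated"], user["p_untreated"])
    print(user["name"], "propensity:", user["propensity"], "cate:", round(user["cate"], 3))
```

A positive propensity together with a negative CATE is not a contradiction: the first says how likely the user was to receive the treatment, the second says the treatment is estimated to lower that user's outcome probability.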

Jami1141 commented 4 years ago

Is it necessary to calculate the propensity? I have A/B test data, so I know which samples were treated and which were not. I plan to use the CausalLift model later for predictions on new data. If I do not need the propensity now because I use an A/B test, do I need it for prediction? Could you explain what the propensity is for and what it means?

Thanks

Minyus commented 4 years ago

For A/B test (RCT) data, propensity score estimation is not needed, so you can set enable_ipw to False:

CausalLift(train_df, test_df, enable_ipw=False)

For observational data (data not from an A/B test or RCT), treatment was presumably assigned with a different probability (the propensity score) for each sample, so IPW (Inverse Probability Weighting) using the propensity score can optionally be enabled:

CausalLift(train_df, test_df, enable_ipw=True)
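
For intuition about what the weighting does, the following is a minimal sketch of textbook inverse probability weighting, assuming the common definition of 1/p for treated samples and 1/(1-p) for untreated ones. The function `ipw_weights` and its clipping parameter `eps` are hypothetical; CausalLift's internal implementation may differ.

```python
def ipw_weights(treatments, propensities, eps=1e-6):
    """Compute textbook IPW sample weights.

    treatments:   iterable of 0/1 flags (1 = treated)
    propensities: iterable of estimated P(treated) per sample
    eps:          clipping bound to avoid division by zero (an assumption)
    """
    weights = []
    for treated, p in zip(treatments, propensities):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        weights.append(1.0 / p if treated == 1 else 1.0 / (1.0 - p))
    return weights


# RCT-like data: everyone has the same propensity, so all weights are equal
# and the weighting changes nothing.
rct_weights = ipw_weights([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])

# Observational-like data: samples whose assignment was unlikely
# (treated despite low propensity, untreated despite high propensity)
# receive larger weights.
obs_weights = ipw_weights([1, 1, 0, 0], [0.8, 0.2, 0.8, 0.2])

print(rct_weights)
print(obs_weights)
```

The constant weights in the RCT-like case illustrate why propensity estimation adds nothing when treatment was assigned at random.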

Jami1141 commented 4 years ago

Thanks for your response. So when I train the model on A/B test data I do not need the propensity, but later, when I want to use the model for prediction, I have to set enable_ipw=True. Is that true?

Minyus commented 4 years ago

> Is that true?

No.

I added an explanation to the following sections of README.md:

https://github.com/Minyus/causallift#how-causallift-works
https://github.com/Minyus/causallift#how-to-run-inferrence-prediction-of-cate-for-new-data-with-treatment-and-outcome-unknown

The enable_ipw flag is used only during training.