Question Code & Result - Githubissues

sjioisjcdsc commented 4 years ago

Hi, I have a question about the package's code. I tried to run CausalLift on the Hillstrom (RCT) dataset, but the results look a bit odd. I split the dataset into a training set and a test set and followed the notebook example, except that I used cl = CausalLift(df_train,df_test, verbose=3,enable_ipw=False). Then I went on to step 2 and got these results:

What could be the problem here? Why is the accuracy 1? The pred/obs CVR at the end (the result of step 3) is also 1. I compared the notebook example with my dataset and can't see the mistake. The dataset is balanced: about 50% were treated and 50% were not.

Thx in advance!

Minyus commented 4 years ago

Hi @sjioisjcdsc ,

Are you using the RCT data at https://blog.minethatdata.com/2008/03/minethatdata-e-mail-analytics-and-data.html ? If so, that data includes 3 outcome variables:

visit
conversion
spend

Could you make sure that 2 of the 3 outcome variables are excluded from the feature variables? If the problem persists, I'd appreciate it if you could share the steps to reproduce it, including your code.
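
A minimal sketch of the exclusion suggested above, assuming the three outcome column names listed in this reply (visit, conversion, spend). The helper keep_single_outcome is hypothetical and not part of CausalLift, and the toy rows are made up. The other two outcomes are recorded after the treatment and are closely tied to the chosen one, so a model that sees them as features can likely read the answer off them, which would be consistent with an accuracy of 1.

```python
import pandas as pd

# Outcome columns in the Hillstrom data, as listed in the reply above.
OUTCOME_COLUMNS = ["visit", "conversion", "spend"]


def keep_single_outcome(df, outcome):
    """Return a copy of df with every outcome column except `outcome` dropped.

    Hypothetical helper for illustration; not part of CausalLift.
    """
    if outcome not in OUTCOME_COLUMNS:
        raise ValueError(f"unknown outcome column: {outcome!r}")
    to_drop = [c for c in OUTCOME_COLUMNS if c != outcome and c in df.columns]
    return df.drop(columns=to_drop)


# Toy rows (made up) that reuse the column names of the real data.
toy = pd.DataFrame(
    {
        "recency": [1, 6, 3],
        "visit": [1, 0, 1],
        "conversion": [1, 0, 0],
        "spend": [29.99, 0.0, 0.0],
    }
)
cleaned = keep_single_outcome(toy, "visit")
print(list(cleaned.columns))  # ['recency', 'visit']
```

Because drop returns a new DataFrame, the original toy frame keeps all of its columns, so the same helper can be called again with a different outcome.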

sjioisjcdsc commented 4 years ago

Hi, yes, this is the right dataset. You are right, I forgot to exclude 2 of the outcome variables. I have now chosen visit as the outcome variable and tried again with the other two (conversion and spend) excluded from the feature set. The results look better now, but still a bit odd. My code looks as follows:

try:
    import causallift
except ImportError:
    # Install only when the package is missing (notebook shell command)
    !pip3 install causallift

import causallift

from causallift import CausalLift
import pandas as pd

pd.options.display.max_rows = 8 
seed = 0

Then I preprocessed the data:

!pip install scikit-learn
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split  # needed for the split further down
df = pd.read_csv('http://www.minethatdata.com/Kevin_Hillstrom_MineThatData_E-MailAnalytics_DataMiningChallenge_2008.03.20.csv')
womens_df = df[df.segment!='Mens E-Mail'].copy()
# Assign the result back: an inplace replace on a column accessor may not update the DataFrame in newer pandas versions
womens_df['segment'] = womens_df['segment'].replace({'Womens E-Mail':1, 'No E-Mail':0})
womens_df.drop(columns=['conversion', 'spend'], inplace=True)

hist_seg_map = {
    "1) $0 - $100":1,
    "2) $100 - $200":2,
    "3) $200 - $350":3,
    "4) $350 - $500":4,
    "5) $500 - $750":5,
    "6) $750 - $1,000":6,
    "7) $1,000 +":7
}

womens_df['history_segment'] = womens_df['history_segment'].replace(hist_seg_map)
zip_code_le, channel_le = LabelEncoder(), LabelEncoder()
womens_df.zip_code = zip_code_le.fit_transform(womens_df.zip_code)
womens_df.channel = channel_le.fit_transform(womens_df.channel)
womens_df.rename(columns={'segment':'Treatment', 'visit':'Outcome'}, inplace=True)

random_seed=42  # note: unused; the split below uses seed (0) defined earlier
df_train,df_test = train_test_split(womens_df, test_size=0.3, random_state=seed, stratify=womens_df['Treatment'])

After that I applied the three steps:

Step 1: cl = CausalLift(df_train,df_test, verbose=3,enable_ipw=False)

Step 2: train_df, test_df = cl.estimate_cate_by_2_models()

Step 3: estimated_effect_df = cl.estimate_recommendation_impact()
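
The method name in step 2 points to the two-model approach to uplift: fit one outcome model on treated rows, another on untreated rows, and take the difference of their predictions as the estimated treatment effect (CATE). The sketch below illustrates only that idea and is not CausalLift's implementation. A per-group mean stands in for a fitted classifier, the function names are hypothetical, and the rows are made up (newbie is borrowed from the dataset as an example feature).

```python
from collections import defaultdict


def fit_rate_by_feature(rows, feature):
    """Stand-in 'model': the mean Outcome observed for each value of `feature`."""
    totals = defaultdict(lambda: [0, 0])  # feature value -> [sum of outcomes, row count]
    for row in rows:
        totals[row[feature]][0] += row["Outcome"]
        totals[row[feature]][1] += 1
    return {value: s / n for value, (s, n) in totals.items()}


def estimate_cate_two_models(rows, feature):
    """Fit one model per treatment arm and subtract their predictions."""
    treated = [r for r in rows if r["Treatment"] == 1]
    untreated = [r for r in rows if r["Treatment"] == 0]
    model_treated = fit_rate_by_feature(treated, feature)
    model_untreated = fit_rate_by_feature(untreated, feature)
    # CATE(x) = P(Outcome=1 | x, treated) - P(Outcome=1 | x, untreated)
    return {
        value: model_treated[value] - model_untreated[value]
        for value in model_treated
        if value in model_untreated
    }


def make_rows(newbie, treatment, outcomes):
    """Build made-up rows that share a feature value and a treatment flag."""
    return [{"newbie": newbie, "Treatment": treatment, "Outcome": y} for y in outcomes]


rows = (
    make_rows(1, 1, [1, 1, 0, 1])    # newbie, treated: 3 of 4 visit
    + make_rows(1, 0, [1, 0, 0, 0])  # newbie, untreated: 1 of 4 visit
    + make_rows(0, 1, [1, 0])        # not newbie, treated: 1 of 2 visit
    + make_rows(0, 0, [1, 0])        # not newbie, untreated: 1 of 2 visit
)
cate = estimate_cate_two_models(rows, "newbie")
print(cate)  # {1: 0.5, 0: 0.0}
```

In this toy data the treatment helps newbies (estimated uplift 0.5) and does nothing for the others (0.0). If the two underlying models predict poorly, their difference is mostly noise, which is why model quality matters so much for this approach.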

The outcome is better, but why is the accuracy for the training set now smaller than 1?

The pred/obs CVR is also smaller than 1. Shouldn't it be greater than 1?

Thanks a lot!

Minyus commented 4 years ago

Accuracy < 1 is ok. The problem is that roc_auc for testing data set should be above 0.5.
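
To make the 0.5 threshold concrete: ROC AUC is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, with ties counted as half. A model that cannot separate the classes therefore scores about 0.5. The pure-Python function below is only an illustrative sketch with made-up labels and scores. In practice a library routine such as sklearn.metrics.roc_auc_score would be used, and how CausalLift computes its reported roc_auc is not shown in this thread.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
print(roc_auc(labels, [0.1, 0.4, 0.35, 0.8]))  # 0.75: mostly correct ranking
print(roc_auc(labels, [0.5, 0.5, 0.5, 0.5]))   # 0.5: no better than chance
print(roc_auc(labels, [0.1, 0.2, 0.8, 0.9]))   # 1.0: perfect ranking
```

A value below 0.5 on the test set means the model ranks test examples worse than chance, which usually points to overfitting or to features that carry little signal.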

You might want to try feature engineering for the Hillstrom dataset. This document might be helpful: https://pbiecek.github.io/xai_stories/story-uplift-marketing1.html

sjioisjcdsc commented 4 years ago

Thanks for the reply. I had a look at the document and dropped some variables, which leaves me with newbie, recency, history, womens and zip_code.

The roc_auc of the test data is now > 0.5, but the pred/obs CVR is at about 0.003584 (test set).

How do I interpret the pred/obs CVR value correctly? Does it mean that uplift modeling increases the conversion rate by a factor of 0.003 on the test portion of the simulated observational data, or is that a decrease? The value seems so small that I don't know how to interpret it.

Minyus commented 4 years ago

It means the uplift modeling will decrease the conversion rate. roc_auc value near 0.5 means the supervised model is not good. I'd suggest you improve the models by feature engineering. A higher roc_auc value (e.g. 0.6 to 0.8) is ideal.
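
As a rough guide to reading such a ratio, here is a small sketch. It assumes that pred/obs CVR is simply the predicted conversion rate divided by the observed one, as the column name suggests; CausalLift's exact computation is not shown in this thread. The function name and the rates are made up for illustration.

```python
def pred_obs_summary(predicted_cvr, observed_cvr):
    """Return (ratio, relative change) of predicted vs. observed conversion rate."""
    if observed_cvr <= 0:
        raise ValueError("observed conversion rate must be positive")
    ratio = predicted_cvr / observed_cvr
    return ratio, ratio - 1.0


# Made-up rates: the observed rate is 10% in both cases.
for predicted in (0.12, 0.08):
    ratio, change = pred_obs_summary(predicted, 0.10)
    direction = "increase" if ratio > 1 else "decrease"
    print(f"pred/obs = {ratio:.2f} -> {change:+.0%} ({direction})")
```

Under that assumption, a ratio above 1 means the predicted rate exceeds the observed one, and a ratio below 1 means it falls short, which matches the reading given above.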

sjioisjcdsc commented 4 years ago

Thanks for your suggestion. Now I know where the problem lies. I'll check out different combinations based on the documentation!

Minyus / causallift

Question Code & Result #16