facebookincubator / GeoLift

GeoLift is an end-to-end geo-experimental methodology based on Synthetic Control Methods used to measure the true incremental effect (Lift) of ad campaigns.
https://facebookincubator.github.io/GeoLift/
MIT License

GeoLiftMarketSelection and GeoLiftPower give different results for power - market_id = 1 #147

Closed Snowcatcat closed 1 year ago

Snowcatcat commented 1 year ago

Hi @ArturoEsquerra and other contributors,

My question has not been answered, and I don't know how to reopen the ticket. I'm still confused about why the two functions, GeoLiftMarketSelection and GeoLiftPower, give different results. I am hoping you can take another look at my question, as I would like to make sure I'm not doing anything fundamentally wrong.

Following the sample code for GeoLiftMarketSelection in the walkthrough, we get "chicago, cincinnati, houston, portland" as one of the top two test market selections. The effect size for this selection is 0.05. (Please note that this is market_id = 1, not market_id = 2.)

The code and results are as follows:

image

image

Nothing shown above differs from the walkthrough. When market_id = 1, the EffectSize is 0.05, the AvgScaledL2Imbalance is 0.1971864, and the Average_MDE is 0.04829913.

We can also verify the effect size by plotting the results:

image

Setting print_summary = TRUE, we also have:

image

Now I will use the GeoLiftPower function to run the same analysis for market_id = 1:

image

Looking at the output table, we can see that the effect size is now 0.1, and the L2 Imbalance is 0.259691.

image

image

Can you please help me understand why the two functions give such different answers?

Snowcatcat commented 1 year ago

I also tried GeoLiftPower for other test market selections (market_id = 2, 3, 4, etc.), and the effect size, investment, and L2 imbalance numbers are all different from what we obtain through GeoLiftMarketSelection() for the same set of inputs.

ArturoEsquerra commented 1 year ago

Hi again @Snowcatcat!

The issue was closed because we thought that our reply had answered your question. If you still have questions related to the same issue, please reply on that same issue so that we can re-open it. We kindly ask you to avoid opening new GitHub issues for the same question.

That being said, we aren't able to replicate your issue. Make sure that you have the latest version of GeoLift installed on your system and that you remove any objects from the Environment that could be causing problems.

Replicating your example we get:

MarketSelections <- GeoLiftMarketSelection(data = GeoTestData_PreTest,
                                           treatment_periods = c(10,15),
                                           N = c(2,3,4,5),
                                           Y_id = "Y",
                                           location_id = "location",
                                           time_id = "time",
                                           effect_size = seq(0, 0.5, 0.05), 
                                           lookback_window = 1,
                                           include_markets = c("chicago"),
                                           exclude_markets = c("honolulu"),
                                           cpic = 7.5,
                                           budget = 100000,
                                           alpha = 0.1,
                                           Correlations = TRUE,
                                           fixed_effects = TRUE,
                                           side_of_test = "two_sided"
                                           )

image

Now, running the same market selection with GeoLiftPower() we get:

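library(dplyr)  # needed for %>% below

# Pull the row for the chosen market from the market selection output
market_id <- 1
market_row <- MarketSelections$BestMarkets %>% dplyr::filter(ID == market_id)
# Split on ", " (comma plus space) so that no location name keeps a leading space
treatment_locations <- stringr::str_split(market_row$location, ", ")[[1]]
treatment_duration <- market_row$duration
lookback_window <- 1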

power_data <- GeoLiftPower(
                            data = GeoTestData_PreTest,
                            locations = treatment_locations,
                            effect_size = seq(0,0.5,0.05),
                            lookback_window = lookback_window,
                            treatment_periods = treatment_duration,
                            cpic = 7.5,
                            side_of_test = "two_sided"
)
power_data

image

As you can see, they match perfectly. Both of these functions share the same engine, so the results should be consistent every time.

Snowcatcat commented 1 year ago

@ArturoEsquerra Thanks a lot for the help! I had left out the space after the comma in the separator string on this line: treatment_locations <- stringr::str_split(market_row$location, ", ")[[1]]
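
For anyone who hits the same mismatch, here is a minimal sketch of what the missing space does to the location names. It uses base R's strsplit() rather than stringr::str_split() so that it runs without extra packages. The location string is the market_id = 1 selection quoted earlier in this thread. The known_locations vector is a hypothetical stand-in for the location column of the data. How GeoLiftPower treats names that do not match the data is not confirmed here, but a treatment group built from mismatched names would plausibly explain the different numbers.

```r
# Location string for market_id = 1, in the comma-plus-space format
# quoted earlier in this thread.
location <- "chicago, cincinnati, houston, portland"

# Splitting on "," alone leaves a leading space on every name after the first.
split_without_space <- strsplit(location, ",", fixed = TRUE)[[1]]
print(split_without_space)
# "chicago" " cincinnati" " houston" " portland"

# Splitting on ", " (comma plus space) yields clean names.
split_with_space <- strsplit(location, ", ", fixed = TRUE)[[1]]
print(split_with_space)
# "chicago" "cincinnati" "houston" "portland"

# Hypothetical stand-in for the location names present in the data.
known_locations <- c("atlanta", "chicago", "cincinnati", "houston", "portland")

# Only "chicago" matches when the space is missing from the separator.
print(split_without_space %in% known_locations)
# TRUE FALSE FALSE FALSE

print(split_with_space %in% known_locations)
# TRUE TRUE TRUE TRUE

# A defensive alternative: trim whitespace after splitting, so the result
# does not depend on the exact separator used.
robust_split <- trimws(strsplit(location, ",", fixed = TRUE)[[1]])
print(identical(robust_split, split_with_space))
# TRUE
```

Checking all(treatment_locations %in% unique(GeoTestData_PreTest$location)) before calling GeoLiftPower() would catch this kind of mismatch early.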