Closed djolear closed 1 year ago
@djolear Hi, would you mind sharing the code that you used to simulate the problem you had, but using the Walkthrough data? While your explanation is clear, with code the devil is in the details.
@djolear I'm afraid I can't reproduce the error that you had with your dataset. As a consequence, I will close this issue, but please re-open it if you manage to create a dataset we can use to test out the error.
Regarding your questions:
Aggregate DMAs to obtain a higher volume of conversions per day:
What does DMA stand for? From the text I assume it means 'locations', and if so I don't think aggregating locations will help. Grouping locations is equivalent to taking the weighted average of the coefficients obtained when training with the locations kept separate. The only situation where I can imagine that it would be useful is
Aggregating dates:
Aggregating dates will again only help in the same two situations explained above. Same answer as above.
The EffectSize and the Average_MDE diverge substantially in the GeoLiftMarketSelection:
Yes, that shouldn't happen. If it does, the algorithm is not performing well on your data, which likely means that some of the assumptions required for a correct estimate of your lift aren't satisfied.
When the abs_lift_in_zero is significantly above 0:
If this happens, then again your data is not satisfying at least one of the assumptions required by Augmented Synthetic Control. It means that even when we force a lift of 0, we still detect a lift that appears to be above 0, which can lead to a false positive where we report an impact of the treatment even when there is none.
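To make that last point concrete, here is a small arithmetic sketch. It is not GeoLift's actual computation: it only assumes that the reported percent lift is measured against an estimated counterfactual, and the helper name `percent_lift` and all numbers are hypothetical. If the counterfactual underestimates the true baseline, a lift shows up even with no treatment, and any simulated lift is inflated on top of it.

```python
def percent_lift(observed_total, counterfactual_total):
    """Lift of the observed outcome relative to the estimated counterfactual."""
    return (observed_total - counterfactual_total) / counterfactual_total


# Hypothetical numbers: 15 test days at 100 conversions per day.
true_baseline = 15 * 100.0
# Assumed counterfactual estimate that is biased low (92 per day instead of 100).
biased_counterfactual = 15 * 92.0

# No treatment effect at all, yet the estimated lift is clearly above zero.
lift_at_zero = percent_lift(true_baseline, biased_counterfactual)

# A simulated 5% lift on the true baseline is reported as a much larger lift.
lift_at_five = percent_lift(true_baseline * 1.05, biased_counterfactual)

print(f"lift with 0% simulated effect: {lift_at_zero:.1%}")
print(f"lift with 5% simulated effect: {lift_at_five:.1%}")
```

With these made-up numbers the two lifts come out at roughly 8.7% and 14.1%. In this simplified setting, (1 + reported lift) = (1 + simulated lift) × (1 + lift at zero), so a sizeable lift at zero is enough on its own to turn a simulated 5% into a much larger reported figure.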
Hi folks,
I'm wondering if you might be able to help me out with a puzzling issue that I'm running into using the GeoLift package.
I'm currently designing a test using this package. Based on the results of GeoLiftMarketSelection, I'm creating a simulated dataset, where I take the effect size provided in the function output and apply it to the test markets for the duration provided by the function output. For example, if the first market selection is for markets 1 & 2, with an effect size of 0.05 for a 15 day test, I apply a lift of 0.05 to my test markets data to simulate this lift.
When I run the GeoLift function on this test data, the reported percent lift for these tests is often much higher than the lift that I simulated. For example, when I simulate a lift of 5%, I might get back a lift of 14.8%. I'm wondering if you might have any ideas as to why this is happening?
I unfortunately can't share my data, but I did notice that when I apply this same procedure to the datasets provided in the GeoLift walkthrough, I don't run into the same problem. In other words, the lift that I simulate is more or less the one that is returned by the GeoLift function.
Here are a few things that I've noticed or that I want to note about the dataset that I'm using:
In the output of the GeoLiftMarketSelection function, the Average_MDE is often higher than the EffectSize. For example, the EffectSize might be 0.04 but the Average_MDE is close to 0.08.
I'm wondering if you have any ideas about why I might be obtaining these results and whether there is anything you recommend doing to obtain more robust results?
Here are some things I've thought of and I'm wondering if you think they might be valid approaches (but would be curious to know if you have other ideas):
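Related to the above, I'm wondering how much cause for concern there should be when the EffectSize and the Average_MDE diverge substantially in the GeoLiftMarketSelection, or when the abs_lift_in_zero is significantly above 0.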
Thanks for any help here!
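For anyone trying to reproduce this, the simulation step described in this post could look roughly like the sketch below. It is not the code that was run on the original data: the row layout (`location`, `date`, `Y`), the function names `apply_simulated_lift` and `window_total`, and the choice of a multiplicative lift over consecutive days are all assumptions made for illustration. It uses plain Python rather than R only to stay self-contained.

```python
from datetime import date, timedelta


def apply_simulated_lift(rows, test_markets, start, duration_days, effect_size):
    """Return new rows with Y scaled by (1 + effect_size) for the test
    markets on the duration_days days beginning at start."""
    end = start + timedelta(days=duration_days)  # exclusive upper bound
    lifted = []
    for row in rows:
        in_test = row["location"] in test_markets and start <= row["date"] < end
        new_y = row["Y"] * (1 + effect_size) if in_test else row["Y"]
        lifted.append({**row, "Y": new_y})
    return lifted


def window_total(data, markets, start, days):
    """Sum Y over the given markets inside the test window."""
    end = start + timedelta(days=days)
    return sum(
        r["Y"] for r in data
        if r["location"] in markets and start <= r["date"] < end
    )


# Hypothetical toy panel: 3 markets, 30 days, flat 100 conversions per day.
first_day = date(2023, 1, 1)
rows = [
    {"location": market, "date": first_day + timedelta(days=d), "Y": 100.0}
    for market in ("market_1", "market_2", "market_3")
    for d in range(30)
]

# Mirror the example from the post: markets 1 & 2, effect size 0.05, 15 days.
test_markets = {"market_1", "market_2"}
test_start = first_day + timedelta(days=15)
simulated = apply_simulated_lift(rows, test_markets, test_start, 15, 0.05)

before = window_total(rows, test_markets, test_start, 15)
after = window_total(simulated, test_markets, test_start, 15)
print(f"lift actually injected: {after / before - 1:.1%}")
```

The injected lift here is measured against the observed values in the test window. Running something along these lines against the Walkthrough data would make it easier to check whether the injected lift and the lift reported by GeoLift are being measured against the same baseline.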