DP6 / Marketing-Attribution-Models

Python Class created to address problems regarding Digital Marketing Attribution.
https://dp6.github.io/Marketing-Attribution-Models
Apache License 2.0
302 stars, 80 forks

Can the conversion rate for the Shapley value be artificially changed to, let's say, some ML predictions of conversion rate? #6

Closed: trangdo42 closed this issue 3 years ago

trangdo42 commented 4 years ago

Hi! Thank you for doing this, the package is great. However, I was wondering whether, for the Shapley attribution, the conversion rate can be changed from the empirical rate (the sample mean conversion rate per path) to predictions of the conversion probability, let's say from a Logistic Regression or an NN.

Thank you, Trang

rsennes commented 4 years ago

Hi @trangdo42. Thanks for the feedback and for using this package. Unfortunately, the way the tool is designed, it is not currently possible to change the conversion rate like this. It sounds like a good improvement, but before we prioritize this issue, can you share a few more details on WHY you believe this change is necessary and HOW you see this improvement working on your end? It's important for us to understand the real benefits and how this could be applied to real use cases.

Again, thank you very much for the feedback and for using it. Only with the community's contributions can we improve it and make it a better package for everybody!

trangdo42 commented 4 years ago

Hi, I'm actually writing my master's thesis on this. The basic Shapley attribution framework is quite successful at conversion prediction, but it can be improved with a predictive ML model. Shapley uses the empirical conversion rates in the sample, so these naive predictions (the sample mean conversion rate per path) can be inaccurate, especially when some paths have only a few observations. With a predictive model for the conversion probability of each path, you not only generalise the model for out-of-sample use but can also account for more user-level information, and thus for heterogeneity at the user level as well as the channel level (e.g. number of page views, number of clicks, number of sessions, medium or platform used). I believe this is more accurate because every customer's experience is different and does not depend only on the channels in their path.
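
To make the idea concrete, here is a minimal sketch (not the package's API, and not the thesis implementation) of Shapley attribution where the coalition value function is pluggable. All names (`shapley_values`, `empirical_value`, `toy_predict`), the toy journeys, and the logistic coefficients are hypothetical and only for illustration; `toy_predict` stands in for a trained Logistic Regression or NN, and user-level features are left out for brevity.

```python
from itertools import combinations
from math import exp, factorial


def shapley_values(channels, value):
    """Exact Shapley value per channel, given value(frozenset_of_channels) -> float."""
    n = len(channels)
    result = {}
    for ch in channels:
        others = [c for c in channels if c != ch]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain `ch`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {ch}) - value(s))
        result[ch] = total
    return result


def empirical_value(journeys):
    """Value function from sample mean conversion rates per observed channel set."""
    counts = {}
    for chans, converted in journeys:
        key = frozenset(chans)
        seen, conv = counts.get(key, (0, 0))
        counts[key] = (seen + 1, conv + int(converted))

    def value(s):
        # Unobserved channel sets fall back to 0.0: the data-scarcity problem.
        if not s or s not in counts:
            return 0.0
        seen, conv = counts[s]
        return conv / seen

    return value


# Hypothetical coefficients; a placeholder for a fitted conversion model.
INTERCEPT = -1.5
COEF = {"search": 1.2, "social": 0.4, "email": 0.9}


def toy_predict(s):
    """Stand-in predicted conversion probability for a channel set."""
    if not s:
        return 0.0  # an empty path cannot convert
    z = INTERCEPT + sum(COEF[c] for c in s)
    return 1.0 / (1.0 + exp(-z))


channels = ["search", "social", "email"]
journeys = [  # toy data: (channels in the path, converted?)
    ({"search"}, True), ({"search"}, False),
    ({"social"}, False), ({"social"}, False),
    ({"search", "social"}, True), ({"search", "social"}, True),
    ({"search", "social"}, False),
    ({"search", "social", "email"}, True),
]

empirical = shapley_values(channels, empirical_value(journeys))
predicted = shapley_values(channels, toy_predict)
print("empirical:", empirical)
print("predicted:", predicted)
```

In this toy data, channel sets such as search plus email are never observed, so the empirical value function has to fall back to zero for them, while the model-based one returns a probability for every channel set. Only the value function changes; the Shapley computation itself stays the same.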

rsennes commented 4 years ago

Hi @trangdo42, your argument makes sense. We didn't consider this approach while developing this tool, but it seems to be a significant improvement, and I will make sure to put it in our backlog. However, I have to say that we need to fit this improvement into our day-to-day activities, so I cannot give you a deadline for having it implemented on our end. By the way, we are very glad that you are considering using this tool in a master's thesis, and we would love to hear more about your research on this topic. We are a Brazilian company specialized in this area, and maybe we can support you somehow, either by sharing some of our experience or, in a more structured way, through one of our open innovation initiatives. What do you think?

trangdo42 commented 4 years ago

Hi, sorry for the late reply. I understand that it takes a while to implement such an adjustment, especially in practice: working with large online data streams is always a bit different from working offline with a local dataset. I've actually implemented the attribution on the NN and Gradient Boosted Trees predictions, but it would still take some more energy to do this part properly (I focused mainly on the predictive models for the conversion probability). There's definitely still room for improvement. We can schedule a call, if you're interested, and talk a bit more about it. Drop me an email, my address is thaotrangdo@gmail.com. I'm basically available anytime this week, or next week from Mon to Wed.

Looking forward, Trang

rsennes commented 4 years ago

Hi @trangdo42 I just sent you a meeting invitation. We can discuss further.

See you, Rafael