[NEW] Implement higher order markov chain models

DP6 / Marketing-Attribution-Models

Python Class created to address problems regarding Digital Marketing Attribution.

https://dp6.github.io/Marketing-Attribution-Models

Apache License 2.0


[NEW] Implement higher order markov chain models #45

Open taksqth opened 3 years ago

taksqth commented 3 years ago

The Markov attribution currently implemented uses a first-order Markov chain model to compute the removal effect for each channel. This underlying model assumes that the probability of the next state depends only on the current state and on none of the previous ones. It is reasonable to say that this is not an appropriate assumption for the buyer journey, so it would be useful to allow for higher-order Markov models.

I'm not sure if this is the best way, but one way to accomplish this is to expand the state space so that each state includes the most recent states up to the order r we're trying to model, introducing a new null state (not to be confused with the state representing non-converting paths) to pad the start of the transition chain. The removal effect algorithm will need to be changed, since there will be multiple expanded states associated with the original channel we want to remove.
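
To make the idea concrete, here is a minimal sketch of the state-space expansion, not the code from the library. The function names (`expand_path`, `count_transitions`, `states_containing`) and the `"(null)"` label are hypothetical and chosen only for illustration. Each expanded state is a tuple of the last `order` channels, and the start of every path is padded with the null state.

```python
from collections import Counter

def expand_path(path, order, null_state="(null)"):
    """Map a list of channels to a list of compound states of length `order`.

    The path is left-padded with `null_state` so that the first real
    channel already has a full-length history.
    """
    padded = [null_state] * (order - 1) + list(path)
    return [tuple(padded[i:i + order]) for i in range(len(path))]

def count_transitions(paths, order, null_state="(null)"):
    """Count transitions between consecutive compound states over all paths."""
    counts = Counter()
    for path in paths:
        states = expand_path(path, order, null_state)
        counts.update(zip(states, states[1:]))
    return counts

def states_containing(states, channel):
    """All compound states that mention `channel`.

    When computing a removal effect, every one of these would have to be
    removed, not just a single state as in the first-order case.
    """
    return {s for s in states if channel in s}

# Example: two journeys, modelled with order 2.
paths = [
    ["start", "A", "conv"],
    ["start", "A", "B", "conv"],
]
counts = count_transitions(paths, order=2)
all_states = {s for pair in counts for s in pair}
to_remove = states_containing(all_states, "A")
```

With `order=1` every compound state is a 1-tuple, so the expansion reduces to the ordinary first-order chain.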

taksqth commented 3 years ago

There is research showing that order 4 worked better in some cases for attribution: https://www.researchgate.net/publication/322896486_Multichannel_Marketing_Attribution_Using_Markov_Chains

It should also be possible to implement a way to find the best fit automatically using any of the 3 methods here: https://medium.com/@ph_singer/order-estimation-for-markov-chain-models-6cde3ad2410b
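
As a rough illustration of what automatic order selection could look like, the sketch below compares orders using the Akaike information criterion (AIC). This is an assumption on my part: AIC is a standard model-selection criterion, but I am not claiming it matches the linked article's methods exactly. The function names, the `"(null)"` padding label and the parameter count `S^k * (S - 1)` (which ignores the padding state) are all illustrative choices.

```python
import math
from collections import Counter, defaultdict

def log_likelihood(paths, order, null_state="(null)"):
    """Maximum-likelihood log-likelihood of an order-`order` Markov chain.

    Every path is left-padded with `order` null states so that all orders
    are scored on the same number of observations.
    """
    counts = defaultdict(Counter)
    for path in paths:
        padded = [null_state] * order + list(path)
        for i in range(order, len(padded)):
            context = tuple(padded[i - order:i])
            counts[context][padded[i]] += 1
    ll = 0.0
    for next_counts in counts.values():
        total = sum(next_counts.values())
        ll += sum(n * math.log(n / total) for n in next_counts.values())
    return ll

def aic(paths, order):
    """AIC = 2 * (number of free parameters) - 2 * log-likelihood."""
    n_states = len({s for p in paths for s in p})
    n_params = (n_states ** order) * (n_states - 1)  # assumed parameter count
    return 2 * n_params - 2 * log_likelihood(paths, order)

def best_order(paths, max_order):
    """Return the order in 1..max_order with the lowest AIC."""
    return min(range(1, max_order + 1), key=lambda k: aic(paths, k))

# Example: a strictly alternating journey is fully explained by order 1,
# so the extra parameters of higher orders are only penalised.
paths = [["A", "B", "A", "B"]] * 5
chosen = best_order(paths, max_order=3)
```

Higher orders never decrease the likelihood on the training data, which is why a penalty term (or a held-out evaluation) is needed to avoid always picking the largest order.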

taksqth commented 3 years ago

With this PR I've implemented the parameter order in the mam.attribution_markov method. Passing order=1 is equivalent to the current behavior, while other values perform Markov attribution using higher-order Markov models. I've also taken the liberty of including 3 new parameters for defining the names of the auxiliary states (already present in the current version): start_state_name, conversion_state_name and null_state_name.

If this is merged, we can go further and implement a new parameter, tentatively called auto, which would allow the method to find the best value for order using the methods described in the aforementioned articles. This should be a new issue.