facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.
https://facebookexperimental.github.io/Robyn/
MIT License

Refresh output is not matching the documentation #798

Open CJ2407 opened 1 year ago

CJ2407 commented 1 year ago

Hi @laresbernardo @gufengzhou - I have some questions about the Refresh functionality -

  1. My refresh output shows all variables compared between the initial and refresh models, instead of grouping the baseline and promo variables together. Because of this, the chart is unreadable for most media channels in terms of % decomp. [screenshot]

  2. When I was iterating non-stop to find the best initial model, I didn't realize that the JSON file was not being created. So, once I found a model that our leadership liked, I had to re-run the model, fixing certain hyperparameters, to get the desired model back. I want to know if this is the reason why most of my hyperparameters are fixed when the refresh is running (see the comment highlighted in yellow in the above screenshot). Generally, when we run a refresh, does the refresh model use the exact same hyperparameter values for each channel as the initial model? If so, how does the model adjust delay or decay rates based on changing strategies or behaviors of channels?

Originally posted by @CJ2407 in https://github.com/facebookexperimental/Robyn/issues/507#issuecomment-1661308031

CJ2407 commented 1 year ago

A few more things I need explained to be able to use the outputs properly -

  1. The report_alldecomp_matrix and pareto_alldecomp_matrix CSVs have different totals for the refresh solnID. Why? I see that all 396 rows of my refresh data are duplicated in the report_alldecomp_matrix CSV under the same refresh solnID.

  2. The % decomp from the report_aggregated CSV does not match the report_decomposition plot for the initial model. Why? It looks like the report_decomposition plot's percentages match pareto_aggregated from the initial model build.

  3. The initial model's pareto_aggregated CSV has a different % attribution than the initial model attribution generated in the report_aggregated CSV of the refresh model. Why? Also, what's the time period for which the 3_60_31 results would have been generated in a refresh run? The refresh model 1_15_10 results are for rolling 1185 days starting 3/3/2020. See below - [screenshot]

  4. I am seeing some very big swings between the initial model and the refresh model (see the screenshot above). When most of my hyperparameters are fixed, why would this be? How do I explain or validate them?

gufengzhou commented 1 year ago

Sorry for the late reply. The team is short on resources at the moment, and refresh is quite a complicated feature to fix. Let me address the issues here:

  1. You can export your own plots by looking at the refresh object RobynRefresh$refresh$plots$pBarRF (a small export sketch follows this list). We've also updated the package so that the background isn't transparent. Please update and try again.
  2. Sorry to hear about the missing JSON. A refresh model will iterate its hyperparameters based on the previous model, which is the whole trick of refreshing: keeping the balance between consistency and sensitivity in the decomposition. For example, assume the selected initial model covers modelling weeks 1-100, and channel A has theta = 0.2 within an original range of c(0.1, 0.4). If you then build a refresh after 4 weeks, the refresh model will be built on weeks 5-104 (refresh_steps = 4), and theta for channel A will get a new range centering around 0.2 +/- (0.4 - 0.1) * (4 / 100) / 2, which results in a range of c(0.194, 0.206) (see the worked example after this list). As you can see, we allow only narrow ranges for each hyperparameter in the refresh to ensure consistency between refreshes. BUT when you refresh over larger periods, the refresh ranges become larger too, leading to more diverse results. It's a fine line between refresh and rebuild. Usually we encourage model refresh when the steps are small. For larger periods and/or major changes like new media channels, new variables etc., we recommend rebuilding the model.
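For point 1, here is a minimal sketch of saving that plot yourself. It assumes `RobynRefresh` is the object returned by `robyn_refresh()` and that `pBarRF` is a ggplot object, as referenced above; the file name is just an example.

```r
# Minimal sketch: export the refresh comparison plot manually (file name is illustrative).
library(ggplot2)

p_refresh <- RobynRefresh$refresh$plots$pBarRF
ggsave("refresh_decomp_comparison.png", plot = p_refresh,
       width = 12, height = 8, dpi = 300,
       bg = "white")  # non-transparent background; bg requires a recent ggplot2
```

And to make the range narrowing in point 2 concrete, a small worked example of the same arithmetic (the variable names are only illustrative, not Robyn internals):

```r
# Worked example of the refresh hyperparameter range narrowing described above.
initial_theta  <- 0.2            # theta of channel A in the selected initial model
original_range <- c(0.1, 0.4)    # original search range for theta
window_length  <- 100            # weeks covered by the initial model
refresh_steps  <- 4              # refresh adds 4 weeks

half_width <- diff(original_range) * (refresh_steps / window_length) / 2  # 0.006
new_range  <- initial_theta + c(-1, 1) * half_width
new_range  # c(0.194, 0.206): a narrow band around the initial theta
```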
gufengzhou commented 1 year ago

To your questions 3-6, which are all about inconsistency across the initial & refresh models: I just did a quick 4-step refresh test with the sample data. In the first screenshot, the refresh pareto_aggregated.csv on the right side (the recreated model) has slightly different coefficients than the original/initial CSV.

[screenshot 1: initial vs. recreated pareto_aggregated.csv]

I've looked into it, and this is actually due to rounding when exporting the JSON files. We can definitely fix that to allow more precise model recreation. But we're not off by too much here, as you can see below. [screenshot 2]

In your case, the difference is of course very large, and I assume it's due to missing hyperparameter values in your model recreation process: besides the "usual" media hyperparameters (theta/alpha/gamma/shape/scale), there's also lambda (and train_size if you use ts_validation). In the 3rd screenshot below, you can see my tested model JSON 1_134_6, which has the session hyper_values, incl. lambda and train_size (you can also see the 4-digit rounding here). You can find the selected lambda value in the pareto_hyperparameters.csv too. Both lambda and train_size have a strong impact on the beta coefficient estimation, so it's not very surprising that your results came out differently. This means that if you also fix lambda/train_size, you should get close to identical recreated models. But of course, the best way is to always remember to export the JSON :) [screenshot 3]
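Related to that last point, here is a minimal sketch of the export-then-recreate flow, following the pattern in Robyn's demo. The solID "1_134_6", the JSON path, and the data objects are placeholders for your own.

```r
# Minimal sketch of exporting a selected model to JSON and recreating it later.
# Placeholders: "1_134_6", the JSON path, and the dt_* objects are examples only.
library(Robyn)

# 1) Export the selected model; the JSON's hyper_values includes the media
#    hyperparameters plus lambda and train_size mentioned above.
robyn_write(InputCollect, OutputCollect, select_model = "1_134_6")

# 2) Recreate the model from that JSON instead of re-running with manually
#    fixed hyperparameters.
RobynRecreated <- robyn_recreate(
  json_file   = "./RobynModel-1_134_6.json",  # assumed export file name
  dt_input    = dt_input,                     # your raw input data
  dt_holidays = dt_prophet_holidays,          # holidays table used originally
  quiet       = FALSE
)
```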