facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.
https://facebookexperimental.github.io/Robyn/
MIT License

How does parallelization work in Robyn? #1010

Open AdimDrewnik opened 5 days ago

AdimDrewnik commented 5 days ago

For example, assume 10 trials, 1000 iterations, and 2 cores. Will 5 trials of 1000 iterations each be run on each core? I am not sure after analyzing the source code. It seems that the iterations within a trial are split among the cores, but this makes little sense to me, as shorter iteration runs on each core would give less chance for convergence. I am on a Windows machine. What is interesting is that I consistently get better models with higher R2 when selecting a higher core count instead of cores = 1, despite the fact that on Windows only one core is actually used. This is mind-boggling. Does Nevergrad behave differently with a different number of cores even when the number of trials and iterations stays the same?
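
For concreteness, here is a rough Python sketch of the two readings of the split I mean. The numbers and loop structure are only my own illustration, not Robyn's actual R code:

```python
# Illustration only -- not Robyn's actual R implementation.
import math

trials = 10
iterations = 1000
cores = 2

# Reading A (what I expected): trials are distributed across cores,
# and each trial keeps its full 1000-iteration budget.
trials_per_core = trials / cores                     # 5 trials per core, 1000 iterations each

# Reading B (what the source code seems to do): within ONE trial the
# iteration budget is consumed in batches of `cores` candidates, and the
# optimizer is updated once per batch.
batches_per_trial = math.ceil(iterations / cores)    # 500 optimizer updates per trial
evals_per_trial = batches_per_trial * cores          # still ~1000 model evaluations per trial

print(trials_per_core, batches_per_trial, evals_per_trial)
```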

Edit: After some additional digging in the source code, it seems that Robyn uses Nevergrad's num_workers argument for parallelization and sets num_workers = 1 when cores = 1. This seems wrong, because num_workers is a parameter of the optimization routine and can be larger than one even on single-core machines. The Nevergrad documentation (https://facebookresearch.github.io/nevergrad/optimization.html) discusses num_workers as an optimization parameter: "num_workers=5 with batch_mode=True will ask the optimizer for 5 points to evaluate, run the evaluations, then update the optimizer with the 5 function outputs, and repeat until the budget is all spent." So a multi-core Robyn run, even on a one-core machine effectively acting in sequential mode, can produce different (and in my case better) results than cores = 1, because multiple points are evaluated before each optimizer update.
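
To show what that documentation quote means in practice, here is a minimal Nevergrad ask/tell loop in Python with a toy loss. The optimizer choice (TwoPointsDE, which Robyn uses by default) and the loss function are assumptions for the example; the point is that num_workers candidates are requested and evaluated (sequentially or not) before the optimizer is updated:

```python
# Minimal sketch of a batched ask/tell loop in Nevergrad.
# Assumptions: TwoPointsDE optimizer and a toy quadratic loss; this is not Robyn's code.
import nevergrad as ng

def loss(x):
    return sum((xi - 0.5) ** 2 for xi in x)

budget = 1000      # total evaluations, like Robyn's `iterations`
num_workers = 2    # like Robyn's `cores`

param = ng.p.Array(shape=(3,)).set_bounds(0.0, 1.0)
optimizer = ng.optimizers.TwoPointsDE(parametrization=param, budget=budget, num_workers=num_workers)

# Ask for `num_workers` candidates, evaluate them (here sequentially, on one core),
# then update the optimizer with all results at once, and repeat.
for _ in range(budget // num_workers):
    candidates = [optimizer.ask() for _ in range(num_workers)]
    results = [loss(c.value) for c in candidates]
    for c, r in zip(candidates, results):
        optimizer.tell(c, r)

print(optimizer.provide_recommendation().value)
```

With num_workers = 1 the loop degenerates into ask one, tell one, so the optimizer's exploration pattern differs even though the total budget is identical, which would explain getting different results from cores = 1 versus cores > 1 on the same machine.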