mozilla / mozanalysis

A library for Mozilla experiments analysis
https://mozilla.github.io/mozanalysis/
Mozilla Public License 2.0

Improve the performance of bayesian bootstrap by 10x #177

Closed. jrmuizel closed this issue 1 year ago

jrmuizel commented 1 year ago

The vast majority of the current runtime is spent in np.random.RandomState(unique_seed).

Previously, when the generation of the sample weights was parallelized, it made sense to reseed for each sampling so that the results were deterministic. However, that parallelization was removed in e33914d7fa42bfa1f305d14ee7b96c3a39643abb.

With that gone, we can drop the reseeding entirely, which takes 16 or so 100-sample calls to compare_branches from about 50 seconds down to 5.
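For illustration, here is a minimal, self-contained sketch of the pattern in question (not mozanalysis's actual code; the replicate count, weight count, and seeds are made up): constructing a fresh np.random.RandomState for every bootstrap replicate versus drawing all the Dirichlet weights from a single reusable generator.

import time
import numpy as np

n_replicates = 10_000
n_values = 100  # number of distinct data values being reweighted (illustrative)

start = time.perf_counter()
for i in range(n_replicates):
    rs = np.random.RandomState(i)             # per-replicate reseed (old, slow path)
    weights = rs.dirichlet(np.ones(n_values))
print(f"reseed per replicate: {time.perf_counter() - start:.2f} s")

start = time.perf_counter()
rs = np.random.RandomState(42)                # one generator, reused (proposed change)
for _ in range(n_replicates):
    weights = rs.dirichlet(np.ones(n_values))
print(f"single generator:     {time.perf_counter() - start:.2f} s")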

scholtzan commented 1 year ago

cc @danielkberry: you are probably most familiar with this

danielkberry commented 1 year ago

Hi there, thanks for making this PR. I tested the change under conditions like those we might encounter in normal use of this code. The test simulated an experiment and measured the timing of compare_branches under both the old code and the change proposed here (as described in the first comment).

As the notebook shows, the timings of the two versions are not statistically different from each other, so this change does not appear to provide any speedup.

jrmuizel commented 1 year ago

Your notebook doesn't actually use the new code. If I fix that with:

import mozanalysis.bayesian_stats.bayesian_bootstrap

mozanalysis.bayesian_stats.bayesian_bootstrap.get_bootstrap_samples = get_bootstrap_samples
mozanalysis.bayesian_stats.bayesian_bootstrap._resample_and_agg_once = _resample_and_agg_once
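As a quick sanity check (a sketch, not part of the PR; it assumes get_bootstrap_samples and _resample_and_agg_once are the notebook-defined replacements), you can confirm the patched attributes are in effect before re-running the timings, since compare_branches resolves these names through the module namespace at call time:

import mozanalysis.bayesian_stats.bayesian_bootstrap as bb

# Both module-level names should now point at the notebook-defined versions,
# so compare_branches exercises the no-reseed implementations.
assert bb.get_bootstrap_samples is get_bootstrap_samples
assert bb._resample_and_agg_once is _resample_and_agg_once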

I get:

Before:

%%timeit
compare_branches(data, 'metric_a')
11.9 s ± 86 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
compare_branches(data, 'metric_b')
5.78 s ± 2.11 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

%%timeit
compare_branches(data, 'metric_a')
6.53 s ± 599 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
compare_branches(data, 'metric_b')
358 ms ± 84.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Here's an updated notebook: https://colab.research.google.com/drive/1p5Zl6vnmwVG8daoF4LTrNHF844zqFcCG?usp=sharing

danielkberry commented 1 year ago

Thanks for clearing that up. I did some comparisons of the new method to make sure nothing was obviously different from the previous one (as an end-to-end test), and it checks out. Graphs are in this notebook: https://colab.research.google.com/drive/1eKfa_nGBzKduzN_5fDSmjbVt0b3TcL-q