mrc-ide / individual

R Package for individual based epidemiological models
https://mrc-ide.github.io/individual

shadow variable benchmark #154

Open slwu89 opened 2 years ago

slwu89 commented 2 years ago

@giovannic I have an implementation of the shadow integer variable here. The way it works is pretty straightforward: it contains just 2 vectors, and at each time step an enum tracks which one is "active" and which is the "shadow". Queuing updates overwrites the shadow. The update makes the shadow the new active vector by swapping the enum, then copies it into the new shadow vector (the previously active one).
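A minimal sketch of the two-vector scheme described above, for readers who have not opened the linked implementation. The class name, method names, and the single-value update signature are hypothetical and are not taken from the package; only the mechanism (two vectors, an enum flip, a copy to refresh the shadow) follows the description.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical sketch: names are illustrative, not the package's API.
class ShadowIntegerVariable {
    enum class Buffer : std::size_t { A = 0, B = 1 };

    std::array<std::vector<int>, 2> buffers_;
    Buffer active_ = Buffer::A;  // which of the two vectors is currently "active"

    static std::size_t index(Buffer b) { return static_cast<std::size_t>(b); }
    static Buffer other(Buffer b) { return b == Buffer::A ? Buffer::B : Buffer::A; }

    std::vector<int>& shadow() { return buffers_[index(other(active_))]; }

public:
    // Both vectors start out as copies of the initial values.
    explicit ShadowIntegerVariable(const std::vector<int>& initial)
        : buffers_{{initial, initial}} {}

    // Read access always goes to the active vector.
    const std::vector<int>& values() const { return buffers_[index(active_)]; }

    // Queuing an update writes straight into the shadow vector.
    void queue_update(const std::vector<std::size_t>& indices, int value) {
        for (std::size_t i : indices) {
            shadow()[i] = value;
        }
    }

    // Applying updates: flip the enum so the old shadow becomes active,
    // then copy the new active vector into the new shadow.
    void update() {
        active_ = other(active_);
        shadow() = values();
    }
};
```

The copy in `update()` touches every element regardless of how many were queued, which would be consistent with the shadow variable losing when only a few elements of a large variable change.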

The result of one of the benchmarks is here (the pattern is similar for the rest). As you can see, for large variables when only a few elements are being updated, the current implementation is faster. When a high proportion of elements are being updated, however, the shadow variable is faster: see the lower-right panel, with limit (variable size) of 1e6 and 9e5 elements updated, where the shadow variable is quite a bit faster.

vv_bi.pdf

So it's a tradeoff = ) Of course the double variable's shadow implementation should look almost identical. It may be less of a tradeoff for the double variable, which is probably updated much more frequently, with more of the population being updated.

slwu89 commented 2 years ago

For the variable reset type update, the 2 implementations seem to be in a statistical tie. For variable fill, the shadow variable is slightly faster.

fill.pdf reset.pdf

slwu89 commented 2 years ago

assignment is faster than std::copy: https://github.com/slwu89/individual/commit/86ccb3415b65624ad2c32f7c74e430a88a975e7e

Focusing on the variable fill method, the one for which the shadow variable was slowest, profiling shows why. The existing method just stores a single int, then uses std::fill to update the values. The shadow variable first uses std::fill to fill the shadow values with that int, then uses assignment/std::copy when swapping the active and shadow vectors.
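The difference between the two fill paths can be sketched roughly as follows. The function names are hypothetical, and `std::swap` stands in for the enum flip (both are O(1)); this is an illustration of the one-pass versus two-pass pattern, not the package's code.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Existing approach (sketch): the queued int is stored, then applied
// with a single std::fill pass over the values at update time.
void fill_existing(std::vector<int>& values, int queued) {
    std::fill(values.begin(), values.end(), queued);
}

// Shadow approach (sketch): two full passes over the data.
void fill_shadow(std::vector<int>& active, std::vector<int>& shadow, int queued) {
    // Pass 1: fill the shadow vector with the queued value.
    std::fill(shadow.begin(), shadow.end(), queued);
    // Role swap; stands in for flipping the enum in the real implementation.
    std::swap(active, shadow);
    // Pass 2: refresh the new shadow from the new active vector.
    shadow = active;
}
```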

base.pdf shadow_assign.pdf

After changing from std::copy (fill) to the assignment operator (fill_assign), the benchmarks are not quite so bad.

fill_assign.pdf fill.pdf
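For reference, the two ways of refreshing the shadow compared above look roughly like this. The helper names are hypothetical; both leave `shadow` as an element-wise copy of `active`, so any speed difference comes from how the standard library implements each path.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: refresh the shadow with std::copy.
void refresh_with_copy(const std::vector<int>& active, std::vector<int>& shadow) {
    shadow.resize(active.size());  // make sure the destination is large enough
    std::copy(active.begin(), active.end(), shadow.begin());
}

// Hypothetical helper: refresh the shadow with vector copy assignment.
void refresh_with_assignment(const std::vector<int>& active, std::vector<int>& shadow) {
    shadow = active;
}
```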

giovannic commented 2 years ago

woooow, dramatic. I wonder if a fill constructor and assignment is even faster. Something like...

v = std::vector<T>(size, value);

So all the other operations are faster for the shadow variable implementation now?
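To make the suggestion concrete, here is a sketch of the fill constructor plus assignment next to two alternatives, assuming `T` is `int`. The function names are hypothetical, and whether any of these is actually faster has not been measured in this thread.

```cpp
#include <algorithm>
#include <vector>

// In-place fill: overwrites the existing storage.
void fill_in_place(std::vector<int>& v, int value) {
    std::fill(v.begin(), v.end(), value);
}

// Fill constructor + assignment: builds a temporary vector of the same
// size and move-assigns it, replacing the old storage.
void fill_by_construct_and_assign(std::vector<int>& v, int value) {
    v = std::vector<int>(v.size(), value);
}

// vector::assign(count, value): another standard way to get the same result.
void fill_by_assign_member(std::vector<int>& v, int value) {
    v.assign(v.size(), value);
}
```

All three leave the vector with the same contents; the constructor version allocates a fresh buffer on each call, which is one thing to watch for when benchmarking it.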

slwu89 commented 2 years ago

> So all the other operations are faster for the shadow variable implementation now?

Not always! It still follows the pattern I saw earlier. Example here, for variable assign (single value, using bitset as index): https://github.com/slwu89/individual/blob/feat/variable-enhancements/tests/performance/sv_bi_assign.pdf

For small model sizes (1e3 - 1e4 range), they are about the same, with the shadow variable sometimes a bit faster. At the 1e6 population size there's an interesting phenomenon: when the number of elements being updated is small, the existing implementation is quite a bit faster than the shadow one. However, by the time we are updating ~50% of elements, the shadow variable is much faster, and by the time 90% of elements are being updated, the shadow variable is decisively faster than the existing implementation.

I'll try out the fill ctor + assign after #156 is done!