Open jbellis opened 1 year ago
FWIW: This relates at least partially to specialization and efficiency of the inner loops. I suspect there are ways to use some generic parameters to still get specialization, but some of the boilerplate exists because it gets inlined away and creates simpler inner loops. For instance, `aggregate_since` has conditionals that `aggregate` doesn't. In the case where the conditional isn't there, I believe we've seen the loop get auto-vectorized, but it won't if the conditional is there.
I think we absolutely should revisit to see if there are ways to reduce boilerplate and duplication. But we should also benchmark (and possibly look at generated assembly for some of the critical loops) and make sure we don't regress the potential for that to be vectorized, etc.
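To make the "generic parameters that still specialize" idea concrete, here is a minimal sketch, assuming Rust and a const generic flag. None of these names or signatures come from the actual codebase: `running_sum`, the `SINCE` parameter, and the `reset` slice are hypothetical, and a running sum is just a stand-in for a real aggregation. The point is only that one shared body can produce two separately compiled loops, one of which has no reset conditional left in it.

```rust
/// Hypothetical shared kernel. `SINCE` is a compile-time constant, so the
/// function is monomorphized once per value and the optimizer can drop the
/// reset branch from the `SINCE = false` copy.
#[inline]
fn running_sum<const SINCE: bool>(values: &[i64], reset: &[bool], out: &mut Vec<i64>) {
    let mut acc = 0i64;
    for (i, v) in values.iter().enumerate() {
        acc += *v;
        out.push(acc);
        // `reset[i]` is only evaluated when `SINCE` is true (short-circuit),
        // so the plain variant may pass an empty `reset` slice.
        if SINCE && reset[i] {
            acc = 0;
        }
    }
}

/// Plain aggregation: instantiates the kernel without reset handling.
pub fn aggregate(values: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(values.len());
    running_sum::<false>(values, &[], &mut out);
    out
}

/// "Since" aggregation: the accumulator is cleared after each row
/// where `reset` is true.
pub fn aggregate_since(values: &[i64], reset: &[bool]) -> Vec<i64> {
    assert_eq!(values.len(), reset.len());
    let mut out = Vec::with_capacity(values.len());
    running_sum::<true>(values, reset, &mut out);
    out
}
```

This only shows that the conditional need not survive into the plain variant. Whether either loop actually gets auto-vectorized would still have to be confirmed by benchmarking or by reading the generated assembly.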
it looks like there's a ton of boilerplate involved in creating an aggregation function. `aggregate` is substantially identical across first_string, last_string, top_string; `evaluate` is identical across even more functions, and `aggregate_since` has duplication both across different functions, and also wrt `aggregate` in the same function.

additionally, it's not obvious why there are similar implementations for X and two_stacks_X for many of the functions.
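One possible way to share the near-identical `aggregate` bodies, sketched here in Rust purely as an illustration: the `Update` trait, the `First` and `Last` marker types, and the input and output shapes are all made up for this example and are not the project's real API. The loop is written once and each function supplies only its update rule. Dispatch is static, so each instantiation is still compiled as its own specialized loop.

```rust
/// Hypothetical per-function rule: how one new value updates the accumulator.
trait Update {
    fn update(acc: &mut Option<String>, value: &str);
}

/// Keeps the first non-null value seen.
struct First;
/// Keeps the most recent non-null value seen.
struct Last;

impl Update for First {
    fn update(acc: &mut Option<String>, value: &str) {
        if acc.is_none() {
            *acc = Some(value.to_owned());
        }
    }
}

impl Update for Last {
    fn update(acc: &mut Option<String>, value: &str) {
        *acc = Some(value.to_owned());
    }
}

/// Shared loop, written once and monomorphized per `U`.
/// Null inputs leave the accumulator unchanged; every row emits the
/// current accumulator value.
fn aggregate<U: Update>(inputs: &[Option<&str>]) -> Vec<Option<String>> {
    let mut acc: Option<String> = None;
    let mut out = Vec::with_capacity(inputs.len());
    for input in inputs {
        if let Some(v) = input {
            U::update(&mut acc, v);
        }
        out.push(acc.clone());
    }
    out
}
```

Calling `aggregate::<First>(...)` versus `aggregate::<Last>(...)` then replaces two hand-written copies of the loop. A third rule would be one more small `impl`, assuming its semantics fit the same accumulator shape.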