Closed: onelson closed this issue 1 year ago
We might want to be careful about expanding use of `ProcessBenchmarkHelper`. I've generally had issues with it: many transformations contain several heavy components, only some of which matter for a given measurement. That makes them difficult to benchmark with the standard Go tooling in a way we can actually interpret. Because the results depend on the data (its shape and volume), it's hard to draw correct conclusions from them, which is the entire point of benchmarking.

Benchmarking smaller components can be more effective. There are fewer moving parts, and you don't have to work out whether a specific part matters, since you're measuring a single component rather than an entire moving system. We might want to focus on that rather than leaning on `ProcessBenchmarkHelper` more.
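As a sketch of that component-level approach, here is a minimal, self-contained Go example (the `rowMap` helper is hypothetical, not a real flux component) that measures one small piece of logic via `testing.Benchmark` instead of standing up a whole transformation pipeline:

```go
package main

import (
	"fmt"
	"testing"
)

// rowMap is a stand-in for a single "component" of a transformation:
// it applies fn to every element of a column. Hypothetical example,
// not an actual flux helper.
func rowMap(col []float64, fn func(float64) float64) []float64 {
	out := make([]float64, len(col))
	for i, v := range col {
		out[i] = fn(v)
	}
	return out
}

func main() {
	col := make([]float64, 1<<16)
	for i := range col {
		col[i] = float64(i)
	}
	// testing.Benchmark runs the standard benchmarking loop outside
	// of `go test`, so one component can be measured in isolation.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			rowMap(col, func(v float64) float64 { return v * 2 })
		}
	})
	fmt.Println("ns/op:", res.NsPerOp())
}
```

Because the benchmark body is a single component, the ns/op figure is attributable to that component alone, rather than to whichever part of a larger pipeline happened to dominate.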
This issue has had no recent activity and will be closed soon.
For the purpose of verifying the ongoing vectorized optimization efforts, some updates are needed in `executetest.ProcessBenchmarkHelper` and `executetest.FunctionExpression` to allow us to configure feature flags so they influence both the Go and Rust parts of flux (which now has feature flags of its own).

In addition to the feature flag concern, a memory allocator needs to be made available to the context that will ultimately be used in vectorized ops. Currently, it appears the benchmark helper creates an allocator, but this is not fed through.
My guess (unverified) is we need to attach the allocator to the context here: https://github.com/influxdata/flux/blob/da12f3561d44fb1dc57c6e9afca5754d9ddbd264/stdlib/universe/map_test.go#L1047
`executetest.FunctionExpression` will need access to the flagger info here: https://github.com/influxdata/flux/blob/da12f3561d44fb1dc57c6e9afca5754d9ddbd264/execute/executetest/compile.go#L44-L48

Ideally, we'd be able to add new benches to `stdlib/universe/map_test.go`, and in each case toggle on/off the flags required to hit the code path we're trying to measure.

In `compiler_test.TestVectorizedFns` we configure flaggers (optionally) per test case and ensure the allocator is available: https://github.com/influxdata/flux/blob/da12f3561d44fb1dc57c6e9afca5754d9ddbd264/compiler/vectorized_test.go#L322-L342

Something similar should be done for the benchmarking code.
DOD:

- New benches in `stdlib/universe/map_test.go` (row-based and vectorized benches should probably give different results!)