apache / arrow-nanoarrow

Helpers for Arrow C Data & Arrow C Stream interfaces
https://arrow.apache.org/nanoarrow
Apache License 2.0

chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions #398

Closed paleolimbot closed 6 months ago

paleolimbot commented 7 months ago

There are probably a few ways to go about this, but moving the benchmarks to their own subdirectory and using FetchContent to build against various versions/source checkouts seemed the most intuitive to me. This also separates benchmark-related CMake from the rest of the CMake and makes it possible to benchmark locally against a few previous versions (via build presets). If we add more benchmarks in the future (or discover a flaw in an existing benchmark), we can also run them retrospectively against previous releases.

I've added a more verbose description of the setup to the benchmarks README, but the general idea is:

An example report follows:

# Benchmark Report

## Configurations

These benchmarks were run with the following configurations:

| preset_name | preset_description                               |
|:------------|:-------------------------------------------------|
| local       | Uses the nanoarrow C sources from this checkout. |
| v0.4.0      | Uses the nanoarrow C sources the 0.4.0 release.  |

## Summary

A quick and dirty summary of benchmark results between this checkout and the last released version.

| benchmark_label                                                             |   v0.4.0 |    local |  change | pct_change |
|:----------------------------------------------------------------------------|---------:|---------:|--------:|-----------:|
| [ArrayViewGetIntUnsafeInt16](#arrayviewgetintunsafeint16)                   | 635.33µs | 631.47µs |     1ns |      -0.6% |
| [ArrayViewGetIntUnsafeInt32](#arrayviewgetintunsafeint32)                   | 635.96µs | 636.71µs | 753.7ns |       0.1% |
| [ArrayViewGetIntUnsafeInt64](#arrayviewgetintunsafeint64)                   | 669.22µs |  680.5µs |  11.3µs |       1.7% |
| [ArrayViewGetIntUnsafeInt64CheckNull](#arrayviewgetintunsafeint64checknull) |   1.03ms |   1.21ms | 178.7µs |      17.4% |
| [ArrayViewGetIntUnsafeInt8](#arrayviewgetintunsafeint8)                     | 948.13µs | 946.34µs |     1ns |      -0.2% |
| [SchemaInitWideStruct](#schemainitwidestruct)                               |   1.04ms |   1.02ms |     1ns |      -2.1% |
| [SchemaViewInitWideStruct](#schemaviewinitwidestruct)                       | 106.08µs | 104.56µs |     1ns |      -1.4% |

## ArrowArrayView-related benchmarks

Benchmarks for consuming ArrowArrays using the ArrowArrayViewXXX() functions.

### ArrayViewGetIntUnsafeInt8

Use ArrowArrayViewGetIntUnsafe() to consume an int8 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L108-L110)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        746 |     946µs |    945µs |    1,058,678,610 |
| v0.4.0      |        745 |     948µs |    947µs |    1,056,345,018 |

### ArrayViewGetIntUnsafeInt16

Use ArrowArrayViewGetIntUnsafe() to consume an int16 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L113-L115)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1115 |     631µs |    630µs |    1,586,161,276 |
| v0.4.0      |       1110 |     635µs |    634µs |    1,576,482,853 |

### ArrayViewGetIntUnsafeInt32

Use ArrowArrayViewGetIntUnsafe() to consume an int32 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L118-L120)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1106 |     637µs |    636µs |    1,572,865,930 |
| v0.4.0      |       1116 |     636µs |    635µs |    1,574,396,587 |

### ArrayViewGetIntUnsafeInt64

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L123-L125)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1036 |     680µs |    680µs |    1,471,241,907 |
| v0.4.0      |       1039 |     669µs |    668µs |    1,496,471,266 |

### ArrayViewGetIntUnsafeInt64CheckNull

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array (checking for nulls)

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L128-L130)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        581 |    1.21ms |    1.2ms |      830,641,968 |
| v0.4.0      |        697 |    1.03ms |   1.02ms |      976,185,007 |

## Schema-related benchmarks

Benchmarks for producing and consuming ArrowSchema.

### SchemaInitWideStruct

Benchmark ArrowSchema creation for very wide tables.

Simulates part of the process of creating a very wide table with a simple column type (integer).

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L45-L56)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        684 |    1.02ms |   1.02ms |        9,788,166 |
| v0.4.0      |        686 |    1.04ms |   1.04ms |        9,606,888 |

### SchemaViewInitWideStruct

Benchmark ArrowSchema parsing for very wide tables.

Simulates part of the process of consuming a very wide table. Typically the ArrowSchemaViewInit() is done by ArrowArrayViewInit() but uses a similar pattern.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L78-L91)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       6753 |     105µs |    104µs |       95,812,784 |
| v0.4.0      |       6762 |     106µs |    106µs |       94,630,337 |

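
For anyone reading the summary table, the `change` and `pct_change` columns look consistent with `change = local - v0.4.0` and `pct_change = change / v0.4.0`. The snippet below is only a minimal sketch under that assumption; it is not the code that generated the report, and the `summarize` helper is a hypothetical name. The timings are copied from the summary table in microseconds.

```python
def summarize(baseline_us: float, local_us: float) -> tuple[float, float]:
    """Return (change, pct_change) of local relative to the baseline.

    Assumed definitions (inferred from the table, not from the report code):
      change     = local - baseline          (same unit as the inputs)
      pct_change = 100 * change / baseline   (percent of the baseline time)
    """
    change = local_us - baseline_us
    pct_change = 100.0 * change / baseline_us
    return change, pct_change


# (v0.4.0, local) real times in microseconds, copied from the summary table.
rows = {
    "ArrayViewGetIntUnsafeInt16": (635.33, 631.47),
    "ArrayViewGetIntUnsafeInt32": (635.96, 636.71),
    "ArrayViewGetIntUnsafeInt64": (669.22, 680.5),
    "SchemaViewInitWideStruct": (106.08, 104.56),
}

for label, (baseline_us, local_us) in rows.items():
    change, pct = summarize(baseline_us, local_us)
    # A positive value means local is slower than the v0.4.0 baseline.
    print(f"{label}: {change:+.2f}µs ({pct:+.1f}%)")
```

Under this assumption the percentages round to the values in the `pct_change` column for these rows. Where local is faster, the sketch yields a negative `change`, whereas the report's `change` column shows `1ns` for those rows.
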
codecov-commenter commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 88.74%. Comparing base (5756b76) to head (a0363c3). Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #398   +/-   ##
=======================================
  Coverage   88.74%   88.74%
=======================================
  Files          81       81
  Lines       14398    14398
=======================================
  Hits        12778    12778
  Misses       1620     1620
```
