chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions

I imagine there are a few ways to go about this, but I found that moving the benchmarks to their own subdirectory and using FetchContent to build against various versions/source checkouts was an intuitive way to do it. This also cleanly separates benchmark-related CMake from non-benchmark-related CMake and makes it easy to benchmark locally against a few previous versions (via build presets). If we add more benchmarks in the future (or discover a flaw in an existing benchmark), it also lets us retrospectively run them against previous releases.

I've added a more verbose description of the setup to the benchmarks README, but the general idea is:

- Benchmarks are documented using Doxygen, which is really good at parsing documentation. Reading the XML is a bit of a pain but is better than undocumented or difficult-to-locate benchmarks and better than parsing source files yourself.
- Configurations are CMake build presets, and CMake handles pulling a previous or local nanoarrow using FetchContent. This means that the only action needed on release to update the report is to add a configure preset.
- The provided benchmark-run-all.sh effectively reuses build directories for minimal rebuilding during benchmark development.
- The report is a Quarto document that renders to markdown. It is not the flashiest of reports but gets the job done. It could be replaced by something like conbench in the future.
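
To make the FetchContent point concrete, here is a minimal sketch of how a benchmark-only CMake project might switch between a released nanoarrow and the local checkout. It is not the code in this PR: the project name, the `BENCH_NANOARROW_URL` cache variable, and the relative path to the repository root are all assumptions made for illustration.

```cmake
cmake_minimum_required(VERSION 3.14)  # FetchContent_MakeAvailable() needs 3.14+
project(nanoarrow_benchmarks_sketch LANGUAGES C CXX)

include(FetchContent)

# Hypothetical cache variable (not from the PR): when empty, build against
# the local checkout; otherwise download the archive at this URL.
set(BENCH_NANOARROW_URL "" CACHE STRING
    "URL of a nanoarrow source archive to benchmark against")

if(BENCH_NANOARROW_URL)
  # Build against a previous release fetched from the given archive URL.
  FetchContent_Declare(nanoarrow URL "${BENCH_NANOARROW_URL}")
else()
  # Build against the sources in this checkout. Assumes this file lives two
  # directories below the repository root (e.g. dev/benchmarks/).
  FetchContent_Declare(nanoarrow SOURCE_DIR "${CMAKE_CURRENT_LIST_DIR}/../..")
endif()

# Adds the selected nanoarrow sources to this build via add_subdirectory().
FetchContent_MakeAvailable(nanoarrow)
```

With a layout like this, each configure preset would only need to set the cache variable (via the preset's `cacheVariables`) and its own build directory, which is why adding a release to the report can be as small as adding one preset.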

Example report in details below:

# Benchmark Report

## Configurations

These benchmarks were run with the following configurations:

| preset_name | preset_description                               |
|:------------|:-------------------------------------------------|
| local       | Uses the nanoarrow C sources from this checkout. |
| v0.4.0      | Uses the nanoarrow C sources the 0.4.0 release.  |

## Summary

A quick and dirty summary of benchmark results between this checkout and the last released version.

| benchmark_label                                                             |   v0.4.0 |    local |  change | pct_change |
|:----------------------------------------------------------------------------|---------:|---------:|--------:|-----------:|
| [ArrayViewGetIntUnsafeInt16](#arrayviewgetintunsafeint16)                   | 635.33µs | 631.47µs |     1ns |      -0.6% |
| [ArrayViewGetIntUnsafeInt32](#arrayviewgetintunsafeint32)                   | 635.96µs | 636.71µs | 753.7ns |       0.1% |
| [ArrayViewGetIntUnsafeInt64](#arrayviewgetintunsafeint64)                   | 669.22µs |  680.5µs |  11.3µs |       1.7% |
| [ArrayViewGetIntUnsafeInt64CheckNull](#arrayviewgetintunsafeint64checknull) |   1.03ms |   1.21ms | 178.7µs |      17.4% |
| [ArrayViewGetIntUnsafeInt8](#arrayviewgetintunsafeint8)                     | 948.13µs | 946.34µs |     1ns |      -0.2% |
| [SchemaInitWideStruct](#schemainitwidestruct)                               |   1.04ms |   1.02ms |     1ns |      -2.1% |
| [SchemaViewInitWideStruct](#schemaviewinitwidestruct)                       | 106.08µs | 104.56µs |     1ns |      -1.4% |

## ArrowArrayView-related benchmarks

Benchmarks for consuming ArrowArrays using the ArrowArrayViewXXX() functions.

### ArrayViewGetIntUnsafeInt8

Use ArrowArrayViewGetIntUnsafe() to consume an int8 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L108-L110)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        746 |     946µs |    945µs |    1,058,678,610 |
| v0.4.0      |        745 |     948µs |    947µs |    1,056,345,018 |

### ArrayViewGetIntUnsafeInt16

Use ArrowArrayViewGetIntUnsafe() to consume an int16 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L113-L115)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1115 |     631µs |    630µs |    1,586,161,276 |
| v0.4.0      |       1110 |     635µs |    634µs |    1,576,482,853 |

### ArrayViewGetIntUnsafeInt32

Use ArrowArrayViewGetIntUnsafe() to consume an int32 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L118-L120)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1106 |     637µs |    636µs |    1,572,865,930 |
| v0.4.0      |       1116 |     636µs |    635µs |    1,574,396,587 |

### ArrayViewGetIntUnsafeInt64

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L123-L125)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1036 |     680µs |    680µs |    1,471,241,907 |
| v0.4.0      |       1039 |     669µs |    668µs |    1,496,471,266 |

### ArrayViewGetIntUnsafeInt64CheckNull

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array (checking for nulls)

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L128-L130)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        581 |    1.21ms |    1.2ms |      830,641,968 |
| v0.4.0      |        697 |    1.03ms |   1.02ms |      976,185,007 |

## Schema-related benchmarks

Benchmarks for producing and consuming ArrowSchema.

### SchemaInitWideStruct

Benchmark ArrowSchema creation for very wide tables.

Simulates part of the process of creating a very wide table with a simple column type (integer).

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L45-L56)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        684 |    1.02ms |   1.02ms |        9,788,166 |
| v0.4.0      |        686 |    1.04ms |   1.04ms |        9,606,888 |

### SchemaViewInitWideStruct

Benchmark ArrowSchema parsing for very wide tables.

Simulates part of the process of consuming a very wide table. Typically the ArrowSchemaViewInit() is done by ArrowArrayViewInit() but uses a similar pattern.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L78-L91)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       6753 |     105µs |    104µs |       95,812,784 |
| v0.4.0      |       6762 |     106µs |    106µs |       94,630,337 |

apache / arrow-nanoarrow

chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions #398
