Closed jonasnoki closed 4 years ago
Questions:
IMHO, it would be nicer to align the parameter order with table_generator->generate_table, i.e., keep row_count and then chunk_size in the signature as is and rather change the calls in the benchmark variants. This way, we do not have methods with different orders on these two types which can be a little confusing.
- Why are the benchmarks for ReferenceSegment faster than the ones for "regular" segments?
@bakoe: Are they really? If I am not mistaken, value segments are faster (i.e., have more iterations) than reference segments in the above result table.
- Why is the SortNullBenchmark faster than the regular SortBenchmark?
- Maybe because the filtering during input materialisation leads to fewer values to be sorted.
@bakoe: Yes, this is a fair assumption.
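To make the assumption above concrete, here is a minimal sketch of sorting a nullable column when NULLs are dropped while the input is materialised. It is not the Hyrise implementation: `SortedColumn` and `sort_skipping_nulls` are hypothetical names, and modelling a column as `std::vector<std::optional<int>>` is an assumption made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type (not part of Hyrise): the sorted non-NULL values
// plus the number of NULLs that were skipped.
struct SortedColumn {
  std::vector<int> values;  // non-NULL values in ascending order
  std::size_t null_count;   // NULLs filtered out before sorting
};

// Hypothetical helper: materialises only the non-NULL entries and sorts them.
SortedColumn sort_skipping_nulls(const std::vector<std::optional<int>>& column) {
  SortedColumn result{{}, 0};
  result.values.reserve(column.size());

  // Materialisation step: NULLs are counted but never copied.
  for (const auto& entry : column) {
    if (entry.has_value()) {
      result.values.push_back(*entry);
    } else {
      ++result.null_count;
    }
  }
  assert(result.values.size() + result.null_count == column.size());

  // Only the remaining values take part in the O(n log n) sort.
  std::sort(result.values.begin(), result.values.end());
  return result;
}
```

With k NULLs among n rows, the sort only sees n - k values, which would be consistent with BM_SortNullBenchmark taking less time than BM_Sort if both inputs have the same row count.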
- Why are the benchmarks for ReferenceSegment faster than the ones for "regular" segments?
@bakoe: Are they really? If I am not mistaken, value segments are faster (i.e., have more iterations) than reference segments in the above result table.
Yes, sorry, we should have been more precise: in general, value segments are faster. The exception is the Large benchmark, where value segments reach only 5 iterations and reference segments reach 6.
I ran the benchmark myself a couple of times. I believe that this is merely variation in the measurements.
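One rough way to check whether a difference like this is only measurement variation is to repeat each benchmark several times (Google Benchmark offers `--benchmark_repetitions` for this) and compare the gap between the means with the spread of the samples. The sketch below assumes the per-run times have already been collected by hand. `RunStatistics`, `summarize` and `within_noise` are hypothetical names, and the comparison is a simple heuristic rather than a proper statistical test.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of repeated timings of one benchmark.
struct RunStatistics {
  double mean;             // arithmetic mean of the samples
  double relative_stddev;  // sample standard deviation divided by the mean
};

// Computes mean and relative sample standard deviation; needs >= 2 samples.
RunStatistics summarize(const std::vector<double>& samples_ns) {
  assert(samples_ns.size() >= 2);
  double sum = 0.0;
  for (const double sample : samples_ns) sum += sample;
  const double mean = sum / static_cast<double>(samples_ns.size());

  double squared_deviations = 0.0;
  for (const double sample : samples_ns) {
    squared_deviations += (sample - mean) * (sample - mean);
  }
  const double stddev =
      std::sqrt(squared_deviations / static_cast<double>(samples_ns.size() - 1));
  return {mean, stddev / mean};
}

// Heuristic: the two means are treated as indistinguishable if their relative
// difference does not exceed the larger relative spread of the two series.
bool within_noise(const RunStatistics& a, const RunStatistics& b) {
  const double relative_difference =
      std::abs(a.mean - b.mean) / std::min(a.mean, b.mean);
  return relative_difference <= std::max(a.relative_stddev, b.relative_stddev);
}
```

If `within_noise` returns true for the value-segment and reference-segment series of the Large benchmark, a gap of one iteration between them should not be read as a real performance difference.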
Improved order of execution:
```
Bastian.Koenig@vm-appleton:~/hyrise$ cmake-build-release/hyriseMicroBenchmarks --benchmark_filter=BM_Sort*
2020-01-15 16:36:21
Running cmake-build-release/hyriseMicroBenchmarks
Run on (32 X 2500 MHz CPU s)
CPU Caches:
  L1 Data 32K (x32)
  L1 Instruction 32K (x32)
  L2 Unified 256K (x32)
  L3 Unified 25600K (x32)
Load Average: 0.30, 1.39, 3.08
--------------------------------------------------------------------------------------------
Benchmark                                                    Time             CPU Iterations
--------------------------------------------------------------------------------------------
SortPicoBenchmark/BM_SortPico                             3082 ns         3081 ns     207342
SortSmallBenchmark/BM_SortSmall                         665147 ns       665067 ns        954
SortBenchmark/BM_Sort                                 10685593 ns     10672948 ns         83
SortLargeBenchmark/BM_SortLarge                       69531441 ns     69522634 ns         10
SortReferencePicoBenchmark/BM_SortReferencePico           3777 ns         3777 ns     188021
SortReferenceSmallBenchmark/BM_SortReferenceSmall       825497 ns       824944 ns       1074
SortReferenceBenchmark/BM_SortReference               16163228 ns     16162516 ns         57
SortReferenceLargeBenchmark/BM_SortReferenceLarge    115197849 ns    115195690 ns          5
SortStringSmallBenchmark/BM_SortStringSmall            2794357 ns      2794306 ns        315
SortStringBenchmark/BM_SortString                     26671309 ns     26670467 ns         26
SortStringLargeBenchmark/BM_SortStringLarge          568932772 ns    568549306 ns          1
SortNullBenchmark/BM_SortNullBenchmark                 7789183 ns      7788042 ns        114
SortBenchmark/BM_SortSingleColumnSQL                  12163768 ns     12163122 ns         63
[PERF] Multiple ORDER BYs are executed one-by-one at src/lib/logical_query_plan/lqp_translator.cpp:272
       Performance can be affected. This warning is only shown once.
SortBenchmark/BM_SortMultiColumnSQL                   27665274 ns     27664767 ns         21
```
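The per-iteration times in this run are easier to compare than the iteration counts. As a small sketch, the ratio of reference-segment time to value-segment time can be computed per table size. The numbers are copied from the Time column above. `SizePair`, `sort_timings`, `slowdown` and the label "Default" for the unsuffixed BM_Sort / BM_SortReference pair are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record pairing the two segment types for one table size.
struct SizePair {
  std::string size;             // benchmark size label
  double value_segment_ns;      // Time of BM_Sort<Size>
  double reference_segment_ns;  // Time of BM_SortReference<Size>
};

// Wall-clock times copied from the "Time" column of the run above.
std::vector<SizePair> sort_timings() {
  return {
      {"Pico", 3082.0, 3777.0},
      {"Small", 665147.0, 825497.0},
      {"Default", 10685593.0, 16163228.0},  // "Default" = benchmarks without a size suffix
      {"Large", 69531441.0, 115197849.0},
  };
}

// How many times longer the reference-segment sort took than the value-segment sort.
double slowdown(const SizePair& pair) {
  assert(pair.value_segment_ns > 0.0);
  return pair.reference_segment_ns / pair.value_segment_ns;
}
```

In this particular run, the ratio is above 1 for every size, growing from roughly 1.2 for Pico and Small to roughly 1.5 and 1.7 for the two larger sizes, which matches the statement that value segments are faster in general.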
Closes #9