Reorder GenerateCustomTable (switch row_count and chunk_size) in Sort MicroBenchmark

jonasnoki commented 4 years ago

Closes #9

Bastian.Koenig@vm-appleton:~/hyrise$ cmake-build-release/hyriseMicroBenchmarks --benchmark_filter=BM_Sort*
2020-01-12 22:11:35
Running cmake-build-release/hyriseMicroBenchmarks
Run on (32 X 2500 MHz CPU s)
CPU Caches:
  L1 Data 32K (x32)
  L1 Instruction 32K (x32)
  L2 Unified 256K (x32)
  L3 Unified 25600K (x32)
Load Average: 0.08, 0.07, 0.02
--------------------------------------------------------------------------------------------
Benchmark                                                  Time             CPU   Iterations
--------------------------------------------------------------------------------------------
SortBenchmark/BM_Sort                                9927092 ns      9918420 ns           83
SortBenchmark/BM_SortSingleColumnSQL                11697463 ns     11697524 ns           60
[PERF] Multiple ORDER BYs are executed one-by-one at src/lib/logical_query_plan/lqp_translator.cpp:272
    Performance can be affected. This warning is only shown once.

SortBenchmark/BM_SortMultiColumnSQL                 25168453 ns     25168143 ns           21
SortPicoBenchmark/BM_SortPico                           3108 ns         3108 ns       224457
SortSmallBenchmark/BM_SortSmall                       708159 ns       708099 ns          860
SortLargeBenchmark/BM_SortLarge                    112134552 ns    112126522 ns            5
SortReferenceBenchmark/BM_SortReference             16079788 ns     16079781 ns           56
SortReferencePicoBenchmark/BM_SortReferencePico         3716 ns         3716 ns       194495
SortReferenceSmallBenchmark/BM_SortReferenceSmall    1045074 ns      1044611 ns          505
SortReferenceLargeBenchmark/BM_SortReferenceLarge  115130266 ns    115042670 ns            6
SortStringBenchmark/BM_SortString                   26059250 ns     26058880 ns           24
SortStringSmallBenchmark/BM_SortStringSmall          2898024 ns      2897975 ns          293
SortStringLargeBenchmark/BM_SortStringLarge        588391781 ns    588380760 ns            1
SortNullBenchmark/BM_SortNullBenchmark               7887901 ns      7887674 ns          114

bakoe commented 4 years ago

Questions:

Why are the benchmarks for ReferenceSegment faster than the ones for "regular" segments?
Why is the SortNullBenchmark faster than the regular SortBenchmark?
- Maybe because filtering out NULL values during input materialisation leaves fewer values to be sorted.

tobodner commented 4 years ago

IMHO, it would be nicer to align the parameter order with table_generator->generate_table, i.e., keep row_count followed by chunk_size in the signature as is, and change the calls in the benchmark variants instead. That way, the two types do not expose methods with different parameter orders, which can be a little confusing.
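
To illustrate the concern: when two adjacent parameters share the same integer type, swapped arguments compile without any diagnostic. The sketch below uses hypothetical stand-ins (`TableShape`, `generate_table_shape`, `generate_custom_table_shape`) and made-up sizes. These are not Hyrise's actual signatures; only the (row_count, chunk_size) ordering is taken from the discussion.

```cpp
#include <cstddef>

// Hypothetical stand-in for a generated table; not a Hyrise type.
struct TableShape {
  std::size_t row_count;
  std::size_t chunk_size;

  // Number of chunks needed to hold all rows (rounded up).
  std::size_t chunk_count() const { return (row_count + chunk_size - 1) / chunk_size; }
};

// Assumed reference order: row_count first, then chunk_size.
inline TableShape generate_table_shape(std::size_t row_count, std::size_t chunk_size) {
  return TableShape{row_count, chunk_size};
}

// A helper that keeps the same order, so every call site reads the same way.
inline TableShape generate_custom_table_shape(std::size_t row_count, std::size_t chunk_size) {
  return generate_table_shape(row_count, chunk_size);
}
```

With these definitions, `generate_custom_table_shape(40000, 2000)` describes 20 chunks, while the accidentally swapped call `generate_custom_table_shape(2000, 40000)` describes a single chunk and still compiles. Keeping one order across both helpers makes such a mistake easier to spot in review.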

tobodner commented 4 years ago

Why are the benchmarks for ReferenceSegment faster than the ones for "regular" segments?

@bakoe: Are they really? If I am not mistaken, value segments are faster (i.e., have more iterations) than reference segments in the above result table.

tobodner commented 4 years ago

Why is the SortNullBenchmark faster than the regular SortBenchmark?

Maybe because filtering out NULL values during input materialisation leaves fewer values to be sorted.

@bakoe: Yes, this is a fair assumption.
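
A minimal sketch of that assumption, not Hyrise's actual Sort operator: if NULLs are set aside while the input is materialised, the comparison sort only runs over the remaining values. The type `SortedColumn` and the function `sort_skipping_nulls` are hypothetical names chosen for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Result of sorting a nullable column: sorted non-NULL values plus a NULL count.
struct SortedColumn {
  std::vector<int> values;  // non-NULL values in ascending order
  std::size_t null_count;   // NULLs skipped during materialisation
};

inline SortedColumn sort_skipping_nulls(const std::vector<std::optional<int>>& column) {
  SortedColumn result{{}, 0};
  result.values.reserve(column.size());

  // Materialisation step: copy only non-NULL entries, count the rest.
  for (const auto& entry : column) {
    if (entry.has_value()) {
      result.values.push_back(*entry);
    } else {
      ++result.null_count;
    }
  }

  // The sort now touches fewer elements than the column has rows.
  std::sort(result.values.begin(), result.values.end());
  return result;
}
```

For a column with many NULLs, the vector handed to `std::sort` is correspondingly shorter, which would be consistent with the lower time reported for SortNullBenchmark.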

birneamstiel commented 4 years ago

Why are the benchmarks for ReferenceSegment faster than the ones for "regular" segments?

@bakoe: Are they really? If I am not mistaken, value segments are faster (i.e., have more iterations) than reference segments in the above result table.

Yes, sorry, we should have been more precise: in general, value segments are faster. The exception is the Large benchmark, where value segments reach only 5 iterations and reference segments reach 6.

tobodner commented 4 years ago

Why are the benchmarks for ReferenceSegment faster than the ones for "regular" segments?

@bakoe: Are they really? If I am not mistaken, value segments are faster (i.e., have more iterations) than reference segments in the above result table.

Yes, sorry, we should have been more precise: in general, value segments are faster. The exception is the Large benchmark, where value segments reach only 5 iterations and reference segments reach 6.

I ran the benchmark myself a couple of times. I believe that this is merely variation in the measurements.
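
The reported times support this reading. In the first run, BM_SortLarge took 112134552 ns per iteration and BM_SortReferenceLarge took 115130266 ns, so the value-segment variant is slightly faster per iteration even though it shows fewer iterations. With counts as low as 5 and 6, the iteration column is a coarse signal. The sketch below computes the relative gap between the two times; the helper `relative_difference` and the constant names are hypothetical, and the numbers are copied from the table above.

```cpp
#include <cmath>

// Relative difference of two timings, measured against the smaller one.
inline double relative_difference(double time_a_ns, double time_b_ns) {
  const double smaller = std::fmin(time_a_ns, time_b_ns);
  return std::fabs(time_a_ns - time_b_ns) / smaller;
}

// Per-iteration times reported in the first benchmark run.
constexpr double kSortLargeNs = 112134552.0;
constexpr double kSortReferenceLargeNs = 115130266.0;
```

Calling `relative_difference(kSortLargeNs, kSortReferenceLargeNs)` gives a gap of under 3 percent, which is small enough to be plausible run-to-run noise on a shared machine.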

jonasnoki commented 4 years ago

Improved order of execution:

Bastian.Koenig@vm-appleton:~/hyrise$ cmake-build-release/hyriseMicroBenchmarks --benchmark_filter=BM_Sort*
2020-01-15 16:36:21
Running cmake-build-release/hyriseMicroBenchmarks
Run on (32 X 2500 MHz CPU s)
CPU Caches:
  L1 Data 32K (x32)
  L1 Instruction 32K (x32)
  L2 Unified 256K (x32)
  L3 Unified 25600K (x32)
Load Average: 0.30, 1.39, 3.08
--------------------------------------------------------------------------------------------
Benchmark                                                  Time             CPU   Iterations
--------------------------------------------------------------------------------------------
SortPicoBenchmark/BM_SortPico                           3082 ns         3081 ns       207342
SortSmallBenchmark/BM_SortSmall                       665147 ns       665067 ns          954
SortBenchmark/BM_Sort                               10685593 ns     10672948 ns           83
SortLargeBenchmark/BM_SortLarge                     69531441 ns     69522634 ns           10
SortReferencePicoBenchmark/BM_SortReferencePico         3777 ns         3777 ns       188021
SortReferenceSmallBenchmark/BM_SortReferenceSmall     825497 ns       824944 ns         1074
SortReferenceBenchmark/BM_SortReference             16163228 ns     16162516 ns           57
SortReferenceLargeBenchmark/BM_SortReferenceLarge  115197849 ns    115195690 ns            5
SortStringSmallBenchmark/BM_SortStringSmall          2794357 ns      2794306 ns          315
SortStringBenchmark/BM_SortString                   26671309 ns     26670467 ns           26
SortStringLargeBenchmark/BM_SortStringLarge        568932772 ns    568549306 ns            1
SortNullBenchmark/BM_SortNullBenchmark               7789183 ns      7788042 ns          114
SortBenchmark/BM_SortSingleColumnSQL                12163768 ns     12163122 ns           63
[PERF] Multiple ORDER BYs are executed one-by-one at src/lib/logical_query_plan/lqp_translator.cpp:272
    Performance can be affected. This warning is only shown once.

SortBenchmark/BM_SortMultiColumnSQL                 27665274 ns     27664767 ns           21

bakoe / hyrise

Reorder GenerateCustomTable (switch row_count and chunk_size) in Sort MicroBenchmark #10