SERVER-80273 Implement tie breaking performance tests

aligusnet commented 1 year ago

The test results in op/sec. with the tie breaking knob enabled and disabled.

TieBreaking.Longest Index Prefix
8       513.8479528896214
Regular.Longest Index Prefix
8       44.694536921251505
TieBreaking.Equality
8       506.8080551614268
Regular.Equality
8       161.7573477034724
TieBreaking.Shortest Index
8       510.61094659644266
Regular.Shortest Index
8       429.83690121885473
TieBreaking.Shortest Index With Comparisons
8       241.43640952463656
Regular.Shortest Index With Comparisons
8       237.93900544287968
TieBreaking.Not Broken Tie
8       520.1677700619116
Regular.Not Broken Tie
8       501.3517774484123
TieBreaking.Multi Interval Index Bounds
8       1418.8523158938228
Regular.Multi Interval Index Bounds
8       589.0653669061361
TieBreaking.Non-Blocking Sort
8       1399.3737582655758
Regular.Non-Blocking Sort
8       605.9069944737929
TieBreaking.Blocking Sort
8       1293.7228272758164
Regular.Blocking Sort
8       563.8432802989446
TieBreaking.Multi IndexScans
8       1269.5654745325644
Regular.Multi IndexScans
8       530.3681668955684
TieBreaking.No Tie
8       469.9333380519399
Regular.No Tie
8       487.1500466012959
TieBreaking.Docs Examined
8       2842.6274584899634
Regular.Docs Examined
8       2997.623491292798

alyacb commented 1 year ago

Actually, last last request: can you update the numbers in the PR with the larger document size? The numbers look quite similar in most cases. I'm surprised that the plans would result in such similar throughputs.

aligusnet commented 1 year ago

@alyacb Do you mean that not all the tests demonstrated performance boost with the heuristics? It really depends not on the document size, but on how the data distributed. I can imagine, that in some cases, e.g. for the query {a: 1, b: 1, c: 1} and the indexes {a: 1, b: 1} and {c: 1} we would prefer {a: 1, b: 1}, but {c: 1} might be more selective, and instead of the performance improvement we get performance degradation. There are similar issues with CE, sometimes CE can give misleading results and poorer performance.

alyacb commented 1 year ago

@aligusnet I'm surprised none of the tests above show any significant difference (>10%) in throughput, except for this one, where tie breaking is doing better, as expected:

TieBreaking.Equality
8       53.38707500884039
Regular.Equality
8       21.28536077650503

I want to be sure that:

we're actually tie breaking/not picking the first EOF plan in this performance test
that we design this test to specifically exercise tie breaking in the beneficial case, so that we can detect regressions in this case in future
we have throughput numbers that support this optimization being a positive one, though I agree with your point that it will be worse in some cases, like any heuristic

aligusnet commented 1 year ago

@alyacb What I wanted to demonstrate here is that the heuristics did not make the plans worse in the cases when they shouldn't (in other words not too expensive to apply) and that sometimes they can give performance boost.

If you believe that only cases with performance boost count I can remove others. Only predicates with inequalities allow to demonstrate reliable performance boost because they can guarantee order of the documents. In queries with equalities only we cannot guarantee the order and, therefore, cannot guarantee that we always demonstrate performance improvement,

aligusnet commented 1 year ago

@alyacb I have replaced the equality tests with inequality once to better demonstrate performance boosts

mongodb / mongo-perf

SERVER-80273 Implement tie breaking performance tests #138