Closed zhongshanhao closed 1 month ago
I added a relevant test task, AndManyTermsWithLowHits, on the wikimediumall dataset.
The query terms in AndManyTermsWithLowHits each have a large doc frequency, but the number of hits is small.
# Conjunction of 2 terms and 2 stop words
And2Terms2StopWords: +lord +of +the +rings
And2Terms2StopWords: +the +book +of +life
And2Terms2StopWords: +the +garden +of +eden
And2Terms2StopWords: +battle +of +the +bulge
And2Terms2StopWords: +story +of +a +girl
And2Terms2StopWords: +lord +if +the +rings
And2Terms2StopWords: +battle +if +the +bulge
# AndManyTermsWithLowHits
AndManyTermsWithLowHits: +battle +if +the +bulge # 82hits
AndManyTermsWithLowHits: +lord +if +the +rings # 683hits
AndManyTermsWithLowHits: +the +book +of +life +story +of +a +girl # 563hits
AndManyTermsWithLowHits: +the +book +of +life +story +of +a +girl +battle +if +the +bulge # 0hits
# Conjunction of 3+ terms that are all stop words
AndStopWords: +to +be +or +not +to +be
AndStopWords: +who +are +the +who
# Conjunction of 3 terms
And3Terms: +new +york +population
And3Terms: +world +bank +president
And3Terms: +national +book +award
And3Terms: +united +states +constitution
And3Terms: +law +school +rankings
The benchmark results are as follows. AndManyTermsWithLowHits shows that the PR works as expected:
Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff p-value
AndStopWords 5.56 (2.6%) 5.42 (2.6%) -2.6% ( -7% - 2%) 0.002
And3Terms 51.29 (1.9%) 50.71 (2.7%) -1.1% ( -5% - 3%) 0.124
And2Terms2StopWords 61.31 (1.9%) 60.80 (2.5%) -0.8% ( -5% - 3%) 0.245
PKLookup 134.41 (2.3%) 134.62 (3.5%) 0.2% ( -5% - 6%) 0.867
AndManyTermsWithLowHits 108.28 (1.4%) 136.71 (1.8%) 26.3% ( 22% - 29%) 0.000
@jpountz Can you help me review the code? Thank you so much.
@jpountz Yeah, I was overthinking it. Implementing it your way makes it much clearer😊. I have made revisions and committed it.
@jpountz Can you help me merge the PR? I can't merge this PR because I don't have write access to this repository. :)
@zhongshanhao Are you still observing a speedup with the latest version of the change? I was planning on merging once you confirmed this.
@jpountz Yes. I ran the benchmark again with the latest version of the change. The results on wikimediumall are as follows:
Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff p-value
AndStopWords 19.82 (4.3%) 19.57 (2.6%) -1.2% ( -7% - 5%) 0.439
And3Terms 162.75 (2.3%) 161.19 (2.6%) -1.0% ( -5% - 4%) 0.390
PKLookup 155.11 (2.6%) 153.91 (2.4%) -0.8% ( -5% - 4%) 0.483
And2Terms2StopWords 153.85 (3.8%) 155.19 (3.9%) 0.9% ( -6% - 8%) 0.611
AndManyTermsWithLowHits 333.94 (2.7%) 410.77 (4.3%) 23.0% ( 15% - 30%) 0.000
AndManyTermsWithLowHits still shows that the latest version of the PR works as expected.
Benchmark results for the other tasks are as follows:
Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff p-value
PKLookup 78.31 (33.1%) 56.80 (11.8%) -27.5% ( -54% - 26%) 0.081
IntNRQ 45.76 (20.9%) 38.27 (20.9%) -16.4% ( -48% - 32%) 0.216
AndHighLow 720.91 (13.3%) 616.65 (16.9%) -14.5% ( -39% - 18%) 0.132
OrHighNotHigh 88.80 (3.7%) 82.75 (15.5%) -6.8% ( -25% - 12%) 0.340
HighIntervalsOrdered 3.32 (5.6%) 3.17 (3.7%) -4.5% ( -13% - 5%) 0.137
OrNotHighLow 300.34 (3.4%) 291.35 (7.3%) -3.0% ( -13% - 7%) 0.407
TermDTSort 19.28 (6.3%) 18.74 (8.8%) -2.8% ( -16% - 13%) 0.561
MedTerm 80.45 (10.0%) 78.26 (4.1%) -2.7% ( -15% - 12%) 0.572
MedIntervalsOrdered 43.17 (1.3%) 42.19 (1.3%) -2.3% ( -4% - 0%) 0.007
BrowseDayOfYearTaxoFacets 5.71 (2.6%) 5.60 (0.5%) -1.9% ( -4% - 1%) 0.104
Wildcard 114.70 (2.9%) 112.91 (1.9%) -1.6% ( -6% - 3%) 0.310
OrNotHighMed 86.73 (4.1%) 85.55 (8.6%) -1.4% ( -13% - 11%) 0.749
Fuzzy2 25.34 (31.0%) 25.06 (37.0%) -1.1% ( -52% - 96%) 0.959
LowTerm 287.13 (7.4%) 284.18 (6.9%) -1.0% ( -14% - 14%) 0.820
BrowseDayOfYearSSDVFacets 4.55 (1.7%) 4.51 (2.0%) -0.9% ( -4% - 2%) 0.454
MedSloppyPhrase 27.02 (2.4%) 26.79 (4.3%) -0.9% ( -7% - 5%) 0.696
Prefix3 227.46 (14.9%) 226.18 (3.4%) -0.6% ( -16% - 20%) 0.934
MedTermDayTaxoFacets 15.45 (2.1%) 15.40 (1.1%) -0.3% ( -3% - 2%) 0.752
OrHighNotLow 57.15 (6.0%) 56.99 (2.4%) -0.3% ( -8% - 8%) 0.923
HighTermDayOfYearSort 22.90 (10.7%) 22.84 (12.4%) -0.3% ( -21% - 25%) 0.972
AndHighHighDayTaxoFacets 9.48 (0.5%) 9.46 (0.4%) -0.3% ( -1% - 0%) 0.391
BrowseMonthTaxoFacets 8.64 (1.6%) 8.62 (1.4%) -0.2% ( -3% - 2%) 0.862
BrowseMonthSSDVFacets 3.50 (0.1%) 3.50 (0.2%) -0.1% ( 0% - 0%) 0.640
LowSpanNear 7.78 (1.5%) 7.78 (4.5%) 0.0% ( -5% - 6%) 0.990
LowPhrase 16.66 (1.9%) 16.67 (0.6%) 0.0% ( -2% - 2%) 0.971
OrHighNotMed 97.21 (2.2%) 97.37 (2.9%) 0.2% ( -4% - 5%) 0.921
BrowseRandomLabelSSDVFacets 2.08 (2.3%) 2.08 (0.3%) 0.2% ( -2% - 2%) 0.835
HighTermTitleSort 14.02 (7.0%) 14.05 (3.8%) 0.2% ( -9% - 11%) 0.948
HighTerm 75.20 (5.0%) 75.44 (6.6%) 0.3% ( -10% - 12%) 0.932
LowSloppyPhrase 13.04 (2.8%) 13.11 (2.6%) 0.5% ( -4% - 6%) 0.759
AndHighMed 107.72 (3.3%) 108.45 (4.9%) 0.7% ( -7% - 9%) 0.798
HighSpanNear 16.41 (2.2%) 16.53 (3.8%) 0.7% ( -5% - 6%) 0.711
LowIntervalsOrdered 21.05 (1.8%) 21.20 (2.3%) 0.7% ( -3% - 4%) 0.574
AndHighMedDayTaxoFacets 18.35 (1.0%) 18.56 (0.9%) 1.2% ( 0% - 3%) 0.048
OrHighMed 165.43 (1.5%) 167.36 (2.0%) 1.2% ( -2% - 4%) 0.287
MedPhrase 33.51 (1.8%) 33.92 (1.2%) 1.2% ( -1% - 4%) 0.200
OrHighMedDayTaxoFacets 2.75 (0.7%) 2.78 (1.7%) 1.3% ( -1% - 3%) 0.120
HighTermTitleBDVSort 7.97 (5.4%) 8.07 (7.9%) 1.3% ( -11% - 15%) 0.760
BrowseDateSSDVFacets 0.74 (0.7%) 0.76 (0.7%) 1.8% ( 0% - 3%) 0.000
Fuzzy1 68.53 (4.9%) 70.19 (6.6%) 2.4% ( -8% - 14%) 0.506
OrHighHigh 48.03 (7.6%) 49.21 (4.3%) 2.5% ( -8% - 15%) 0.528
OrHighLow 185.87 (3.7%) 191.27 (2.6%) 2.9% ( -3% - 9%) 0.145
HighSloppyPhrase 3.21 (0.7%) 3.32 (1.2%) 3.6% ( 1% - 5%) 0.000
HighPhrase 161.54 (4.2%) 167.84 (8.5%) 3.9% ( -8% - 17%) 0.360
BrowseRandomLabelTaxoFacets 4.67 (3.3%) 4.85 (4.3%) 3.9% ( -3% - 11%) 0.103
MedSpanNear 30.56 (3.4%) 31.87 (3.7%) 4.3% ( -2% - 11%) 0.056
OrNotHighHigh 76.97 (4.4%) 81.15 (6.0%) 5.4% ( -4% - 16%) 0.104
BrowseDateTaxoFacets 5.69 (7.3%) 6.08 (6.3%) 6.9% ( -6% - 22%) 0.111
Respell 26.54 (23.9%) 28.40 (31.3%) 7.0% ( -38% - 81%) 0.692
HighTermMonthSort 191.61 (17.0%) 205.52 (22.7%) 7.3% ( -27% - 56%) 0.566
AndHighHigh 71.76 (11.5%) 78.62 (2.8%) 9.6% ( -4% - 26%) 0.072
Sometimes, because it needs to decode impacts and compute maximum scores, ImpactsDISI adds more overhead than it saves by enabling skipping. Consider a conjunction query whose terms (a, b, c, d) each have a large doc frequency. The result set of such a query may be small, so that no minimum competitive score is ever produced, yet BlockMaxConjunctionBulkScorer and BlockMaxConjunctionScorer still try to get the max score at the beginning of advance.

This PR is designed to solve this problem by avoiding the use of ImpactsDISI when no minimum competitive score has been set.

Here is the benchmark of this PR on wikimediumall.
iter 4:
The benchmark results do not seem to show any improvement. 🤔
Should I add relevant test cases?
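
To make the idea in the description concrete, here is a minimal, self-contained sketch of "skip the max-score work until a minimum competitive score has been set". It is not the code in this PR and does not use Lucene's classes: LazyBlockMaxSketch, BLOCK_SIZE, blockMaxScores and the maxScoreLookups counter are all hypothetical names, and every doc ID below maxDoc is treated as a candidate to keep the example short.

```java
/** Hypothetical sketch; these names are illustrative and are not Lucene classes. */
class LazyBlockMaxSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    static final int BLOCK_SIZE = 4; // assumed block size, for illustration only

    private final float[] blockMaxScores; // upper bound on the score inside each block
    private final int maxDoc;
    private float minCompetitiveScore = 0f; // 0 means "not set yet"

    /** Counts the simulated "decode impacts and compute max score" calls. */
    int maxScoreLookups = 0;

    LazyBlockMaxSketch(float[] blockMaxScores, int maxDoc) {
        this.blockMaxScores = blockMaxScores;
        this.maxDoc = maxDoc;
    }

    void setMinCompetitiveScore(float minScore) {
        this.minCompetitiveScore = minScore;
    }

    /** Stand-in for the expensive part: look up the max score of doc's block. */
    private float maxScoreForBlockOf(int doc) {
        maxScoreLookups++;
        return blockMaxScores[doc / BLOCK_SIZE];
    }

    /** Returns the first candidate doc >= target, or NO_MORE_DOCS. */
    int advance(int target) {
        if (target >= maxDoc) {
            return NO_MORE_DOCS;
        }
        // Nothing can be skipped while no minimum competitive score is set,
        // so the max-score lookup is bypassed entirely.
        if (minCompetitiveScore <= 0f) {
            return target;
        }
        int doc = target;
        while (doc < maxDoc) {
            if (maxScoreForBlockOf(doc) >= minCompetitiveScore) {
                return doc; // this block may still contain a competitive hit
            }
            doc = (doc / BLOCK_SIZE + 1) * BLOCK_SIZE; // jump to the next block
        }
        return NO_MORE_DOCS;
    }
}
```

In this sketch, calling advance before setMinCompetitiveScore leaves maxScoreLookups at zero, which mirrors the case of a conjunction with few hits where the collector never raises the bar. Once a positive minimum score is set, the block-level check starts paying for itself by skipping whole blocks.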