Run nightly benchmarks with a single thread executor.

jpountz commented 5 months ago

We're making concurrent search more of a first-class citizen, and hopefully it will soon even be the default in Lucene. So let's start better exercising code paths for concurrent search in nightly benchmarks by configuring an executor, but bounding it to a single thread for now to contain noise.

Closes #210
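
For readers unfamiliar with the idea, here is a minimal Python analogy of "configure an executor, but bound it to a single thread". It is not Lucene or luceneutil code: in Lucene the executor is handed to the `IndexSearcher` constructor, while `search_slice`, `search`, and the sample slices below are hypothetical names made up for illustration. The point is only that per-slice work still goes through the executor code path, while a pool of size one keeps it from actually running in parallel.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def search_slice(docs, term):
    """Hypothetical per-slice work: count matching docs, report the worker thread."""
    hits = sum(1 for doc in docs if term in doc)
    return hits, threading.current_thread().name


def search(slices, term, executor):
    """Submit one task per slice to the executor, then merge the per-slice results."""
    futures = [executor.submit(search_slice, docs, term) for docs in slices]
    results = [future.result() for future in futures]
    total = sum(hits for hits, _ in results)
    threads = {name for _, name in results}
    return total, threads


# Made-up index slices; each inner list stands in for one slice of segments.
slices = [
    ["lucene search", "nightly bench"],
    ["concurrent search", "single thread"],
    ["search executor"],
]

# max_workers=1: slices are dispatched as tasks but run one after another.
with ThreadPoolExecutor(max_workers=1) as executor:
    total, threads = search(slices, "search", executor)

print(total, len(threads))  # 3 hits, all computed on 1 worker thread
```

Raising `max_workers` in this sketch is the analogue of allowing more intra-query concurrency later on.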

mikemccand commented 5 months ago

Hmm, maybe we should use 2 or 3 threads of intra-query concurrency? The nightly only runs 6 concurrent queries, and beast3 has 128 cores ... prolly noise would be fine? And we'd better exercise the concurrent querying and maybe better find thread sync performance bugs?

jpountz commented 5 months ago

@mikemccand It looks like nightly benchmarks need regolding, as enabling concurrency disabled some optimizations, which in turn causes different hit counts to be returned. This is especially visible for sorted queries, which are unable to share information about competitive hits across slices.

WARNING: cat=And2Terms2StopWords: hit counts differ: 6354+ vs 7384+
WARNING: cat=And3Terms: hit counts differ: 7688+ vs 9654+
WARNING: cat=AndHighHigh: hit counts differ: 112763+ vs 140641+
WARNING: cat=AndHighMed: hit counts differ: 47357+ vs 67508+
WARNING: cat=AndHighOrMedMed: hit counts differ: 13186+ vs 16119+
WARNING: cat=AndMedOrHighHigh: hit counts differ: 701322+ vs 783613+
WARNING: cat=AndStopWords: hit counts differ: 4890+ vs 5866+
WARNING: cat=Fuzzy1: hit counts differ: 10699+ vs 11771+
WARNING: cat=Fuzzy2: hit counts differ: 10316+ vs 11884+
WARNING: cat=IntNRQ: hit counts differ: 5005+ vs 5505+
WARNING: cat=Or2Terms2StopWords: hit counts differ: 16688+ vs 17155+
WARNING: cat=Or3Terms: hit counts differ: 36417+ vs 37414+
WARNING: cat=OrHighHigh: hit counts differ: 115453+ vs 144984+
WARNING: cat=OrHighMed: hit counts differ: 42288+ vs 43945+
WARNING: cat=OrHighRare: hit counts differ: 77498+ vs 104663+
WARNING: cat=OrStopWords: hit counts differ: 7221+ vs 8170+
WARNING: cat=Phrase: hit counts differ: 21112+ vs 25029+
WARNING: cat=Prefix3: hit counts differ: 5005+ vs 5505+
WARNING: cat=SloppyPhrase: hit counts differ: 2503587+ vs 2814938+
WARNING: cat=Term: hit counts differ: 218339+ vs 236936+
WARNING: cat=TermDTSort: hit counts differ: 830782+ vs 1564994+
WARNING: cat=TermDayOfYearSort: hit counts differ: 58184+ vs 574062+
WARNING: cat=TermMonthSort: hit counts differ: 41344+ vs 20427+
WARNING: cat=TermTitleSort: hit counts differ: 539542+ vs 3524834+
WARNING: cat=Wildcard: hit counts differ: 5005+ vs 5505+
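
To see which categories moved the most, the warnings can be parsed and ranked by the ratio between the two counts. The sketch below is a standalone helper, not part of luceneutil: `parse_hit_count_warnings` is a hypothetical name, the regular expression assumes the exact line format shown above, and only a handful of the lines are copied in as sample input. It makes no assumption about which of the two counts is the old one.

```python
import re

# Matches lines like: WARNING: cat=Term: hit counts differ: 218339+ vs 236936+
WARNING_RE = re.compile(
    r"WARNING: cat=(\w+): hit counts differ: (\d+)\+? vs (\d+)\+?"
)


def parse_hit_count_warnings(text):
    """Return (category, first, second, second / first) for each warning line."""
    rows = []
    for match in WARNING_RE.finditer(text):
        cat = match.group(1)
        first, second = int(match.group(2)), int(match.group(3))
        rows.append((cat, first, second, second / first))  # assumes counts > 0
    return rows


# A subset of the warnings above, copied verbatim.
log = """\
WARNING: cat=AndHighHigh: hit counts differ: 112763+ vs 140641+
WARNING: cat=Term: hit counts differ: 218339+ vs 236936+
WARNING: cat=TermDayOfYearSort: hit counts differ: 58184+ vs 574062+
WARNING: cat=TermMonthSort: hit counts differ: 41344+ vs 20427+
WARNING: cat=TermTitleSort: hit counts differ: 539542+ vs 3524834+
"""

rows = parse_hit_count_warnings(log)
# Rank by how far the ratio is from 1, in either direction.
rows.sort(key=lambda row: max(row[3], 1 / row[3]), reverse=True)

for cat, first, second, ratio in rows:
    print(f"{cat:20s} {first:>8d} vs {second:>8d}  ratio {ratio:.2f}")
```

On this sample the three sorted categories come out on top, with the unsorted ones well behind, which matches the observation that the difference is most visible for sorted queries.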

mikemccand commented 5 months ago

Woops, I will kick off regolding run now!

mikemccand commented 4 months ago

Regold failed ... I will debug!

mikemccand / luceneutil

Run nightly benchmarks with a single thread executor. #271