Just noting that benchUtil.py does have these options: `-store` ("store body fields") and `-loadStoredFields`. Maybe they could be used along with other existing tasks? Still, a dedicated test for stored fields would probably be better so you could separate out the impacts.
@msokolov I think the issue with enabling stored fields for existing tasks is that they easily get "drowned out" by other features: this applies to index time, merge time, and retrieval time.
If things like postings lists, points, doc values, vectors, etc. are enabled, we may see no change in such benchmarks even when there is a performance regression in the stored fields.
Additionally, stored fields have tricky bulk merge optimizations that can also hide bad performance.
That's why I wrote a standalone benchmark to target the relevant code when debugging https://issues.apache.org/jira/browse/LUCENE-9827. It ran over 10x faster for BEST_SPEED after we fixed the issue! (24x faster with BEST_COMPRESSION).
The benchmark does a few things to zero in on stored fields perf: it indexes stored fields alone (no postings, points, doc values, or vectors), flushes after every document, and merges explicitly.
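A minimal sketch of that shape, assuming a geonames-style one-record-per-line input (this is not the actual benchmark; the paths, field name, and timing are illustrative):

```java
import java.io.BufferedReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class StoredFieldsOnlyBench {
  public static void main(String[] args) throws Exception {
    long start = System.nanoTime();
    try (FSDirectory dir = FSDirectory.open(Paths.get("/tmp/storedFieldsBench"));
         IndexWriter iw = new IndexWriter(dir, new IndexWriterConfig());
         BufferedReader reader =
             Files.newBufferedReader(Paths.get("allCountries.txt"), StandardCharsets.UTF_8)) {
      String line;
      while ((line = reader.readLine()) != null) {
        Document doc = new Document();
        // stored field only: nothing else enabled that could drown out stored fields cost
        doc.add(new StoredField("line", line));
        iw.addDocument(doc);
        iw.flush(); // one tiny segment per document, so stored fields merging dominates
      }
      iw.forceMerge(1); // explicitly exercise the merge path, including bulk merge
    }
    System.out.println("took " + (System.nanoTime() - start) / 1_000_000 + " ms");
  }
}
```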
I think as a start we should simply run @rmuir's cool pure stored fields benchmark in the nightly benchmarks? We could run the benchmark (about how long does it take?) and record the resulting index size. It's a good start. But it is indexing only?
Maybe instead of flushing after each doc we could let IW do its thing, and then we could (fairly) report indexing throughput too, so we could see the impact of changes that improve indexing performance for stored fields.
Also I assume you are granting ASL2 for it :) Thank you!
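For the "record the resulting index size" part, a small helper along these lines would do (illustrative, not part of the actual benchmark):

```java
import java.io.IOException;
import org.apache.lucene.store.Directory;

class IndexSize {
  // Report the resulting index size by summing the lengths of all files
  // in the index directory.
  static long indexSizeBytes(Directory dir) throws IOException {
    long total = 0;
    for (String file : dir.listAll()) {
      total += dir.fileLength(file);
    }
    return total;
  }
}
```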
> we could let IW do its thing
Unfortunately this would most likely create a single segment, since stored fields don't consume per-document heap in IndexingChain, so it would fail to catch any merge-time regressions like the one that Robert created this benchmark for.
It is true that the benchmark is indexing only, but that's also where the code complexity is: especially merging! And yeah, in such benchmarks if we want merging to happen, we kinda have to force it to happen explicitly.
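If we did want to let IW batch documents while still guaranteeing multiple segments, one option (an illustrative sketch, not something the benchmark currently does) is to disable the RAM-based flush trigger and cap the buffered doc count instead:

```java
import org.apache.lucene.index.IndexWriterConfig;

class ThroughputFriendlyConfig {
  // Stored-fields-only docs consume almost no per-document indexing heap, so a
  // RAM-based flush policy may never trigger; a doc-count cap still forces
  // periodic flushes (and thus merges) while keeping throughput numbers fair.
  static IndexWriterConfig create() {
    IndexWriterConfig iwc = new IndexWriterConfig();
    iwc.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); // no RAM-based flushing
    iwc.setMaxBufferedDocs(10_000); // flush every 10k docs; the value is illustrative
    return iwc;
  }
}
```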
There are also other variants of the benchmark we could use in the future:
But +1 to start with something.
Thank you for all the progress here @jpountz and @rmuir! I'll take a stab at a Python wrapper to get these results into each nightly run.
It's been added to the nightly benchmarks, so it looks like we are all set?
Currently nothing really prevents regressions in the stored fields.
See discussion: https://issues.apache.org/jira/browse/LUCENE-10170
Here's my geonames-based benchmark, which I wrote the last time I had to debug stored fields. Maybe we can get it into a PR and hook it into a chart in the nightly bench somehow.