mikemccand / luceneutil

Various utility scripts for running Lucene performance tests
Apache License 2.0
203 stars 113 forks source link

Is it right that SORTED_SET faceting caches OrdinalMaps across invocations? #169

Closed jpountz closed 2 years ago

jpountz commented 2 years ago

I was looking at how the faceting module deals with OrdinalMap, and it looks like the OrdinalMap gets cached across faceting requests. This might not be representative of what happens on production systems where NRT reopens frequently invalidate the global OrdinalMap, so it has to be re-computed as part of the faceting request.

rmuir commented 2 years ago

why would this be done as part of the faceting request? why wouldn't you use a warmer, e.g. it would just delay the reopen time, but wouldnt block searches.

rmuir commented 2 years ago

(similar for building the huge arrays or whatever taxo facets does)

gsmiller commented 2 years ago

+1 to incorporating ordinal map refreshes as part of benchmarking. It makes sense to me to exercise reopens as part of the faceting work since taxonomy and SSDV faceting function very differently in this regard (taxonomy index and child/sibling/parent in-memory lookups vs. SSDV ordinal mapping).

why would this be done as part of the faceting request? why wouldn't you use a warmer, e.g. it would just delay the reopen time, but wouldnt block searches.

I think the issue here is that ordinal maps are lazily rebuilt (and then cached) as needed on a per-field basis. To do this as part of reopening would require eagerly building these maps for all SSDV faceting fields. But maybe that makes sense?

rmuir commented 2 years ago

Well, there's a real problem if we force some random poor search user to experience the whole ordinalmap hit, just because they were unlucky. It might even cause timeouts (e.g. HTTP 5xx) in some deployments.

We should make it easy to warm this stuff up. I get that solr and elastic servers already have a generic "warmup queries" that can solve the problem, but we could be more efficient with things like OrdinalMap. even if it doesn't require user configuration but just looks at observed faceting queries or something.

I think the ideal solution to this benchmarking issue would be to actually have a dedicated OrdinalMap-building benchmark? Perhaps we could make it super-stable just like the stored fields benchmark.

I see it really as a "tax" on your reopen time, e.g. it is more applicable to the NRT reopen benchmark than any search benchmark. it is useless noise there.

jpountz commented 2 years ago

why wouldn't you use a warmer, e.g. it would just delay the reopen time, but wouldnt block searches.

Agreed. The thing that confused me is that the API encourages OrdinalMap building to be part of the request for sorted-set doc-values faceting, which is what luceneserver seems to do (https://github.com/mikemccand/luceneserver/blob/84dd7b34b6e28367f95a31167e70f784af4618ce/src/java/org/apache/lucene/server/ShardState.java#L954-L997).

It would make more sense to me to manage this as part of the same ReferenceManager that handles the IndexReader, similarly to what the facet module is already doing in SearcherTaxonomyManager?

I think the ideal solution to this benchmarking issue would be to actually have a dedicated OrdinalMap-building benchmark?

+1

I get that solr and elastic servers already have a generic "warmup queries" that can solve the problem, but we could be more efficient with things like OrdinalMap

FWIW Elasticsearch has an option to warm up ordinal maps explicitly via the eager_global_ordinals option on mappings.

rmuir commented 2 years ago

the problem with these approaches, is it requires the luser to accurately predict which fields they will facet on.

So I'm still in favor of considering approaches that don't depend on idiot users, e.g. tracking which fields get used by facets and then warming the most recently used.

gsmiller commented 2 years ago

So I'm still in favor of considering approaches that don't depend on idiot users, e.g. tracking which fields get used by facets and then warming the most recently used.

+1, makes sense to me. I like the idea a lot of trying to build these maps when reopening the index. Looks like a good opportunity to make things smarter for users. I created a LUCENE-10565 to track the idea.

In the meantime, a dedicated benchmark for ordinal map building seems useful, but I also wonder if we shouldn't include it as part of the faceting benchmarks as well? As the implementation currently exists, it seems like we're sort of giving SSDV faceting an unfair advantage in the benchmarks when compared to taxonomy faceting doesn't it? It would be nice to create a "level playing field" between SSDV and taxo faceting for the purpose of benchmarking.

rmuir commented 2 years ago

taxonomy index builds huge int[] arrays, no?

rmuir commented 2 years ago

and sorry, for both cases (SSDV and taxonomy), we should be reducing noise. so let's not put stupid shit in the benchmark codepath unnecessarily, just to try to be "fair". Again, there's zero reason any search should ever be building an ordinalmap.

gsmiller commented 2 years ago

taxonomy index builds huge int[] arrays, no?

It does, and it can happen on the search path when faceting as far as I'm aware. So that also looks like something we could probably improve upon.

and sorry, for both cases (SSDV and taxonomy), we should be reducing noise. so let's not put stupid shit in the benchmark codepath unnecessarily, just to try to be "fair". Again, there's zero reason any search should ever be building an ordinalmap.

I think I hear what you're saying, but current users would be building ordinal maps on the search path when faceting, no? If a user of SSDV faceting reopens their index, they'll need to initialize a new SortedSetDocValuesReaderState, which means the next time they do SSDV faceting with that state, they'll get hit with an ordinal map building penalty. While this is far from ideal (as you've pointed out, thanks!), it's the reality today right?

rmuir commented 2 years ago

If its the case that there is no way for a user to do this "correctly", then it should be fixed. But that doesn't mean we have to copy that bug into our benchmark, too.

rmuir commented 2 years ago

Another way to put it, benchmarks should be reasonable. Rebuilding ordinal maps for every query is crazy, it doesn't reflect reality. nobody is doing this.

we could also write stuff to /proc/sys/vm/drop_caches, totally clearing system page cache, declaring that it isn't right that the operating system caches the index files across invocations of queries. But this is crazy to do and unreasonable.

jpountz commented 2 years ago

It looks like we do want to cache ordinal maps, however we also want to be able to warm up when the index is refreshed (LUCENE-10565) and introduce a benchmark for OrdinalMap construction (https://github.com/mikemccand/luceneutil/pull/170).

mikemccand commented 2 years ago

Agreed. The thing that confused me is that the API encourages OrdinalMap building to be part of the request for sorted-set doc-values faceting, which is what luceneserver seems to do (https://github.com/mikemccand/luceneserver/blob/84dd7b34b6e28367f95a31167e70f784af4618ce/src/java/org/apache/lucene/server/ShardState.java#L954-L997).

It would make more sense to me to manage this as part of the same ReferenceManager that handles the IndexReader, similarly to what the facet module is already doing in SearcherTaxonomyManager?

Yeah, that "lazy init" in luceneserver is terrible! Who on earth wrote that :) (answer: me, sigh)

The OrdinalMap really should be built during refresh, not in the query path. Apps should refresh less frequently if doing that is somehow problematic for their usage ... this is a good tradeoff when one refresh will use the OrdinalMap many times (for many queries), which is the normal case!

It looks like we do want to cache ordinal maps, however we also want to be able to warm up when the index is refreshed (LUCENE-10565) and introduce a benchmark for OrdinalMap construction (#170).

+1, thanks for the spinoff issue and benchmark improvement!

gsmiller commented 2 years ago

Another way to put it, benchmarks should be reasonable. Rebuilding ordinal maps for every query is crazy, it doesn't reflect reality. nobody is doing this.

Just to clarify, because I may not have explained my suggestion well enough, I certainly don't think it makes sense to refresh the ordinal map on every faceting request in the benchmarks. That's crazy indeed and doesn't at all represent what users would do! What I meant to suggest is doing a small number of index reopens during the faceting benchmark (maybe two or three) to simulate what users might run into in the wild when periodically hitting reopens. This would "bake in" the currently (not ideal) penalties of dumping the ordinal maps and having to lazily re-init them after reopens. Because we never reopen the index while running the benchmarks, the SSDV benchmark tasks probably appear a bit overly "rosy" relative to what a user might actually experience.

rmuir commented 2 years ago

i still don't think this should ever hit an end-user. it certainly wasn't designed for that. Hopefully today there is an easy way to either:

I'm glad we have an issue for the first one. But maybe the second approach is already mitigating the issue. I know generic "warming queries" are exposed as a feature by servers for stuff like this already.

rmuir commented 2 years ago

I'm also not a fan of introducing reopens into the current benchmarks as it mixes concerns. IMO all the benchmarks are a bit too noisy already. they should have less noise so that we can use them to make things better.

It is perfectly fine to add this kind of stuff to reopen benchmark or, even better, dedicated OrdinalMap benchmark. It is also ok to do the same thing with taxonomy facet's huge arrays somewhere. But let's not mix this in with trying to debug faceting performance, as it should not happen there. If it does today, then thats an obvious bug what should be fixed.

mikemccand commented 2 years ago

Another way to put it, benchmarks should be reasonable. Rebuilding ordinal maps for every query is crazy, it doesn't reflect reality. nobody is doing this.

Just to clarify, because I may not have explained my suggestion well enough, I certainly don't think it makes sense to refresh the ordinal map on every faceting request in the benchmarks. That's crazy indeed and doesn't at all represent what users would do! What I meant to suggest is doing a small number of index reopens during the faceting benchmark (maybe two or three) to simulate what users might run into in the wild when periodically hitting reopens. This would "bake in" the currently (not ideal) penalties of dumping the ordinal maps and having to lazily re-init them after reopens. Because we never reopen the index while running the benchmarks, the SSDV benchmark tasks probably appear a bit overly "rosy" relative to what a user might actually experience.

Yeah +1, this would make for a more realistic benchmark. Never refreshing (as the benchmarks now do) is not really common. Maybe open a spinoff luceneutil issue for this?

There is a dedicated "near real time refresh" benchmark, which runs in the nightly benchmarks, and reports refresh latency over time. But it does not recompute any OrdinalMaps today, I think? It does do some concurrent search.

mikemccand commented 2 years ago

I'm also not a fan of introducing reopens into the current benchmarks as it mixes concerns. IMO all the benchmarks are a bit too noisy already. they should have less noise so that we can use them to make things better.

Yeah OK that makes sense.

It is perfectly fine to add this kind of stuff to reopen benchmark or, even better, dedicated OrdinalMap benchmark.

@jpountz already made a nice dedicated OrdinalMap benchmark ... I'll try to expose this in the nightly runs/charts.

Maybe we can add some SSDV faceting to the existing NRT latency benchmark, but make sure it does warming during refresh, not in the query path?

It is also ok to do the same thing with taxonomy facet's huge arrays somewhere. But let's not mix this in with trying to debug faceting performance, as it should not happen there. If it does today, then thats an obvious bug what should be fixed.

+1

gsmiller commented 2 years ago

I'm also not a fan of introducing reopens into the current benchmarks as it mixes concerns. IMO all the benchmarks are a bit too noisy already. they should have less noise so that we can use them to make things better.

That's a good point. I could see how this would add noise when trying to optimize the actual guts of the faceting implementation and measure change. I'm convinced. Thanks!