opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

Merging Query and Fetch phase Intelligently to reduce the overall query latency for Query_And_Fetch SearchType #9418

Open navneet1v opened 1 year ago

navneet1v commented 1 year ago

Is your feature request related to a problem? Please describe. When a request hits the OpenSearch Search API (with the Query_And_Fetch search type), it lands on the coordinator node, which identifies the shards for the index and routes the request to them (primary or replica, based on load and other factors). Those shards return doc IDs to the coordinator node (the Query phase), and then the Fetch phase runs to retrieve the remaining details (such as _id, _source, etc.). This Fetch phase adds another hop and extra latency to the whole query.

So the feature I am proposing here is to short-circuit this extra hop under certain conditions (no pagination in the query, no aggregations, a cheap query such as one with _source set to false, etc.).

Describe the solution you'd like Based on the query, OpenSearch should add smart logic to decide whether it needs separate calls for the query and fetch phases, or whether both can be done in one hop.

We already merge the query and fetch phases, but only when the index has a single shard. (Code ref: https://github.com/opensearch-project/OpenSearch/blob/2065cdb3be1a8676a4d0c846557584c96f7c0892/server/src/main/java/org/opensearch/search/SearchService.java#L602-L604)

which ultimately ensures that we hit this condition on the coordinator node:

https://github.com/opensearch-project/OpenSearch/blob/56a24bb8534acd4ed200ba57f232cb0aae7e959b/server/src/main/java/org/opensearch/action/search/FetchSearchPhase.java#L138-L152

Describe alternatives you've considered There is no alternative I can think of.

Additional context

  1. This will be very useful for vector search cases, where most of the time customers don't require _source and are not performing aggregations either.
vamshin commented 1 year ago

This will be a great feature for use cases that retrieve fewer documents, for example vector search use cases where the customer is interested in only a few neighbors/documents. Merging query and fetch would help save round-trip time.

navneet1v commented 1 year ago

Adding a few more details around possible solutions. This optimization is only for the Query_And_Fetch search type:

Option 1 (User provides an explicit parameter to optimize the query and fetch phases)

User Setting Flag in Query

Pass a URL parameter on the search request, like optimize_search_and_fetch_phase=true, which will ensure that we always optimize the query and fetch phases.

Exposing Optimization Flag via Index Settings

Users can set an index setting like index.optimizeQueryAndFetchPhase so they don't need to pass this flag with every query.
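
The two sub-options above could compose, with the per-request parameter overriding the index-level default. A minimal sketch of that resolution order, assuming the names `optimize_search_and_fetch_phase` and `index.optimizeQueryAndFetchPhase` proposed in this issue (neither exists in OpenSearch today; plain maps stand in for the real request/settings objects):

```java
import java.util.Map;
import java.util.Optional;

public class MergePhasesFlag {
    /**
     * Resolves whether to merge the query and fetch phases: a per-request
     * URL parameter, when present, overrides the index-level setting;
     * otherwise the index setting (default false) applies.
     */
    public static boolean shouldMergePhases(Map<String, String> requestParams,
                                            Map<String, String> indexSettings) {
        Optional<String> perRequest =
            Optional.ofNullable(requestParams.get("optimize_search_and_fetch_phase"));
        if (perRequest.isPresent()) {
            return Boolean.parseBoolean(perRequest.get());
        }
        return Boolean.parseBoolean(
            indexSettings.getOrDefault("index.optimizeQueryAndFetchPhase", "false"));
    }
}
```

This mirrors how other OpenSearch toggles (e.g. request_cache) let a request-level signal override an index default.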

Option 2 (Adding Intelligence in OpenSearch Core)

As described above, during query execution we will look at these things:

  1. Is an aggregation present in the query?
  2. Is _source required in the query?

There can be many more cases, but these are the only two I can think of.

If these conditions are satisfied, we will go ahead and merge the query and fetch phases.
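
A minimal sketch of that Option 2 decision, assuming the coordinator can inspect the parsed request for aggregations, _source, and (per the issue description) a pagination offset. The class and field names are illustrative stand-ins, not OpenSearch APIs:

```java
public class MergePhaseHeuristic {
    /** Simplified stand-in for the parsed search request. */
    public static final class ParsedSearchRequest {
        final boolean hasAggregations;
        final boolean sourceRequested;
        final int from; // pagination offset; 0 means the first page

        public ParsedSearchRequest(boolean hasAggregations, boolean sourceRequested, int from) {
            this.hasAggregations = hasAggregations;
            this.sourceRequested = sourceRequested;
            this.from = from;
        }
    }

    /**
     * Merge query and fetch only for "cheap" queries: no aggregations,
     * _source not requested, and no pagination offset.
     */
    public static boolean canMergeQueryAndFetch(ParsedSearchRequest request) {
        return !request.hasAggregations
            && !request.sourceRequested
            && request.from == 0;
    }
}
```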

I will keep refining the proposal with more details and edge cases, but at a high level this is what I am thinking.

jmazanec15 commented 1 year ago

For option 1, what about adding a new search type: query_and_fetch? I think this is more consistent with the current interface than adding a new query flag (ref: https://opensearch.org/docs/latest/api-reference/search/#url-parameters)

For option 2, this seems like it would be difficult to implement without causing regressions. For instance, in cases where the field being retrieved is fairly large and there are a lot of shards, this would result in network and memory bottlenecks.

msfroh commented 1 year ago

I really like this, especially option 1 (and I like @jmazanec15's suggestion of adding a new query type).

I think we could implement option 2, but we would need to be quite conservative, I think. In particular, we would probably also need to set a cap on the number of shards involved. A wacky option (not sure if it's doable) could involve sending a hint during the query phase that essentially says "Hey shard, do the fetch too, but only if the returned payload is 'small' for some definition of small".

navneet1v commented 1 year ago

@jmazanec15 @msfroh, I am not sure about having a new search type; I don't know the tenets around creating one. We already have a search type called QUERY_AND_FETCH (which is the default), and this is an optimization for that same search type. Also, if you look at the DFS_QUERY_FETCH search type, the same optimization can be applied there too. Hence I somewhat disagree with the new search type approach.

Feel free to provide reasons why this should be a SearchType.

I think we could implement option 2, but we would need to be quite conservative, I think. In particular, we would probably also need to set a cap on the number of shards involved. A wacky option (not sure if it's doable) could involve sending a hint during the query phase that essentially says "Hey shard, do the fetch too, but only if the returned payload is 'small' for some definition of small".

Yes, we have to be very conservative. The shard count is a good callout, but I was hoping to get away from magic numbers like shard count, because we would need to find a good default that works well in most cases and then provide configuration for it, which in my mind is not an ideal user experience. I was thinking of more automated approaches, like predicting the memory for the search request and throttling if it could cause an out-of-heap exception. How does that solution sound?

gashutos commented 1 year ago

@navneet1v While benchmarking search to improve sort queries, I was seeing 5-10 ms of overhead in the fetch phase for 10,000 documents. This was for the http_logs workload. Is this 5-10 ms reduction in the fetch phase worth the trade-off of keeping additional results in memory from n shards?

Some benchmarking would help here to decide trade off.

navneet1v commented 1 year ago

@gashutos As per my understanding of OpenSearch, what this feature optimizes is:

  1. The network latency incurred by the additional hop to run the fetch phase. For the specific example you provided, I would not use this optimization because it has the potential to cause an out-of-heap exception. But again, we need to try this to understand more.

Some benchmarking would help here to decide trade off.

Yes, benchmarking needs to be done to quantify the exact benefit; this is why I created this as a feature request rather than an RFC with all the details. I was trying to gather feedback on whether this is even a good idea and, if so, what possible pitfalls/cases need to be looked at while developing the feature.

vamshin commented 1 year ago

I feel we would open ourselves up to regressions with Option 2, as called out by @gashutos, if we make this change in the default logic. Option 1 seems safer: we introduce a setting/flag/new phase that the customer explicitly turns on after evaluating their use case, similar to DFS_QUERY_FETCH.

navneet1v commented 1 year ago

I feel we would open ourselves up to regressions with Option 2, as called out by @gashutos, if we make this change in the default logic. Option 1 seems safer: we introduce a setting/flag/new phase that the customer explicitly turns on after evaluating their use case, similar to DFS_QUERY_FETCH.

Yes, that's a fair point and I agree. We can start with a query param, but in the long term we should try to optimize the search as much as possible. I added Option 2 with different conditions so that if there is some possibility of avoiding the query param, we can take advantage of it. But the more I think about this, the harder it seems to come up with such sophisticated logic.

In the long term we should try to move towards better logic rather than relying on query params, as that will be a more natural addition to the search path.

jmazanec15 commented 1 year ago

Feel free to provide reason why this should be a SearchType.

@navneet1v There are 2 existing search types:

  1. query_then_fetch
  2. dfs_query_then_fetch

So, I think it would be consistent to have:

  1. query_and_fetch
  2. dfs_query_and_fetch

In this case, I think "and" when compared to the existing type of "then" is a good description of what is happening.

In the long term we should try to move towards better logic rather than relying on query params, as that will be a more natural addition to the search path.

I agree with this. I think it's something that we could evolve to over time.

"Hey shard, do the fetch too, but only if the returned payload is 'small' for some definition of small".

@msfroh I like the idea of falling back to query_then_fetch based on some kind of criteria (such as memory). We could also try to minimize overhead by only fetching the top results' srcs from each shard. For instance, given a search of size 100 over 100 shards, have each shard only fetch for the top 10 docs (100 srcs vs 1000). Then, if it turns out the final results' srcs aren't all there, fall back to fetch.
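
The fallback check described above could be sketched on the coordinator side as follows. This is an illustrative sketch, not OpenSearch code: after merging shard results into the final top-N, the coordinator fetches only the docs whose sources were not prefetched during the query phase, and skips the fetch phase entirely if nothing is missing:

```java
import java.util.ArrayList;
import java.util.List;

public class PartialPrefetch {
    /**
     * A merged hit as seen by the coordinator: global doc id plus an
     * optional source prefetched during the query phase (null if the
     * shard did not prefetch it).
     */
    public record Hit(String docId, String prefetchedSource) {}

    /**
     * Returns the doc ids in the final top-N that still need a fetch
     * phase. An empty result means the query phase already returned
     * everything and the extra hop can be skipped.
     */
    public static List<String> missingSources(List<Hit> finalTopDocs) {
        List<String> missing = new ArrayList<>();
        for (Hit hit : finalTopDocs) {
            if (hit.prefetchedSource() == null) {
                missing.add(hit.docId());
            }
        }
        return missing;
    }
}
```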

navneet1v commented 1 year ago

In this case, I think "and" when compared to the existing type of "then" is a good description of what is happening.

Yeah, I am not sure if that is a good thing to do, but it is an option we should consider. I would like the maintainers of OpenSearch core to provide more guidance on how they think the search types should evolve and how this feature fits in.

So we have 3 options now:

Short Term:

  1. Creating a new query param
  2. Creating a new search type called query_and_fetch

Long Term:

  1. Moving towards better logic that naturally adds this feature to the search path.

cc: @nknize , @andrross , @dblock

dblock commented 1 year ago

This feels like a subset of "query planning": in some particular case, instead of performing operations in order X we perform them in order Y. So a query string parameter feels like the most awkward and short-term option to me, one we'd have to maintain forever as a new API. If the caller knows that query_and_fetch can be applied, it feels like OpenSearch itself should know too, even if it's not easy. The figuring-out part itself should be behind a feature flag, but I think it would be best to build the decision into the engine so it can plan the query (skip fetch).

navneet1v commented 1 year ago

@dblock I totally agree that this should be part of query planning. The query param option was suggested so that we can start seeing the benefit before we add this to the core query engine.

A feature flag is another way to ship the feature. @dblock, if I understand correctly, is your suggestion to build this in the query engine behind a feature flag and keep evolving the logic until we are confident enough to release it as GA?

kkmr commented 1 year ago

A bit late to the party. I like Option 1 (setting a config with a default that customers can override). Not a fan of adding a new query type. We could eventually build the smarts to infer this from customer data.

navneet1v commented 1 year ago

A bit late to the party. I like Option 1 (setting a config with a default that customers can override). Not a fan of adding a new query type. We could eventually build the smarts to infer this from customer data.

Thanks for providing the input here @kkmr. I am currently in the POC-building phase, figuring out the changes required for this feature. The condition for merging the phases is still open. Once the POC is done, I will write a formal RFC incorporating all the suggestions.

andrross commented 1 year ago

I agree with @dblock's comments. The basic question I would ask is: Does the user have any information that the system does not that allows them to make a better decision about whether to combine phases? I think the answer is "no", which means that OpenSearch should be automating that decision itself. Any solution that requires the user to opt in (whether a query parameter, search type, or index setting) feels like a short term work-around to the fact that we haven't implemented the (albeit very difficult and complicated) logic to make the decision automatically without introducing regressions. A feature flag is a good option for such a setting that is intended to be temporary but allows users to opt in with knowledge that there may be rough edges. There may be other ways to do it too, but the key point is that we probably don't want to define a new long-term API for this.

navneet1v commented 1 year ago

@andrross Thanks for providing the input. A few clarifications:

we probably don't want to define a new long-term API for this.

From the start, no net-new APIs will be defined. I am merely piggybacking on logic that already exists in OpenSearch for these queries; I just want to improve it.

Does the user have any information that the system does not that allows them to make a better decision about whether to combine phases? I think the answer is "no", which means that OpenSearch should be automating that decision itself.

This is really a good point and a good dimension that you brought in.

A feature flag is a good option for such a setting that is intended to be temporary but allows users to opt in with knowledge that there may be rough edges.

A feature flag is a great option. But one thing I am a bit confused about: is your recommendation to make the system intelligent from day 1 and put that intelligence behind the feature flag?

Any solution that requires the user to opt in (whether a query parameter, search type, or index setting) feels like a short term work-around

I think of these as iterative steps to improve the feature, not short-term workarounds. Having said that, even these iterative items will be behind the feature flag.

Also, having query params for this feature is not a new thing; in the OpenSearch _search API there are various params/settings that can be used, for example query_cache, request_cache, etc. For all of them we have intelligent logic in OpenSearch alongside user-provided signals in the form of query params. Hence I don't see providing query params/index settings as a hack.

msfroh commented 1 year ago

If we want to make this kind of decision intelligent, I'm trying to think about what the coordinator knows versus what the shard knows. The coordinator knows how many shards there are in total (which determines the total "budget" for query phase responses). The shard knows how many hits it collected during the query phase for the given query.

I think it makes sense to set some cap on query phase response size, like 1MB or something, and divide it by the number of shards hit by the query. If your query fans out across 10 shards, each shard gets 100kB of "fetch budget" for the query phase. Before returning query results, each shard should go down through their collected TopDocs in order, fetching until their budget is hit (since the earlier top docs are more likely to be part of the final response anyway). All shards return up to 100kB of doc data as part of the query phase. The coordinator collates TopDocs to get the "true" top N. If those were already fetched as part of the query phase, we're done -- return the SearchResponse. Otherwise, only fetch the additional missing docs.

That way, we "always" do some useful fetching during the query phase.

As a bonus, you almost wouldn't need a feature flag -- the initial default value for the fetch budget could be zero. If someone wants to try the feature they could allocate some fetch budget on a per-request basis or in a cluster setting.
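
The fetch-budget scheme described above can be sketched roughly as follows. This is an illustrative sketch under the stated assumptions (even split of a total budget across shards, sizes known up front); the types are simplified stand-ins, not OpenSearch classes:

```java
import java.util.ArrayList;
import java.util.List;

public class FetchBudget {
    /**
     * One query-phase hit on a shard: doc id plus the size in bytes its
     * fetched source would add to the query-phase response.
     */
    public record ShardHit(String docId, long sourceSizeBytes) {}

    /** Coordinator side: split the total budget evenly across the shards hit. */
    public static long perShardBudget(long totalBudgetBytes, int numShards) {
        return totalBudgetBytes / numShards;
    }

    /**
     * Shard side: walk the collected TopDocs in rank order (earlier docs
     * are more likely to be in the final response), prefetching until the
     * budget is exhausted; the rest are left for a possible fetch phase.
     */
    public static List<String> docsToPrefetch(List<ShardHit> topDocsInRankOrder, long budgetBytes) {
        List<String> prefetch = new ArrayList<>();
        long spent = 0;
        for (ShardHit hit : topDocsInRankOrder) {
            if (spent + hit.sourceSizeBytes() > budgetBytes) {
                break; // stop at the first doc that would exceed the budget
            }
            spent += hit.sourceSizeBytes();
            prefetch.add(hit.docId());
        }
        return prefetch;
    }
}
```

With a 1 MB total budget and 10 shards, each shard gets roughly 100 kB to spend on prefetched sources, matching the example above; a default budget of zero would make the prefetch loop a no-op, which is the "almost no feature flag needed" property.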

andrross commented 1 year ago

From start no net new apis will be defined

I'm using "API" quite loosely here. Any endpoint, parameter, or configuration setting is logically a part of the API and must be supported as such.

"fetch budget"

@msfroh This idea makes a lot of sense to me! Even if we had a binary "skip fetch phase" option, we'd still have to limit the query phase response size. And if you hit that limit you'd either have to fail the query or fallback to doing fetch phase. Assuming fallback to doing the fetch phase is the better experience, and assuming you'd need to make the limit configurable somehow, then I've just talked myself into implementing exactly what you described!

navneet1v commented 1 year ago

@andrross The reason for suggesting the query param approach as an experimental flag is to iterate faster on the solution and get some improved latency numbers.

I did a POC implementation here: https://github.com/navneet1v/OpenSearch/commit/a36a2940b44fb1532bd5985abb29aba558bdcb0b with the query param approach and validated locally. As a next step I was thinking to run some benchmarks.

I want to know what benchmarks I should run here, is there any standard benchmarks we prefer with multi-node cluster and multiple shards? Or should I use micro-benchmarks?

On the other hand, regarding the long-term way to implement this query and fetch optimization: we have many great ideas, and I am not sure which will be best. One thing I want to understand from the group is what the next steps should be; one obvious step is to benchmark the POC code to get initial numbers.

@msfroh, @dblock, @andrross, any suggestions on how I should proceed further?

msfroh commented 1 year ago

I want to know what benchmarks I should run here, is there any standard benchmarks we prefer with multi-node cluster and multiple shards? Or should I use micro-benchmarks?

AFAIK, the nightly OSB jobs use multiple nodes. We might be able to test with e.g. nyc_taxis. I'm not sure how to get up and running with that, though. I did once publish a build as a release on my GitHub fork and asked @rishabh6788 for help to kick off a benchmark run.

On the other hand, it feels like a microbenchmark would give us some decent preliminary numbers to show the difference between a single phase query+fetch versus the two-phase baseline. I think a setup with a single coordinator node and a single data node would be enough.

rishabh6788 commented 1 year ago

The nightly benchmarks will pick up the change once it is merged on the main/2.x and built successfully as part of our nightly distribution build job.

andrross commented 1 year ago

@rishabh6788 Do we have any mechanism for running the benchmarks on proof-of-concept code on a feature branch or user fork? It's obviously possible to set up the test infrastructure manually and run tests, but is there any way to hook into the existing infrastructure to do this?

rishabh6788 commented 1 year ago

@andrross The existing workflow is tied to the manifest that is generated by each distribution build job building artifacts for the main and 2.x branches. For now, if you want to test your local changes, it has to be manual runs using opensearch-cluster-cdk and opensearch-benchmark.

I will take this as a feature request to enhance our current benchmark workflow to allow running ad-hoc benchmarks without requiring a manifest file, accepting an artifact tarball location directly. I will update once it is ready.

andrross commented 1 year ago

I will take this as a feature request to enhance our current benchmark workflow to allow running ad-hoc benchmarks without requiring a manifest file, accepting an artifact tarball location directly.

That's awesome @rishabh6788, thank you! I think this will be super useful by making it easier to benchmark prototypes and wild ideas to get data as soon as possible.

reta commented 1 year ago

@msfroh This idea makes a lot of sense to me! Even if we had a binary "skip fetch phase" option, we'd still have to limit the query phase response size. And if you hit that limit you'd either have to fail the query or fallback to doing fetch phase

Certainly +1 to @msfroh's idea; the "fetch budget" is quite a clever technique for making the decision at the shard level.