elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Ignore skipped shards for query_and_fetch optimization #71990

Open jtibshirani opened 3 years ago

jtibshirani commented 3 years ago

Usually the search coordinator executes the query phase on the data nodes, then determines the top hits, then executes the fetch phase. We optimize this when a search request targets only a single shard: the data node performs the query and fetch phases together to avoid an extra round-trip.

Currently this optimization is only applied if the original search request targets one shard. We could also apply it when only one shard passes the can_match phase and all others were skipped.
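For illustration, here is a minimal sketch of the decision involved (the names are simplified stand-ins, not the actual Elasticsearch types): today the data node inlines the fetch phase only when the request's total shard count is one, while the proposal would base the decision on the number of shards left after can_match.

```java
// Minimal sketch of the decision this issue is about. These names are
// simplified stand-ins, not the real Elasticsearch types.
final class InlineFetchDecision {

    /** Today: the data node inlines fetch only for single-shard searches. */
    static boolean shouldInlineFetchToday(int numberOfShards) {
        // numberOfShards currently includes shards skipped by can_match
        return numberOfShards == 1;
    }

    /** Proposed: base the decision on shards that survived can_match. */
    static boolean shouldInlineFetchProposed(int numberOfShards, int skippedShards) {
        return numberOfShards - skippedShards == 1;
    }

    public static void main(String[] args) {
        // A search over 5 shards where can_match skipped 4 of them:
        System.out.println(shouldInlineFetchToday(5));       // false
        System.out.println(shouldInlineFetchProposed(5, 4)); // true
    }
}
```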

elasticmachine commented 3 years ago

Pinging @elastic/es-search (Team:Search)

ywelsch commented 3 years ago

I've had a look at what it would take to implement this and wanted to share my thoughts here.

The numberOfShards field on ShardSearchRequest (which includes skipped shards) is currently used to determine whether the fetch phase should be inlined. This logic is split between the sender (coordinator) and the receiver (data node) and relies on both nodes agreeing on the same rule, which makes it difficult to evolve. It becomes even more difficult when proxying requests (you would then have to account for old-to-new-to-old transitions, or new-to-old-to-new transitions).

To allow evolving the logic we could try to avoid coupling the shape of the request to the response format, and either:

- let the data node decide on its own whether to inline the fetch phase, returning a union-type response that may or may not contain an inlined fetch result, or
- have the coordinator explicitly encode in the request whether the data node should inline the fetch phase.

The challenge with the first variant is that the coordinator must be able to deserialize whatever it gets back as a response (a union type), and the same holds for every proxy node (in particular, the response might be passed through an older-version node). An implementation could have the querying node pass on whether there were skipped shards (and how many); if that information survives every hop on the path to the data node, the receiving node can inline the fetch. The coordinator then has to account for the possibility that it does not receive an inlined fetch result even when, from its own perspective, inlining should have been possible, and newer data nodes still need to send compatible responses to old nodes (converting the union type into the concrete type based on the numberOfShards specified in the request).
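To make the "union type" concrete, here is a hypothetical sketch (assuming Java 17 sealed types; none of these names exist in Elasticsearch) of a response that may or may not carry an inlined fetch result, which every coordinator and proxy on the path would have to be able to handle:

```java
// Hypothetical union-type response for variant 1. None of these types exist
// in Elasticsearch; they only illustrate the shape of the problem.
sealed interface ShardSearchResponse permits QueryOnlyResponse, QueryAndFetchResponse {}

/** Query phase only: the coordinator must follow up with a fetch round-trip. */
record QueryOnlyResponse(float[] topScores) implements ShardSearchResponse {}

/** Query and fetch inlined: no second round-trip needed. */
record QueryAndFetchResponse(float[] topScores, String[] sourceDocs) implements ShardSearchResponse {}

class Coordinator {
    // The coordinator cannot assume which variant comes back, even when it
    // believes inlining should have been possible (e.g. an older proxy node
    // may have dropped the skipped-shards information along the way).
    static void handle(ShardSearchResponse response) {
        if (response instanceof QueryAndFetchResponse r) {
            System.out.println("fetched " + r.sourceDocs().length + " docs inline");
        } else if (response instanceof QueryOnlyResponse r) {
            System.out.println("need a fetch round-trip for " + r.topScores().length + " hits");
        }
    }

    public static void main(String[] args) {
        handle(new QueryOnlyResponse(new float[] { 1.0f, 0.5f }));
        handle(new QueryAndFetchResponse(new float[] { 1.0f }, new String[] { "{}" }));
    }
}
```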

The challenge with the second variant is that the coordinator needs to be sure the queried node honours its request for the optimization. An implementation could achieve this by having a newer node, when sending a ShardSearchRequest to an older node and seeing potential for inlining the fetch phase (e.g. because the number of non-skipped shards is 1), encode numberOfShards as numberOfNonSkippedShards. The main drawback is that the old node will render numberOfNonSkippedShards as numberOfShards in some places (I could only find "elasticsearch.slowlog.total_shards" in the slow log on master, but there might be more uses in older versions). The advantage of this variant is that it does not require reading union-type response formats, and that it also enables the optimization when only the coordinator is on a newer version.
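A hedged sketch of variant 2's request-side encoding (the method and parameter names here are invented for illustration): when the coordinator sees exactly one non-skipped shard, it sends the non-skipped count in place of the total, so that the receiver's existing numberOfShards == 1 check triggers the inlined fetch.

```java
// Invented names sketching variant 2: the coordinator encodes the non-skipped
// shard count as numberOfShards so that an unmodified receiver's
// `numberOfShards == 1` check triggers the query_and_fetch optimization.
final class RequestShardCount {

    /**
     * Value to send as numberOfShards on the outgoing ShardSearchRequest.
     * The drawback noted above: an old receiver renders this value as the
     * total shard count in places like the slow log.
     */
    static int encodedNumberOfShards(int totalShards, int skippedShards) {
        int nonSkipped = totalShards - skippedShards;
        // Only rewrite when it actually enables the optimization.
        return nonSkipped == 1 ? nonSkipped : totalShards;
    }

    public static void main(String[] args) {
        System.out.println(encodedNumberOfShards(5, 4)); // 1 -> receiver inlines fetch
        System.out.println(encodedNumberOfShards(5, 2)); // 5 -> normal two-phase search
    }
}
```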

jtibshirani commented 3 years ago

I also like the second approach, since the behavior is consistent + simple: when a 'new' node is acting as coordinator, the search will always query_and_fetch if there is one non-skipped shard.

The one concern I see is that we also have a parameter ShardSearchRequest#shardRequestIndex, representing the shard's index in the initial shards iterator the coordinating node constructs (which includes skipped shards). Our slicing logic in SliceBuilder uses this together with ShardSearchRequest#numberOfShards to determine whether a slice belongs to a particular shard, as sketched below. If we change the meaning of numberOfShards, this slicing logic will be incorrect. This also made me notice that the current behavior is a bit surprising! In the presence of skipped shards, our slicing might be very imbalanced.
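To show the coupling, here is a simplified sketch of the kind of slice-to-shard assignment involved (the real SliceBuilder logic differs in its details; this only illustrates the dependence on shardRequestIndex and numberOfShards):

```java
// Simplified sketch of slice-to-shard assignment. The real SliceBuilder logic
// differs in details, but it likewise keys off shardRequestIndex and
// numberOfShards, which is the coupling at issue here.
final class SliceAssignment {

    /** Does slice `sliceId` (out of `numSlices`) apply to the given shard? */
    static boolean sliceMatchesShard(int sliceId, int numSlices,
                                     int shardRequestIndex, int numberOfShards) {
        if (numSlices <= numberOfShards) {
            // Fewer slices than shards: each slice owns a subset of the shards.
            return shardRequestIndex % numSlices == sliceId;
        }
        // More slices than shards: every shard participates and documents
        // within each shard are partitioned instead.
        return true;
    }

    public static void main(String[] args) {
        // shardRequestIndex still counts skipped shards, so redefining
        // numberOfShards as the non-skipped count can flip the branch taken
        // and change which shards a slice selects:
        System.out.println(sliceMatchesShard(0, 2, 3, 5)); // false (total count)
        System.out.println(sliceMatchesShard(0, 2, 3, 1)); // true  (non-skipped count)
    }
}
```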

ywelsch commented 3 years ago

Great observations!

I haven't seen slicing used in cases where you would expect shards to be skipped. Typically slicing is used when reindexing full indices (i.e. with no other filters), so while the current logic might lead to imbalanced slicing strategies, I'm not sure that ever comes into play in practice.

The BWC implications for the slicing logic look like a real dealbreaker for approach 2, unfortunately. I'll keep thinking about this.

elasticsearchmachine commented 3 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)