elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Percolator is much slower than in ES1, and pre-selecting does not work #114392

Open garipovazamat opened 1 month ago

garipovazamat commented 1 month ago

We have been trying to migrate from Elasticsearch version 1.7.6 to the latest version (8.15) in our company and discovered that the latest version has become much slower. To find the reason for this degradation, I conducted several experiments. During these experiments, I found that some claimed improvements likely do not work as expected.

Experiment details

I created the following index mapping:

{  
    "properties": {  
        "props": {  
            "properties": {  
                "entity_obj": {  
                    "properties": {  
                        "category": {"type": "keyword"},  
                        "id": {"type": "integer"},  
                        "priceTotal": {"type": "integer"},  
                        "totalArea": {"type": "double"}  
                    }  
                },  
                "price": {"type": "long"},  
                "room": {"type": "short", "store": true}  
            }  
        },  
        "query": {"type": "percolator"}  
    }  
}

I filled the index with 10,000 duplicated queries, containing only must, should, term, and range conditions:

{  
    "query": {  
        "bool": {  
            "must": [  
                {"term": {"props.entity_obj.category": "flat1"}},  # first, simple condition
                {  # second, more complicated condition
                    "bool": {  
                        "must": [  
                            {"bool": {  
                                "should": [  
                                    {"term": {"props.entity_obj.category": "flat2"}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {"term": {"props.entity_obj.category": "flat2"}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {"term": {"props.entity_obj.category": "flat2"}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {"term": {"props.entity_obj.category": "flat2"}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {"term": {"props.entity_obj.category": "flat2"}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {"term": {"props.entity_obj.category": "flat2"}},  # the more such conditions, the longer the percolation
                                    {"range": {"props.price": {"gte": 1000}}},  
                                ]  
                            }},  
                        ]  
                    }  
                }  
            ]  
        }  
    }  
}

You can see that there are two main conditions inside must: the first is simple, and the second is a bit more complex. Logically, there is no reason to check the second condition if the first one is false. However, my experiments showed that even when the first condition is false for a document, adding conditions inside should (the second condition) increases the percolation time. I therefore conclude that the improvements claimed in this article do not work: https://www.elastic.co/blog/elasticsearch-percolator-continues-to-evolve

> Also the percolator will no longer load the percolator queries as Lucene queries into memory, as they are instead read from disk. Pre-5.0, if you had thousands of percolator queries they'd take up megabytes of precious JVM heap space, putting pressure on JVM garbage collection and, if you weren't careful, leading to an infamous JVM out-of-memory error. Back then, loading the percolator queries into memory made sense because all the percolator queries were evaluated all the time, so we made executing each one as fast as possible. Now, with pre-selecting, only percolator queries that are likely to match are evaluated. We decided to trade speed for stability, removing the caching to free up memory. The speed loss is more than paid for by skipping most queries in most cases.

I ran the percolation with the following request:

{  
  "constant_score": {  
    "filter": {  
      "percolate": {  
        "field": "query",  
        "document": {  
          "props": {  
            "entity_obj": {  
              "category": ["flat2"],  
              "id": 1,  
              "priceTotal": 10001,  
              "totalArea": 100  
            },  
            "price": 10001,  
            "room": 1  
          }  
        }  
      }  
    }  
  }  
}

As a result, I got the following percolation time with one document: ~0.157 seconds. I conducted a similar experiment on Elasticsearch version 1.7.6 with identical data, and the result was ~0.008 seconds, which is ~20x faster.
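For reference, the timing loop can be sketched roughly like this (a minimal sketch, not the exact script: the index name, URL, and the `requests`-based sender are placeholders you would adjust for your cluster):

```python
import time
from statistics import mean

# Hypothetical index name and endpoint; adjust for your cluster.
PERCOLATE_URL = "http://localhost:9200/percolator_test/_search"

def build_percolate_body(document):
    """Wrap a document in the constant_score/percolate search body used above."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "percolate": {"field": "query", "document": document}
                }
            }
        }
    }

def time_calls(send, body, runs=9):
    """Call send(body) `runs` times and return per-run timings in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        send(body)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings

doc = {"props": {"entity_obj": {"category": ["flat2"], "id": 1,
                                "priceTotal": 10001, "totalArea": 100},
                 "price": 10001, "room": 1}}
body = build_percolate_body(doc)
# With `requests` installed and a running cluster:
#   send = lambda b: requests.post(PERCOLATE_URL, json=b, timeout=30)
#   print(mean(time_calls(send, body)))
```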

We also tried percolating with real production data. The only improvement we saw was when we added additional filters alongside the percolate query, using metadata that we extracted from the primary query. For example, we took the query mentioned above and added metadata (the meta_data.category field).

{  
  "query": {  
    "bool": {  
      "must": [  
        {"term": {"props.entity_obj.category": "flat1"}},  
        {  
          "bool": {  
            "must": [  
              {  
                "bool": {  
                  "should": [  
                    {"term": {"props.entity_obj.category": "flat2"}},  
                    {"range": {"props.price": {"gte": 1000}}}  
                  ]  
                }  
              }  
            ]  
          }  
        }  
      ]  
    }  
  },  
  "meta_data": {  
    "category": "flat1"  
  }  
}

Then I sent the following request:

{  
  "constant_score": {  
    "filter": {  
      "bool": {  
        "must": [  
          {  # additional filter
            "bool": {  
              "should": [  
                {"term": {"meta_data.category": "flat1"}},  
                {"bool": {"must_not": {"exists": {"field": "meta_data.category"}}}}  # condition for cases when the query has no filter by category
              ]  
            }  
          },  
          {"percolate": {"field": "query", "document": {# our document} }}  
        ]  
      }  
    }  
  }  
}

But this approach has a disadvantage: it becomes more difficult to percolate a large batch. If I need to percolate many documents, I have to separate them by the category field, resulting in smaller batches, which negates the benefit of percolating many documents in one query. I also tried named percolation (the name field in the percolate query) and built a query with several percolate queries inside (one per category), but this had no advantage over separate requests (the percolation time was the same). In general, extracting metadata and adding extra filters for it seems like unnecessary work, forcing us to maintain those filters. It seems the search engine should handle such optimizations itself; I suspect this is what the "pre-selecting" feature is for.
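The metadata extraction step we maintain can be sketched like this (a rough sketch with hypothetical helper names; it assumes the metadata may only be derived from term conditions under must/filter chains, since values inside should branches are not guaranteed to hold for every match):

```python
def extract_term_metadata(query, field):
    """Collect values of `term` conditions on `field` that every matching
    document must satisfy, i.e. only those found under must/filter chains.
    `should` branches are deliberately skipped."""
    found = []

    def walk(node):
        if not isinstance(node, dict):
            return
        term = node.get("term")
        if isinstance(term, dict) and field in term:
            found.append(term[field])
        bool_q = node.get("bool")
        if isinstance(bool_q, dict):
            for key in ("must", "filter"):
                clauses = bool_q.get(key, [])
                if isinstance(clauses, dict):
                    clauses = [clauses]
                for clause in clauses:
                    walk(clause)

    walk(query.get("query", query))
    return found

registration = {
    "query": {"bool": {"must": [
        {"term": {"props.entity_obj.category": "flat1"}},
        {"range": {"props.price": {"gte": 1000}}},
    ]}}
}
meta = extract_term_metadata(registration, "props.entity_obj.category")
# Store the extracted value alongside the percolator query for pre-filtering:
registration["meta_data"] = {"category": meta[0]} if meta else {}
```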

Python scripts for experiments (python 3.12): scripts

Conclusion

Currently, percolation with queries, even simple filters, performs significantly slower than in older versions of Elasticsearch. It seems likely that the latest version lacks the pre-selecting optimization, or it is not functioning correctly. Alternatively, I might have missed something, and it can be enabled. Either way, we cannot migrate to the current version of Elasticsearch due to this performance issue. I would appreciate any help you can provide to resolve this problem.

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)

john-wagster commented 3 weeks ago

@garipovazamat first off let me just say wow congrats on working through upgrading to 8.x. I can still remember when I used 1.7.6.

I spent a bit of time digging in here to try to give you a better understanding of some of the underlying mechanics and options. And I plan to follow up with potential improvements where it makes sense.

Most of this discussion for now comes back to query formation. Because of how we store and retrieve the queries for percolators, you do have to be a little more thoughtful, with the benefit that the ES cluster is a lot more stable. (In general you will likely notice stability improvements across the board migrating from 1.7 to 8.15; for instance, when experimenting for this issue on 1.7.6 I had segfaults multiple times, but on 8.x I haven't had any.)

In my own tests I'm not seeing, and don't expect to see, improvements over some of the metrics you reported, particularly in the case you provided on 8.x. It's easy to see from the article you mentioned how you might expect your query to be optimized, though. The percolator is not a query-optimization engine, so query optimization, at least in this case, falls to you right now. The gains come simply from using Lucene to optimize what's evaluated and short-circuit where (ideally) possible, but that takes some knowledge of how the evaluation occurs, and there are several limitations. The most intuitive picture I can give is that a simpler covering query is constructed and kept in memory for short-circuiting, with heuristics determining those covering queries. And while automatic query optimization might seem like the easy expected outcome of this issue, it could produce non-intuitive results and needs to be carefully considered. In 1.x, by contrast, the percolator queries were all stored entirely in memory and evaluated linearly, so as you add more queries the evaluation time goes up regardless of whether they could have been short-circuited (this may be counter to what you found, but I'll show you the data I collected and we can discuss).

There are likely improvements here we can explore, but I want to take that back for further discussion. Minimally, I think the documentation might benefit from more information about how these optimizations work; though it is referenced to some extent in query-dsl-percolate-query.html#how-it-works, I think it's pretty subtle, specifically the references to indexed term queries.

So then the problem is just how can we optimize these queries. There's a few options and let me show you some data around those for my own evaluations and then I'd be keen to see if any of these work for you.

Experiments

Setup

Exp 1: 1.7.6 - baseline

Exp 2: 8.15 - orig percolator query

Exp 3: 8.15 - orig percolator query

Exp 4: 8.15 - optimized percolator queries (must_not or filter or both!)

Exp 5: 8.15 - optimized percolator queries (break up queries)

Exp 4 query
{  
    "query": {  
        "bool": {  
           "filter": {
                "terms": {
                    "props.entity_obj.category": ["flat1", "flat3"]
                }
            },
            "should": [
                {"bool": {
                    "must": [
                        {"term": {"props.entity_obj.category": "flat1"}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}}
                    ]
                }},
                {"bool": {
                    "must": [
                        {"term": {"props.entity_obj.category": "flat3"}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}}
                    ]
                }},
                {"bool": {
                    "must": [
                        {"term": {"props.entity_obj.category": "flat1"}},
                        {"term": {"props.entity_obj.category": "flat2"}},                        
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}},
                        {"range": {"props.price": {"gte": 1000}}}
                    ]
                }}                
            ]
        }  
    }  
}
Exp 5, query 1:
{  
    "query": {  
        "bool": {
            "must": [
                {"term": {"props.entity_obj.category": "flat1"}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}}
            ]
        }
    }  
}
Exp 5, query 2:
{  
    "query": {  
        "bool": {
            "must": [
                {"term": {"props.entity_obj.category": "flat3"}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}},
                {"range": {"props.price": {"gte": 1000}}}
            ]
        }
    }  
}

I took away several conclusions from these experiments.

For my experiments, 1.7.6 was slower overall than 8.15, which I realize conflicts with what you've reported. It could be worth checking your cluster configurations for both to see if they line up appropriately. 8ms timings for 1.7.6 surprise me a bit, knowing how the code is implemented. In my experiments I saw some overhead no matter what document was passed for evaluation in 1.7.6, which lines up with how each percolator query has to be fully evaluated against a document passed to ES.

When using 8.15 (or any distributed database), I would suggest trying to pre-join your percolator queries prior to indexing them; it's essentially the same concept as for documents when inserting them into ES (or most distributed stores), in that we want to denormalize the inserted data to prescribe optimizations to ES.

There's also some pain here in intuiting how queries will perform, specifically with heavily nested boolean queries containing should clauses and range queries. The should clauses are preventing the optimizations you are expecting, so if we rewrite the queries to compensate you should see performance improvements. I took two approaches, both to see if that works for you and as a bootstrap for additional internal discussion about percolator queries.

Both Exp 4 and Exp 5 are examples of taking advantage of the term query optimization (introduced in 5.x) and, more importantly, avoiding some pitfalls of how should is handled under the hood.

Specifically, Exp 4 tries to keep all of the query conditions from your examples, after joining some conditions that don't make sense (like multiple categories). Because of the should condition, this by itself is not sufficient and requires a filter condition. Introducing a filter or a must_not at the top level will generally have the effect you are looking for without introducing additional metadata outside of the percolator query, particularly if those top-level filter conditions are term queries, which are extremely fast and always part of the query short-circuiting path.

Exp 5 takes this a step further: it indexes more than the original 10k percolator queries but simplifies each individual query, effectively removing the problematic should and, more importantly, bypassing the need for a filter clause. In my experiments this was overall the best route, but I didn't test with large numbers of percolator queries or real production data, so your mileage may vary versus the approach in Exp 4.
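The Exp 5 splitting idea can be sketched as follows (a hypothetical helper, not an ES API; it assumes a top-level bool whose only clauses are should branches, and emits one percolator registration per branch):

```python
import copy

def split_top_level_should(query):
    """If the query is a bool with only `should` branches, return one
    percolator registration per branch; otherwise return it unchanged.
    Each branch then starts with a term query that Lucene can use for
    short-circuiting, instead of a hard-to-pre-select should."""
    bool_q = query.get("query", {}).get("bool", {})
    should = bool_q.get("should")
    if not should or any(k in bool_q for k in ("must", "filter", "must_not")):
        return [query]
    return [{"query": copy.deepcopy(branch)} for branch in should]

combined = {"query": {"bool": {"should": [
    {"bool": {"must": [{"term": {"props.entity_obj.category": "flat1"}},
                       {"range": {"props.price": {"gte": 1000}}}]}},
    {"bool": {"must": [{"term": {"props.entity_obj.category": "flat3"}},
                       {"range": {"props.price": {"gte": 1000}}}]}},
]}}}
registrations = split_top_level_should(combined)
# Two separate percolator documents, each anchored by a single term query.
```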

Adding @martijnvg as well, as he might be able to add more color or correct me on anything here; I believe he did much of the work on percolators in 5.x and since.

garipovazamat commented 1 week ago

@john-wagster Thank you for your response and for taking the time to look at my experiments!

The example I provided above was just an attempt to determine whether there is a pre-selection optimization. Our production case is more complex, and I didn't want to overload the issue by attaching it right away. In our case, we cannot simply set one filter in the filter block. I have attached data closer to our real production data below, but I don't understand how the improvements you suggested can be applied in this case. I assumed the optimization is implemented similarly to what is described in this article, since Luwak is now part of Lucene and Elasticsearch is based on Lucene: https://martin.kleppmann.com/2015/04/13/real-time-full-text-search-luwak-samza.html

I also double-checked the percolator on ES1, and it indeed performs very quickly and correctly returns the percolation results. The percolation time for flat1 is ~30ms (when the searches match the document), and for flat2 it is ~3ms (when they do not match).

I tested it locally in a Docker container, started like this: docker run -d --name elastic-custom -p 9200:9200 -p 9300:9300 --rm elasticsearch:1.7.6. I don't understand why it is so slow for you. I had about 8GB of free RAM at launch (16GB total, DDR4) and an Intel Core i3-8350K processor, running Ubuntu 22.04.

I also ran ES8 via Docker: docker run -d --name elastic-custom -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "xpack.security.enabled=false" -e "ES_JAVA_OPTS=-Xms1G -Xmx1G -XX:ActiveProcessorCount=4" -e "bootstrap.memory_lock=true" --rm elasticsearch:8.15.2

Experiment on data similar to production

For clarity, I conducted another experiment, comparing the percolation time of ES1 and ES8 on data that is closer to our production data.

Data for ES8

Mapping for ES8 ``` { "mappings": { "properties": { "geo": {"properties": {"coordinates": {"type": "geo_point"}}}, "is_multi": {"type": "boolean"}, "props": { "properties": { "type": {"type": "keyword"}, "is_object_hidden": {"type": "boolean"}, "is_closed_visibility": {"type": "boolean"}, "room": {"type": "short", "store": true}, "search_total_price_rur": {"type": "long"}, "entity_obj": { "properties": { "isInHiddenBase": {"type": "boolean"}, "flatType": {"type": "keyword"}, "totalArea": {"type": "double"}, "platform": {"properties": {"type": {"type": "keyword"}}} } } } }, "query": {"type": "percolator"} } }, "settings": { "index": { "mapping": {"total_fields": {"limit": "2000"}}, "refresh_interval": "5s" } } } ```
The query I am testing, which I duplicate 10,000 times in ES8 ``` { "query": { "bool": { "filter": [ { "geo_bounding_box": { "geo.coordinates": { "top_left": [54.0420428072, 55.4102422916], # coordinates when object is not in bbox "bottom_right": [54.452561034, 55.0463168202] # "top_left": [55.0420428072, 55.4102422916], # coordinates when object enters bbox # "bottom_right": [50.452561034, 50.0463168202] }, "validation_method": "COERCE" } }, {"term": {"is_multi": false}} ], "must": { "bool": { "filter": { "bool": { "must": [ {"term": {"props.type": 1}}, { "bool": { "should": [ {"term": {"props.is_object_hidden": false}}, {"bool": {"must_not": {"exists": {"field": "props.is_object_hidden"}}}} ] } }, { "bool": { "should": [ {"term": {"props.entity_obj.isInHiddenBase": false}}, { "bool": { "must_not": { "exists": {"field": "props.entity_obj.isInHiddenBase"}} } } ] } }, { "bool": { "should": [ {"term": {"props.is_closed_visibility": false}}, {"bool": {"must_not": {"exists": {"field": "props.is_closed_visibility"}}}} ] } }, { "bool": { "should": [ {"terms": {"props.room": [1, 2, 3, 4]}}, { "bool": { "must": [ {"term": {"props.room": 8}}, {"term": {"props.entity_obj.flatType": "rooms"}}, ] } } ] } }, {"range": {"props.search_total_price_rur": {"lte": 6750000}}}, {"range": {"props.entity_obj.totalArea": {"gte": 20}}} ], "must_not": [{"term": {"props.entity_obj.platform.type": "qaAutotests"}}] } } } } } } } ```
The document I am percolating (the same for ES1 and ES8) ``` { "is_multi": false, "props": [ { "type": 1, "search_total_price_rur": 5750000, "is_object_hidden": false, "is_closed_visibility": false, "entity_obj": { "flatType": "rooms", "totalArea": "52.5", "isInHiddenBase": false, "platform": { "type": "webSite" } }, "room": 2 } ], "geo": { "coordinates": { "lat": 53.207297, "lon": 50.125319 } } } ```

Data for ES1

Here are the settings for ES1:

settings for ES1 cluster ``` { "percolator_simple": { "settings": { "index": { "routing": { "allocation": { "include": { "_tier_preference": "data_content" } } }, "refresh_interval": "5s", "number_of_shards": "1", "provided_name": "percolator_simple", } } } } ```
Mapping for ES1 ``` { "mappings": { "some_type": { "properties": { "geo": {"properties": {"coordinates": {"type": "geo_point", "geohash": true, "geohash_prefix": true, "geohash_precision": 9}}}, "is_multi": {"type": "boolean"}, "props": { "properties": { "type": {"type": "string", "index": "not_analyzed"}, "is_object_hidden": {"type": "boolean"}, "is_closed_visibility": {"type": "boolean"}, "room": {"type": "short", "store": true}, "search_total_price_rur": {"type": "long"}, "entity_obj": { "properties": { "isInHiddenBase": {"type": "boolean"}, "flatType": {"type": "string", "index": "not_analyzed"}, "totalArea": {"type": "double"}, "platform": {"properties": {"type": {"type": "string", "index": "not_analyzed"}}} } } } }, } }, ".percolator": { "_id": {"index": "not_analyzed"}, "properties": { "query": {"type": "object", "enabled": false} } } } } ```
The query I am testing, which I duplicate 10,000 times in ES1 ``` { "query": { "filtered": { "filter": { "and": [ { "geo_bounding_box": { "geo.coordinates": { "top_left": [54.0420428072, 55.4102422916], # coordinates when object is not in bbox "bottom_right": [54.452561034, 55.0463168202] # "top_left": [55.0420428072, 55.4102422916], # coordinates when object enters bbox # "bottom_right": [50.452561034, 50.0463168202] } } }, {"term": {"is_multi": false}} ] }, "query": { "filtered": { "filter": { "bool": { "must_not": [{"term": {"props.entity_obj.platform.type": "qaAutotests"}}], "must": [ {"range": {"props.search_total_price_rur": {"lte": 6750000}}}, {"range": {"props.entity_obj.totalArea": {"gte": 20}}}, { "bool": { "should": [ {"terms": {"execution": "bool", "props.room": [1, 2, 3, 4]}}, { "bool": { "must": [ {"term": {"props.room": 8}}, {"term": {"props.entity_obj.flatType": "rooms"}} ] } } ] } }, { "bool": { "should": [ {"term": {"props.entity_obj.isInHiddenBase": false}}, {"bool": {"must_not": {"exists": {"field": "props.entity_obj.isInHiddenBase"}}}} ] } }, { "bool": { "should": [ {"missing": {"field": "props.is_closed_visibility"}}, {"term": {"props.is_closed_visibility": false}} ] } } ] } } } } } } } ```

Results

I duplicated the query 10,000 times and percolated the same document 9 times in both Elasticsearch versions.

Results for ES8

Results in milliseconds when the doc matches all searches:

646.0, 544.0, 540.0, 545.0, 553.0, 582.0, 584.0, 584.0, 578.0
mean: ~572.9

Results in milliseconds when the doc does not fall within the bbox:

470.0, 475.0, 481.0, 465.0, 480.0, 462.0, 461.0, 463.0, 461.0
mean: ~468.7

Results for ES1

Results in ms when the doc matches all searches:

46.0, 62.0, 34.0, 29.0, 25.0, 28.0, 31.0, 23.0, 23.0
mean: ~33.4

Results in ms when the doc does not fall within the bbox:

29.0, 38.0, 31.0, 19.0, 20.0, 18.0, 15.0, 17.0, 17.0
mean: ~22.7

In these experiments I observe the same difference: ES1 is approximately 17 times faster (when the searches match) and 21 times faster (when they do not) on the same searches. Additionally, I cannot see how to apply the optimizations you suggested to this data. Simply moving all filters to the filter block does not yield any improvement either. Perhaps the issue is not with the pre-selection optimization at all, but a more general problem.

In production I see a similar picture: at approximately the same cluster cost, percolation in ES8 is about 10 to 15 times slower. I cannot find the exact reasons for this degradation. The cluster settings for ES8 in production are as follows:

{  
  "settings": {  
    "index": {  
      "routing": {"allocation": {"include": {"_tier_preference": "data_content"}}},  
      "refresh_interval": "5s",  
      "number_of_shards": "6",  
      "number_of_replicas": "1"
    }  
  }  
}

There are no indexing errors for the queries. The query {"query": {"term" : {"query.extraction_result" : "failed"}}} returns an empty response. Profiling percolate queries does not provide any valuable information. The queries in ES8 and ES1 are analogous, and the data volume is the same (~5 million searches in the percolator and ~2 documents per second are percolated).

Could you also clarify what you mean by "pre-joining the percolator queries"?

Perhaps the optimizations you suggested can be applied to my queries, but it is not obvious to me how they work. Can you point me to where I can read about them? If they are not applicable, I would like to understand whether anything can be done about this.

benwtrent commented 1 week ago

@garipovazamat could you try explicitly setting minimum_should_match in the boolean clauses where should clauses exist? In filter contexts we no longer default minimum_should_match to 1; instead, those should clauses are dropped.

Percolator is still executing a pre-filter and optimization check. But, we might be dropping those should clauses on the floor.
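Applying that suggestion mechanically can be sketched like this (a hypothetical helper, not an ES client API; it annotates, in place, every bool that has should clauses and no explicit setting):

```python
def set_min_should_match(node, value=1):
    """Recursively add `minimum_should_match` to every bool query that has
    `should` clauses but no explicit setting. Mutates `node` in place and
    returns it for convenience."""
    if isinstance(node, dict):
        bool_q = node.get("bool")
        if isinstance(bool_q, dict) and "should" in bool_q:
            bool_q.setdefault("minimum_should_match", value)
        for child in node.values():
            set_min_should_match(child, value)
    elif isinstance(node, list):
        for item in node:
            set_min_should_match(item, value)
    return node
```

Running every percolator registration through such a helper before indexing avoids editing each query by hand.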

garipovazamat commented 1 week ago

@benwtrent Do you mean try this query?

query with minimum_should_match setting ``` { "query": { "bool": { "filter": [ { "geo_bounding_box": { "geo.coordinates": { "top_left": [55.0420428072, 55.4102422916], "bottom_right": [50.452561034, 50.0463168202] }, "validation_method": "COERCE" } }, {"term": {"is_multi": False}} ], "must": { "bool": { "filter": { "bool": { "must": [ {"term": {"props.type": 1}}, { "bool": { "minimum_should_match": 1, "should": [ {"term": {"props.is_object_hidden": False}}, {"bool": {"must_not": {"exists": {"field": "props.is_object_hidden"}}}} ] } }, { "bool": { "minimum_should_match": 1, "should": [ {"term": {"props.entity_obj.isInHiddenBase": False}}, { "bool": { "must_not": { "exists": {"field": "props.entity_obj.isInHiddenBase"}} } } ] } }, { "bool": { "minimum_should_match": 1, "should": [ {"term": {"props.is_closed_visibility": False}}, {"bool": {"must_not": {"exists": {"field": "props.is_closed_visibility"}}}} ] } }, { "bool": { "minimum_should_match": 1, "should": [ {"terms": {"props.room": [1, 2, 3, 4]}}, { "bool": { "must": [ {"term": {"props.room": 8}}, {"term": {"props.entity_obj.flatType": "rooms"}}, ] } } ] } }, {"range": {"props.search_total_price_rur": {"lte": 6750000}}}, {"range": {"props.entity_obj.totalArea": {"gte": 20}}}, ], "must_not": [{"term": {"props.entity_obj.platform.type": "qaAutotests"}}] } } } } } } } ```

I have just tried it; nothing changed. The percolation time is the same.

benwtrent commented 1 week ago

@garipovazamat is the key difference you have found in speed related directly to geo bounding box? Or is it ANY of the filters?

garipovazamat commented 1 week ago

@benwtrent I think ANY of the filters. If I remove the geo bounding box filter, it does not become much faster.

john-wagster commented 4 days ago

@garipovazamat sorry for the slow response. Give me a bit more time and I'll take a look at your queries to see if I can optimize them; I think it's a good exercise that we can reflect back into the docs. My suspicion is that it is possible (yep, I'm an optimist) to optimize these, but it requires some effort and is a little non-intuitive.

I do think Ben's question about the geo bounding box will be relevant too. I know there was previously an attempt to add optimizations around how geo percolator queries are handled, but it was abandoned. I'll dig into that some more, but using a geo bounding box might be problematic.