opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] Search Backpressure affects Searchable Snapshots results which are inherently slow #15193

Open AmiStrn opened 2 months ago

AmiStrn commented 2 months ago

Describe the bug

Search backpressure is a cluster-wide setting, so once it is enabled it also affects queries that run on search nodes. This is a bug because search backpressure operates on global thresholds that don't take into account the inherent slowness of searches that run against snapshots.
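To illustrate the concern, here is a toy model of a single global threshold applied to every search task. It is only a sketch: the threshold value, the task names, and the elapsed times are hypothetical, and this is not how OpenSearch actually decides which tasks to cancel.

```python
# Toy model only: one global elapsed-time limit applied to every search task,
# regardless of whether the shard is local or backed by a remote snapshot.
# The threshold and timings below are hypothetical illustration values.
GLOBAL_ELAPSED_THRESHOLD_MS = 30_000


def exceeds_global_threshold(elapsed_ms, threshold_ms=GLOBAL_ELAPSED_THRESHOLD_MS):
    """Return True when a task has run longer than the single global limit."""
    return elapsed_ms > threshold_ms


# Hypothetical elapsed times: the snapshot-backed query is slow by nature
# because it fetches data from the repository with a cold cache.
tasks = {
    "local_shard_query": 4_000,
    "snapshot_shard_query": 95_000,
}

flagged = [name for name, elapsed in tasks.items() if exceeds_global_threshold(elapsed)]
print(flagged)
```

In this model only the snapshot-backed query is flagged, even though its long runtime is expected rather than a sign of a runaway query.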

Related component

Search:Searchable Snapshots

To Reproduce

  1. Restore several large indices (88 shards of 30 GB each in our case) without warming them up (no cache pre-loading).
  2. Ensure that 100+ search requests are running at the same time (heavy ones, 1-2 hours range).
  3. Wait for search backpressure to reject a query.
  4. The query response will show fewer hits than expected. The _shards field will show something like "total": 44, "successful": 40.
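The partial result in step 4 can be detected on the client side by inspecting the _shards section of the search response. A minimal sketch, assuming the response has already been parsed into a Python dict; the helper name `shard_summary` is hypothetical, and the sample values are the ones quoted above.

```python
def shard_summary(response):
    """Summarize the _shards section of a parsed search response.

    Hypothetical helper: reports how many shards did not respond
    successfully and whether the result is therefore partial.
    """
    shards = response.get("_shards", {})
    total = shards.get("total", 0)
    successful = shards.get("successful", 0)
    return {
        "total": total,
        "successful": successful,
        "missing": total - successful,
        "partial": successful < total,
    }


# Values taken from the report: 44 shards in total, 40 answered successfully.
example_response = {"_shards": {"total": 44, "successful": 40}}
summary = shard_summary(example_response)
print(summary)
```

A check like this only surfaces the problem. If partial results are not acceptable, the search request parameter allow_partial_search_results may also be relevant, since setting it to false is intended to make the request fail instead of silently returning fewer hits.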

We used separate node roles: 20% of the nodes had the search role and the rest had the data role.

Expected behavior

A few options:

Additional Details

Plugins: s3 repository

Host/Environment (please complete the following information):

mch2 commented 2 months ago

Thanks for reporting, @AmiStrn. @jainankitk @kaushalmahi12, is there something we can do to help here with wlm?