elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Filtering via +/- buttons against match_only_text fields really slow #171441

Open georgivalentinov opened 10 months ago

georgivalentinov commented 10 months ago

Kibana version: 8.8.2

Elasticsearch version: 8.8.2

Describe the bug:

A while back Elasticsearch introduced the match_only_text field type as an alternative to the well-known text type. It saves space at the cost of no scoring and worse performance on phrase searches. Tests with our data sets show that phrase match searches are more than 20 times slower with the new field type than with text.

Recent Elasticsearch versions have replaced the type for all text fields for the ECS preset with the newly introduced match_only_text. This includes the most used field of all - message.

Why a Kibana issue?

The contract for the new field type more or less declares that the field performs slower in certain situations, so it's left to the client(s) to decide when and whether to use it.

The thing is, a basic Kibana feature uses phrase match search for all fields, including all match_only_text ones: the Filter out value/Filter for value (+/- buttons) next to each field in the results set (see screenshot below). This creates a phrase match (drop-down) filter and is arguably the way most people filter for/out values.
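As a rough illustration of what such a filter amounts to on the Elasticsearch side, here is a minimal Python sketch that builds a bool query around a match_phrase clause. The exact request Kibana sends is an assumption here (it can be checked in the Inspector), and the function name `phrase_filter_query` and the sample value are hypothetical.

```python
import json


def phrase_filter_query(field, value, negate=False):
    """Build a bool query wrapping a match_phrase clause.

    Assumed shape: "Filter for value" (+) puts the clause under bool.filter,
    "Filter out value" (-) puts it under bool.must_not.
    """
    clause = {"match_phrase": {field: value}}
    key = "must_not" if negate else "filter"
    return {"query": {"bool": {key: [clause]}}}


# Hypothetical example value for the ECS "message" field.
print(json.dumps(phrase_filter_query("message", "connection reset by peer"), indent=2))
```

The point is only that both buttons end up as a phrase query, which is the query type match_only_text handles least efficiently.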

So now filtering for text values has become 20+ times slower with recent Elasticsearch/Kibana versions.

Steps to reproduce:

  1. Create an index mapping containing two fields, one typed as text and the other one as match_only_text.
  2. Populate with documents containing the same multi word value for both fields.
  3. Compare performance when filtering against both types.
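The steps above can be sketched as request bodies. This is a minimal sketch that only builds the JSON; it does not talk to a cluster. The index name `phrase-perf-test` and the field names `msg_text` and `msg_mot` are made up for illustration, and the bodies are meant for the create index, `_bulk` and `_search` endpoints respectively.

```python
import json

INDEX_NAME = "phrase-perf-test"  # hypothetical index name


def build_mapping():
    """Step 1: one field typed as text, one as match_only_text."""
    return {
        "mappings": {
            "properties": {
                "msg_text": {"type": "text"},
                "msg_mot": {"type": "match_only_text"},
            }
        }
    }


def build_bulk_body(values):
    """Step 2: NDJSON bulk body storing the same value in both fields."""
    lines = []
    for value in values:
        lines.append(json.dumps({"index": {"_index": INDEX_NAME}}))
        lines.append(json.dumps({"msg_text": value, "msg_mot": value}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


def build_phrase_search(field, phrase):
    """Step 3: the same phrase query, to be run once per field and timed."""
    return {"query": {"match_phrase": {field: phrase}}}


print(json.dumps(build_mapping(), indent=2))
print(build_bulk_body(["connection reset by peer"]), end="")
print(json.dumps(build_phrase_search("msg_mot", "reset by peer")))
```

Running `build_phrase_search` for `msg_text` and `msg_mot` against the same documents gives the two queries to compare.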

Expected behavior:

Filtering using Filter out value/Filter for value (+/- buttons) against textual data is fast, as it was until recently.

Screenshots (if relevant):

[Screenshot 2023-11-16 at 21 07 18]

elasticmachine commented 9 months ago

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

kertal commented 9 months ago

thx for reporting this @georgivalentinov, we will have a look

> So now filtering for text values has become 20+ times slow with recent Elasticsearch/Kibana versions.

you mean for the match_only_text field type, right?

kertal commented 9 months ago

Can you share a performance profile for the slow query you're experiencing? You can get a link to our search profiler in the Inspector (see screenshot). thx!
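Outside the Inspector, a profile can also be requested directly by setting `"profile": true` in the search body. The sketch below adds that flag to an existing body and pulls the top-level query types and timings out of a response. The helper names are hypothetical, and only the `profile.shards[].searches[].query[]` part of the response is read.

```python
def with_profiling(search_body):
    """Return a copy of the search body with profiling switched on."""
    body = dict(search_body)
    body["profile"] = True
    return body


def query_timings(response):
    """Collect (query type, time_in_nanos) for each top-level profiled query."""
    timings = []
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for query in search.get("query", []):
                timings.append((query["type"], query["time_in_nanos"]))
    return timings


body = with_profiling({"query": {"match_phrase": {"message": "reset by peer"}}})
print(body)
```

Comparing the output of `query_timings` for the same phrase against a text field and a match_only_text field shows which Lucene query type each one runs and how long it takes.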

georgivalentinov commented 9 months ago

Hi Matthias, thanks for looking into this.

> you mean for the match_only_text field type, right

yes

> Can you share a performance profile ...

I considered including them, but left them out as I thought the huge performance difference between text and match_only_text might already be known. Here are the original profile outputs for both cases:

[Two images: search profiler output for the text and match_only_text cases]

They're from the following test setup:

It seems that a match_phrase search against a match_only_text field uses SourceConfirmedTextQuery, which may or may not be the source of the slowdown.
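For context, match_only_text does not index term positions, so a phrase query has to be checked against the stored _source of each candidate document. The following is a toy Python model of that two-step idea, not Lucene's actual implementation; the whitespace tokenizer, function names and sample documents are all made up for illustration.

```python
def tokenize(text):
    """Naive whitespace tokenizer standing in for a real analyzer."""
    return text.lower().split()


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens occurs as a contiguous run inside tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens for i in range(len(tokens) - n + 1))


def source_confirmed_phrase_search(docs, phrase):
    """Toy phrase search over an index that stores no positions.

    Step 1: find candidate docs containing every term of the phrase.
    Step 2: re-read each candidate's source text to confirm the phrase.
    Returns (candidate doc ids, confirmed doc ids).
    """
    phrase_tokens = tokenize(phrase)
    if not phrase_tokens:
        return [], []

    # Position-less inverted index: term -> set of doc ids.
    index = {}
    for doc_id, text in enumerate(docs):
        for term in set(tokenize(text)):
            index.setdefault(term, set()).add(doc_id)

    candidates = set.intersection(*[index.get(t, set()) for t in phrase_tokens])
    confirmed = [d for d in sorted(candidates)
                 if contains_phrase(tokenize(docs[d]), phrase_tokens)]
    return sorted(candidates), confirmed


docs = [
    "connection reset by peer",
    "peer connection reset by timeout",
    "reset by admin",
]
print(source_confirmed_phrase_search(docs, "reset by peer"))
```

In this model the second document contains all three terms but not the phrase, so it is only rejected after its full text has been re-read. That per-candidate confirmation step is the extra work a positional index avoids.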

So again, most people just use the +/- buttons to filter for/out values, which, combined with the new text type, unfortunately turned into a performance killer. Searches against bigger data sets can now take minutes.

I'm available for more feedback if needed. Thanks in advance.

kertal commented 9 months ago

thx, we will have a look!

kertal commented 1 week ago

@lukasolson I think there's nothing we can do here from the Kibana side, right? So I would consider this issue blocked.