Open georgivalentinov opened 10 months ago
Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)
thx for reporting this @georgivalentinov , we will have a look
So now filtering for text values has become 20+ times slower with recent Elasticsearch/Kibana versions.
you mean for the match_only_text field type, right
Can you share a performance profile for the slow query you're experiencing? You can get a link to our search profiler in the Inspector: thx!
Hi Matthias, thanks for looking into this.
you mean for the match_only_text field type, right
yes
Can you share a performance profile ...
I wondered about including them, but omitted them as I thought the huge performance difference between text and match_only_text may be known. Here are the original profile outputs for both cases:
They're from the following test setup:
- An index with two fields: one of type text (named the same way - text), the other of type match_only_text (again, named the same way - match_only_text).
- A match_phrase search for "v2 processsubmissionjob", which is what Kibana does in the case described above for filtering for/out values.
Seems like a match_phrase search against a match_only_text field uses SourceConfirmedTextQuery, which may or may not be the source of the slowdown.
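For anyone wanting to repeat the comparison, the setup above can be sketched as plain request bodies. This is only a minimal sketch: the index name and helper names are hypothetical, and the bodies are meant to be sent with whatever client you use (index creation, then `_search` with `profile` enabled).

```python
import json

INDEX_NAME = "phrase-perf-test"      # hypothetical index name, not from the report
PHRASE = "v2 processsubmissionjob"   # phrase quoted in the comment above


def build_mapping():
    """Index body with one text field and one match_only_text field,
    each named after its own type as in the described setup."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "match_only_text": {"type": "match_only_text"},
            }
        }
    }


def build_profiled_phrase_query(field, phrase=PHRASE):
    """match_phrase search body with profiling switched on."""
    return {"profile": True, "query": {"match_phrase": {field: phrase}}}


if __name__ == "__main__":
    print(f"PUT /{INDEX_NAME}")
    print(json.dumps(build_mapping(), indent=2))
    for field in ("text", "match_only_text"):
        print(f"GET /{INDEX_NAME}/_search")
        print(json.dumps(build_profiled_phrase_query(field), indent=2))
```

Running the same profiled query against each field and comparing the profile sections should show which query type each field ends up using.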
So again, most people just use the +/- buttons to filter for/out values, which, combined with the new text type, unfortunately turned into a performance killer. Searches against bigger data sets can now take minutes.
I'm available for more feedback, if needed. Thanks upfront.
thx, we will have a look!
@lukasolson I think there's nothing we can do here from Kibana side, right? so I would consider this issue as blocked
Kibana version: 8.8.2
Elasticsearch version: 8.8.2
Describe the bug:
A while back Elasticsearch introduced the match_only_text field type as an alternative to the well-known text one. The former saves space at the cost of no scoring and worse performance on phrase searches. Tests with our data sets show phrase match searches are more than 20 times slower with the new field when compared to the text one.
Recent Elasticsearch versions have replaced the type for all text fields for the ECS preset with the newly introduced match_only_text. This includes the most used field of all - message.
Why a Kibana issue?
Contract for the new field type kind of declares that the field would perform slower in certain situations, so it's left to the client(s) to decide when and whether to use it.
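To illustrate why such a field can be slower on phrase searches: the Elasticsearch documentation states that match_only_text does not index positions, so phrase queries have to check the original document to confirm a match. The toy index below is only a sketch of that idea under simplifying assumptions (whitespace tokenizer, in-memory storage, hypothetical class and method names); it is not how Lucene or SourceConfirmedTextQuery is actually implemented.

```python
from collections import defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase + whitespace split stands in for a real analyzer.
    return text.lower().split()


class PositionlessIndex:
    """Toy inverted index that stores doc ids per term but no positions."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of doc ids
        self.sources = {}                 # doc id -> original text

    def add(self, doc_id, text):
        self.sources[doc_id] = text
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def match_phrase(self, phrase):
        terms = tokenize(phrase)
        if not terms:
            return []
        # Step 1: cheap candidate selection, docs containing every term.
        candidates = set.intersection(
            *(self.postings.get(term, set()) for term in terms)
        )
        # Step 2: costly confirmation, re-tokenize each candidate's source
        # and look for the terms in consecutive order.
        hits = []
        n = len(terms)
        for doc_id in sorted(candidates):
            tokens = tokenize(self.sources[doc_id])
            if any(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1)):
                hits.append(doc_id)
        return hits
```

The second step runs once per candidate document, which is why common terms in a large data set can make this kind of phrase search expensive.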
The thing is, a basic Kibana feature uses phrase match search for all fields, including all match_only_text ones. That's the Filter out value / Filter for value (+/- buttons) next to each field in the results set (see screenshot below). This creates a phrase match (drop down) filter and is arguably the most used way by people to filter for/out values.
So now filtering for text values has become 20+ times slower with recent Elasticsearch/Kibana versions.
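As a rough illustration of what the +/- buttons lead to, the sketch below builds an approximate query body for a phrase filter. The helper name is hypothetical and the exact request Kibana sends may differ between versions; the point is only that the clause is a match_phrase, placed under filter for "Filter for value" and under must_not for "Filter out value".

```python
def phrase_filter_query(field, value, negate=False):
    """Approximate search body for a phrase filter on one field.

    negate=False models "Filter for value" (+),
    negate=True models "Filter out value" (-).
    """
    clause = {"match_phrase": {field: value}}
    bool_query = {"must": [], "filter": [], "should": [], "must_not": []}
    bool_query["must_not" if negate else "filter"].append(clause)
    return {"query": {"bool": bool_query}}


if __name__ == "__main__":
    # "message" is the field named in the report; the value is the phrase
    # used in the profiling comment.
    print(phrase_filter_query("message", "v2 processsubmissionjob"))
    print(phrase_filter_query("message", "v2 processsubmissionjob", negate=True))
```

Either way the match_phrase clause hits the match_only_text field, so both buttons pay the same phrase-search cost.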
Steps to reproduce:
- Map two fields, one as text and the other one as match_only_text.
Expected behavior:
Filtering using Filter out value / Filter for value (+/- buttons) against textual data is fast, as it was until recently.
Screenshots (if relevant):