elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[ES|QL] - Select Partial Text for WHERE/GROK in ES|QL Queries in Discover #192281

Open ninoslavmiskovic opened 1 week ago

ninoslavmiskovic commented 1 week ago

User Problem:

In ES|QL, users can add full strings to a WHERE clause, but this is limited when dealing with unstructured data (e.g., logs). Users often need to select specific parts of a message field rather than the entire string to refine their query. Without this functionality, users must manually extract and edit the desired text, slowing down their workflow.

This is how it works today, in the cases where it is possible:

https://github.com/user-attachments/assets/7feff81f-b2e6-4d92-91c1-7d1c9809ec0c

Describe the feature:

[NEEDS DESIGN CHECK]

Enable users to select part of a message field within a cell in Discover. Upon selection, a pop-over menu should appear with options to:

•   Add the selected text to the ES|QL query as a WHERE clause.
•   Apply a GROK pattern for parsing unstructured data.

This will significantly enhance usability when working with complex logs or messages, allowing for more efficient data parsing and querying.
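As a rough illustration of the two menu options above, the sketch below appends either a WHERE or a GROK step to an existing ES|QL query string. The function names (`appendWhereContains`, `appendGrok`) and the escaping helper are hypothetical and not part of Kibana; the sketch assumes the field name is a plain identifier that needs no quoting, and that the partial-text match is expressed with ES|QL's `LIKE` operator and its `*` wildcard.

```typescript
// Hypothetical helpers, not an existing Kibana API.

// Escape backslashes and double quotes so text can sit inside an ES|QL "..." string literal.
function escapeStringLiteral(text: string): string {
  return text.replace(/[\\"]/g, (ch) => `\\${ch}`);
}

// Escape LIKE wildcards (* and ?) and backslashes first, then escape for the string literal.
function escapeForLike(text: string): string {
  const likeEscaped = text.replace(/[\\*?]/g, (ch) => `\\${ch}`);
  return escapeStringLiteral(likeEscaped);
}

// Option 1: filter rows whose field contains the selected text.
function appendWhereContains(query: string, field: string, selected: string): string {
  const pattern = `*${escapeForLike(selected)}*`;
  return `${query.trimEnd()}\n| WHERE ${field} LIKE "${pattern}"`;
}

// Option 2: append a GROK step with a pattern supplied by the user or by a suggestion.
function appendGrok(query: string, field: string, grokPattern: string): string {
  return `${query.trimEnd()}\n| GROK ${field} "${escapeStringLiteral(grokPattern)}"`;
}

console.log(appendWhereContains("FROM logs", "message", "connection refused"));
console.log(appendGrok("FROM logs", "message", "%{IP:client_ip}"));
```

Appending a new pipe step, rather than editing an existing WHERE, keeps the change easy to undo and avoids having to parse the user's query.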

elasticmachine commented 1 week ago

Pinging @elastic/kibana-esql (Team:ESQL)

ryankeairns commented 1 week ago

🤔 I wonder how this might work in the text search use case. Meaning, if we nailed that experience - made it quick to achieve - would that mitigate the scenario described by this issue? Does a quick text search often/by default point to the message field if one exists?

ninoslavmiskovic commented 1 week ago

It would help find parts within the message, but not with parsing it with GROK. Also, we need to keep in mind that down the line we will have materialised views, so users would 1) grok, 2) save it to a materialised view or view.

So it is not only about search/filter but also about parsing.

stratoula commented 1 week ago

How would we know that this is a message field and that this functionality makes sense? For the majority of keyword fields, GROK / DISSECT doesn't make sense.

Also which pattern are we going to propose?

ninoslavmiskovic commented 1 week ago

Regarding how we know it's a message field:
The message field is typically mapped as a text field type in most cases where users are dealing with unstructured data like logs, since that allows for flexible searching and analysis. However, I have seen cases where it might be mapped as keyword. So, the focus could be to enable this feature mainly for text fields where it makes sense (logs, messages, etc.).

For the patterns:
We could offer a default GROK pattern based on what we can detect from the data structure (like timestamps, log levels, IP addresses, hosts, etc.). If that doesn’t fit, users could tweak it or define their own.

Over time, we could also learn common patterns based on usage and user feedback. Same learning as with recommended queries that we need to do IMO.
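To make the "default GROK pattern" idea concrete, here is a minimal sketch that inspects the leading whitespace-separated tokens of a sample message and maps recognisable ones to standard grok pattern names (`TIMESTAMP_ISO8601`, `LOGLEVEL`, `IP`, `GREEDYDATA`). The function `suggestGrokPattern`, the detection regexes, and the capture field names are all assumptions made for illustration; real detection would need to handle far more formats and delimiters.

```typescript
// Hypothetical heuristic, not an existing Kibana or Elasticsearch API.
interface TokenRule {
  field: string; // capture name used in the suggested pattern
  grok: string; // standard grok pattern name
  test: RegExp; // cheap client-side check for a single token
}

const RULES: TokenRule[] = [
  {
    field: "timestamp",
    grok: "TIMESTAMP_ISO8601",
    test: /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:?\d{2})?$/,
  },
  { field: "level", grok: "LOGLEVEL", test: /^(TRACE|DEBUG|INFO|WARN|WARNING|ERROR|FATAL)$/i },
  { field: "ip", grok: "IP", test: /^(\d{1,3}\.){3}\d{1,3}$/ },
];

// Returns a suggested pattern, or null when no leading token is recognised.
function suggestGrokPattern(sample: string): string | null {
  const tokens = sample.trim().split(/\s+/);
  const parts: string[] = [];
  const seen = new Map<string, number>();
  let consumed = 0;

  for (const token of tokens) {
    const rule = RULES.find((r) => r.test.test(token));
    if (!rule) break; // stop at the first token we cannot classify
    const count = (seen.get(rule.field) ?? 0) + 1;
    seen.set(rule.field, count);
    // Disambiguate repeated captures, e.g. ip and ip_2.
    const name = count === 1 ? rule.field : `${rule.field}_${count}`;
    parts.push(`%{${rule.grok}:${name}}`);
    consumed++;
  }

  if (parts.length === 0) return null;
  // Everything after the recognised prefix is captured as free text.
  if (consumed < tokens.length) parts.push("%{GREEDYDATA:rest}");
  return parts.join(" ");
}

console.log(suggestGrokPattern("2024-09-06T10:15:30Z ERROR 10.0.0.5 connection refused"));
```

The suggestion joins captures with single spaces, so it assumes single-space delimiters in the log line; users would still need to be able to edit the result.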

stratoula commented 1 week ago

Even if we decide to do it on text fields, it won't always make sense, as many fields can be text without GROK being useful for them. I think this feature needs more investigation, as it is not so straightforward to detect the fields for which it makes sense to suggest this.

ninoslavmiskovic commented 1 week ago

Good point, I agree it’s not as straightforward as just applying this to all text fields. A more refined approach could be focusing specifically on certain fields where this feature adds value, like message fields or other fields commonly used for logs and unstructured data.

We could start by:

  1. Targeting Specific Field Names: Instead of all text fields, we could limit this feature to fields that are likely to contain unstructured data, like message, log, error, etc. This would avoid applying the feature where it doesn’t make sense.

  2. Let it be part of Contextual Profiles: We could give the solutions team the ability to define which fields they want this functionality applied to. This way, they have control over what fields make sense, based on the data structure. Maybe even have it as part of the extension point that @davismcphee is wrapping up "Cell Actions", to show an action for GROK for certain context profiles e.g. an "unstructured logs profile".
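The two options above could be combined roughly as follows: a fallback allow-list of field names, overridden by a profile when one is active. The `GrokProfile` shape, the `isGrokCandidate` function, and the example profile are hypothetical and do not reflect the actual Discover contextual profile or cell actions extension point APIs.

```typescript
// Hypothetical types, not the real Discover profile API.
interface FieldInfo {
  name: string;
  type: string; // e.g. "text", "keyword", "long"
}

interface GrokProfile {
  id: string;
  fieldNames: string[]; // fields the solution team opted in
  defaultPattern?: string; // optional pattern proposed by the profile
}

// Option 1: fallback list of field names likely to hold unstructured data.
const DEFAULT_FIELD_NAMES = ["message", "log", "error"];

// Option 2: when a profile is active, its field list takes precedence.
function isGrokCandidate(field: FieldInfo, profile?: GrokProfile): boolean {
  // Only string-like fields can be parsed with GROK / DISSECT.
  if (field.type !== "text" && field.type !== "keyword") return false;
  const allowed = profile ? profile.fieldNames : DEFAULT_FIELD_NAMES;
  const name = field.name.toLowerCase();
  return allowed.some((candidate) => candidate.toLowerCase() === name);
}

const unstructuredLogsProfile: GrokProfile = {
  id: "unstructured-logs",
  fieldNames: ["message", "event.original"],
};

console.log(isGrokCandidate({ name: "message", type: "text" }));
console.log(isGrokCandidate({ name: "event.original", type: "keyword" }, unstructuredLogsProfile));
console.log(isGrokCandidate({ name: "host.name", type: "keyword" }, unstructuredLogsProfile));
```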

Let me know WDYT.

stratoula commented 1 week ago

The second sounds promising. If we could somehow get from the solutions teams (contextual fields, ECS schema) the fields that want this functionality and the proposed pattern, that would be the best path forward.

elasticmachine commented 1 week ago

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

kertal commented 1 week ago

2 minor cents from my end:

> Enable users to select part of a message field within a cell in Discover. Upon selection, a pop-over menu should appear with options to:

This sounds like overriding the default browser behavior: when users select part of a string, we would assume they want to filter by it, not just copy and paste something, which I assume is usually the case. I'd recommend placing this at the cell action level. Assisting the user with Grok would be very helpful. Now imagine, further down the road, we use AI to have a look at the message and to create a nice grok pattern. Something like this behind the scenes, suggesting some matching patterns:

Image

ninoslavmiskovic commented 1 week ago

@kertal We can already do it with WHERE on a cell level. For a field like message, though, it would not work to add the entire string to the GROK, only parts of it.

ninoslavmiskovic commented 1 week ago

And yes, AI is for version 2-ish, something we could consider as a next step.

timductive commented 1 week ago

I think we should set a high bar for when we add UX niceties to modify the ESQL query. We are still too early to start building a complex set of user interactions. We should reserve these for only the most heavily used user behavior patterns that we can quantify through heavy ESQL usage.

stratoula commented 1 week ago

++ Tim

ninoslavmiskovic commented 1 week ago

@timductive I always agree with setting the bar high. :)

This isn’t a ‘new’ request in my opinion, as it’s more about evolving the UX to address the ongoing challenge our users face when dealing with unstructured data. GROK and DISSECT are already great enablers that ES|QL helps users leverage to structure data post-ingestion, turning raw logs into usable formats. By allowing users to easily structure their data with these COMMANDS, they can then perform actions like aggregations more efficiently.

The UX evolution here is simply an extension of what’s already a common and essential use case for our users.