elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Unified Search] Value autocomplete without field name specified #193608

Open flash1293 opened 1 day ago

flash1293 commented 1 day ago

The value-autocomplete functionality in the unified search bar for KQL is super helpful when searching for more complex values like host names and similar.

However, a big downside is that it requires the user to know the field name to get autocomplete - if they only know the prefix of the value they are searching for, it will be difficult to search effectively for it. In this case, the flow currently looks like this:

This manual process can be automated in the following way: In case the user types in a string in the KQL bar without specifying a field name, search for

"query_string": {
  "query": "<typed string>*"
}

and return the first 100 documents. Within Kibana, search through all fields of the returned documents for values which have the typed string as a prefix. Use the matched fields as suggestions, suggesting both the field name and the value to search for. Add them to the dropdown, below the regular field suggestions:

https://github.com/user-attachments/assets/6279ec4f-74b4-45a9-9a1f-67c68a6a2b61
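
The client-side matching step described above could look roughly like the following. This is a minimal sketch, not existing Kibana code: `FieldValueSuggestion`, `collectStringValues` and `suggestFromDocuments` are hypothetical names, the documents are assumed to be plain `_source` objects, and matching is done case-insensitively against the start of the whole field value (an assumption; `query_string` may also match on a token in the middle of an analyzed text field).

```typescript
// Hypothetical suggestion shape, introduced for this sketch only.
interface FieldValueSuggestion {
  field: string;
  value: string;
}

// Recursively collect [dotted field path, string value] pairs from a document.
function collectStringValues(node: unknown, path: string, out: Array<[string, string]>): void {
  if (typeof node === 'string') {
    out.push([path, node]);
  } else if (Array.isArray(node)) {
    // Array items share the field path of the array itself.
    for (const item of node) collectStringValues(item, path, out);
  } else if (node !== null && typeof node === 'object') {
    const record = node as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      collectStringValues(record[key], path ? `${path}.${key}` : key, out);
    }
  }
}

// Return unique field/value pairs whose value starts with the typed string.
function suggestFromDocuments(
  docs: unknown[],
  typed: string,
  limit: number = 10
): FieldValueSuggestion[] {
  const prefix = typed.toLowerCase();
  const suggestions: FieldValueSuggestion[] = [];
  if (prefix === '') return suggestions;

  const seen = new Set<string>();
  for (const doc of docs) {
    const pairs: Array<[string, string]> = [];
    collectStringValues(doc, '', pairs);
    for (const [field, value] of pairs) {
      if (!value.toLowerCase().startsWith(prefix)) continue;
      const key = `${field}\u0000${value}`; // de-duplicate on field + value
      if (seen.has(key)) continue;
      seen.add(key);
      suggestions.push({ field, value });
      if (suggestions.length >= limit) return suggestions;
    }
  }
  return suggestions;
}
```

For example, calling `suggestFromDocuments(hits, 'web')` on documents containing `host.name: "web-01"` would yield a suggestion for the pair `host.name` / `web-01`, which the dropdown could render as a ready-made KQL clause.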

Considerations

Suggestions need to return quickly - while searching all fields for a small number of documents is more expensive than looking up regular autocomplete values for a single specified field, in tests with medium-sized clusters the response time was still acceptable. Two measures can help:

As these requests would be sent quite often while the user is typing, testing needs to be performed to gauge the additional load this would cause on the cluster. Using strategies like throttling and debouncing, it should be possible to fine-tune the feature to balance churn and user experience.
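
To illustrate the debouncing idea, here is a minimal sketch of a debounce helper that only fires the suggestion request once the user has paused typing. The names `debounce`, `Scheduler` and `timeoutScheduler` are hypothetical and not taken from Kibana; the scheduler is injectable purely so the behavior can be tested without real timers, and the wait time would need to be tuned against real cluster load.

```typescript
type Cancel = () => void;
// Schedules `fn` to run after `ms` milliseconds and returns a cancel function.
type Scheduler = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by setTimeout/clearTimeout.
const timeoutScheduler: Scheduler = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};

// Wrap `fn` so that only the last call within a quiet period of `waitMs` runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  schedule: Scheduler = timeoutScheduler
): (...args: A) => void {
  let cancelPending: Cancel | undefined;
  return (...args: A) => {
    // A newer keystroke arrived: drop the call that was still waiting.
    if (cancelPending) cancelPending();
    cancelPending = schedule(() => {
      cancelPending = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

With something like `const requestSuggestions = debounce(fetchSuggestions, 300)`, where `fetchSuggestions` stands for whatever function issues the search, typing "web" quickly would result in a single request for "web" rather than three requests.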

elasticmachine commented 1 day ago

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

LucaWintergerst commented 1 day ago

Ideally we'd also get back a somewhat random set of documents, not necessarily the first 100, as they might have very similar values. We should look into random scoring, but only if we can apply it to just a subset of the documents.

I.e. look at 1000 docs, randomly sort them, then return 100 of them. Otherwise we might get a lot of very similar documents that don't have as many unique values.
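
One possible way to express this as a request body is sketched below. It is an untested assumption rather than a verified approach: it combines `function_score` with `random_score` to randomize the order, and `terminate_after` to cap how many documents are collected. Note that `terminate_after` applies per shard, so the pool is roughly 1000 documents per shard rather than 1000 overall. `buildRandomSampleRequest` is a hypothetical helper name, and escaping of reserved `query_string` characters in the typed string is left out.

```typescript
// Build a search body that randomly orders a capped pool of prefix matches.
function buildRandomSampleRequest(
  typed: string,
  poolPerShard: number = 1000,
  size: number = 100
) {
  return {
    size, // number of documents to return
    terminate_after: poolPerShard, // stop collecting after this many docs per shard
    query: {
      function_score: {
        // Same prefix query as proposed in the issue description.
        query: { query_string: { query: `${typed}*` } },
        random_score: {}, // assign each collected document a random score
        boost_mode: 'replace', // ignore the relevance score, keep only the random one
      },
    },
  };
}
```

Whether this is cheap enough to run on every (debounced) keystroke would need to be measured as part of the load testing mentioned in the issue description.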