elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[ML] AIOps: Log Rate Analysis improvements. `[meta]` `[backlog]` #187683

Open walterra opened 3 months ago

walterra commented 3 months ago

backlog/meta issue.

For each release we will create an issue like https://github.com/elastic/kibana/issues/181111 where we move over items we plan to pick up.

### API
- [ ] Improve error handling for indices without time fields (The UI for now protects us from hitting this error). (WR)
- [ ] Optional search queries from the query bar are now passed on to the analysis endpoint as non-optional stringified JSON. It would be good to make that query optional and be passed on as the native query object for better type/schema handling. COULD
- [ ] Runtime fields can be passed on as part of the query, but we don't take the runtime mappings into account, so these queries return no results.
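
For the optional query item, a rough sketch of how the endpoint could accept both shapes during a transition. `resolveSearchQuery` and `QueryDsl` are hypothetical names used for illustration, not existing Kibana code, and falling back to `match_all` is an assumption about the desired default.

```typescript
// Hypothetical stand-in for the Elasticsearch query DSL type.
type QueryDsl = Record<string, unknown>;

// Accepts the legacy stringified JSON, a native query object, or nothing at all.
function resolveSearchQuery(searchQuery?: string | QueryDsl): QueryDsl {
  // Assumption: a missing query means "match every document".
  if (searchQuery === undefined) {
    return { match_all: {} };
  }
  if (typeof searchQuery === 'string') {
    const parsed: unknown = JSON.parse(searchQuery);
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
      throw new Error('searchQuery must be a JSON object');
    }
    return parsed as QueryDsl;
  }
  // Native query objects are passed through unchanged.
  return searchQuery;
}
```

Once all callers send the native object, the string branch could be dropped.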
### UI
- [ ] Switch to the "Rerun analysis" copy when the search in the search bar changes; otherwise the hover data in the main chart no longer matches.
- [ ] Make the main histogram sticky (like in Discover)
- [ ] Time picker: Auto refresh should be disabled
- [ ] Time picker: Selecting a time range requires another "refresh" button click
- [ ] Make use of the Kibana `fieldFormats` plugin instead of a custom approach to format dates.
- [ ] Nice to have/revisit: Provide UI to allow the user to override the sample probability similar to data visualizer
- [ ] Let user choose between ZOOM and BRUSH mode
- [ ] Persist brush positions in the URL - currently the brush is lost when refreshing the page, or returning to the page from the link to Discover. We should also try to persist the current analysis in local storage so when a user navigates away and hits the back button they won't have to rerun the analysis. https://github.com/elastic/kibana/issues/146166
- [ ] Translate error messages.
- [ ] Styling updates https://github.com/elastic/kibana/issues/156605
- [ ] If an index doesn't have any `keyword` type fields, we should indicate that with a proper user facing message and fail the analysis early. At the moment we just have the generic message that says the analysis couldn't identify any statistically significant fields.
- [ ] Persist columns selected for display in the local storage (or similar)
- [ ] Add proper sorting to `Log rate change` column (follow up to https://github.com/elastic/kibana/pull/186342)
- [ ] Combining common field values (as retrieved, for example, for the "Top field values" popover to provide context) with significant terms could allow a topography/network/service-map like view that builds a network out of the results from `frequent_item_sets`. The user would see a network of common and significant items combined, with the significant ones highlighted to stand out.
- [ ] Pass any query defined in the query bar to Discover (and Log Pattern Analysis?) when drilling down from a row in the table
- [ ] Improve user workflows https://github.com/elastic/kibana/issues/153753
- [ ] While the analysis is running, block users from editing the date picker, search bar and baseline/deviation brushes.
- [ ] For 100 million+ docs, the sampling probability shown in the UI ends up as `0`, for example `Total documents: 388,497,198 Sampling probability: 0`. Only the displayed value is rounded down; the correct one is used for the analysis.
- [ ] https://github.com/elastic/kibana/issues/171657
- [ ] If the current view returns 0 docs for the time range and date histogram, we should update the empty prompt accordingly and block users from clicking the date histogram to initialize the brushes.
- [ ] If you load the page from a saved search, it will be applied to the search bar. If you then change the query, we will remember that when you reload the page, so it gets stored in URL state and will override the saved search. However, if you just empty the query bar we will not remember that and it will be repopulated from the saved search on reload.
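
For the rounded sampling probability item above, a minimal sketch of a display formatter that keeps significant digits instead of fixed decimals. `formatSamplingProbability` is a hypothetical helper, not existing Kibana code, and two significant digits is an arbitrary choice.

```typescript
// Hypothetical display helper: keeps two significant digits so that very
// small probabilities are not rounded down to 0 in the UI.
function formatSamplingProbability(p: number): string {
  if (!Number.isFinite(p) || p <= 0) {
    return '0';
  }
  if (p >= 1) {
    return '1';
  }
  // toPrecision() rounds to significant digits; Number() drops trailing zeros.
  return Number(p.toPrecision(2)).toString();
}
```

This only changes what is displayed; the probability used for the analysis would stay untouched. Values below `1e-6` come out in exponent notation, which may or may not be what we want in the UI.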
### Analysis
- [ ] At the moment we categorize groups in a way that doesn't allow different values for the same field within a group (unless they were part of the same doc in an array for the field). This can lead to very granular groups, for example when the same attributes end up in a lot of groups where the only differing value is an IP address. It would be great if we could update the grouping to cluster matching values from the same field into a single group.
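
To illustrate the idea, a simplified sketch that merges groups which are identical apart from one given field (for example the IP address field). The types and `mergeGroupsOnField` are hypothetical and not how the analysis groups results today; a real solution would also need to decide which field(s) to merge on.

```typescript
// Hypothetical shapes: one value per field before merging, many after.
type ItemSet = Record<string, string>;
type MergedGroup = Record<string, string[]>;

function mergeGroupsOnField(groups: ItemSet[], field: string): MergedGroup[] {
  const merged = new Map<string, MergedGroup>();

  for (const group of groups) {
    // Signature of the group with `field` left out; names sorted for stability.
    const restNames = Object.keys(group)
      .filter((name) => name !== field)
      .sort();
    const key = JSON.stringify(restNames.map((name) => [name, group[name]]));

    let target = merged.get(key);
    if (target === undefined) {
      target = {};
      for (const name of restNames) {
        target[name] = [group[name] as string];
      }
      merged.set(key, target);
    }

    // Collect the distinct values of `field` across otherwise equal groups.
    const value = group[field];
    if (value !== undefined) {
      const values = target[field] ?? [];
      if (values.indexOf(value) === -1) {
        values.push(value);
      }
      target[field] = values;
    }
  }

  return Array.from(merged.values());
}
```

With this, two groups that share the same service and only differ in the IP would collapse into one group listing both IPs.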

Text field pattern support was added in 8.11, see #167467. These are follow up tasks carried over from that original issue:

### Text field pattern support
- [ ] Truncated values in table columns: Find a way to let the user access/see the full value.
- [ ] Fix top field values for text fields. Maybe a different popup is needed here that just shows additional log pattern message examples. (disabled for now for text fields)
- [ ] Fix disabled action button to link to log pattern analysis behavior https://github.com/elastic/kibana/pull/165124#discussion_r1341490645 https://github.com/elastic/kibana/pull/183649
- [ ] Revisit use of spread operators to avoid performance issues.
- [ ] Avoid `.find()` in loops.
- [ ] Apply `searchQuery` when getting `term2category` counts.
- [ ] Investigate if we can keep `match_only_text` as part of supported text fields and only remove when not primary, see the discussion here: https://github.com/elastic/elasticsearch/issues/106166
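
For the spread operator and `.find()` items above, a small sketch of the general pattern: index the items once in a `Map` and append with `push()` instead of copying arrays. The `Item` interface and function names are simplified stand-ins for illustration, not the actual types in the codebase.

```typescript
// Simplified stand-in for a significant item.
interface Item {
  fieldName: string;
  fieldValue: string;
  docCount: number;
}

// JSON.stringify of the pair avoids collisions a plain delimiter could cause.
function itemKey(fieldName: string, fieldValue: string): string {
  return JSON.stringify([fieldName, fieldValue]);
}

// Build the index once: O(n) instead of an O(n) `.find()` per lookup.
function indexItems(items: Item[]): Map<string, Item> {
  const byKey = new Map<string, Item>();
  for (const item of items) {
    byKey.set(itemKey(item.fieldName, item.fieldValue), item);
  }
  return byKey;
}

function docCountsFor(pairs: Array<[string, string]>, items: Item[]): number[] {
  const byKey = indexItems(items);
  const counts: number[] = [];
  for (const [fieldName, fieldValue] of pairs) {
    const match = byKey.get(itemKey(fieldName, fieldValue));
    // push() mutates in place; `[...counts, x]` would copy the array each time.
    counts.push(match ? match.docCount : 0);
  }
  return counts;
}
```

Whether this matters in practice depends on the number of items, so it would be worth profiling before and after.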
### Observability AI Assistant Context
- [ ] Users have the option to view the top field values for each result row for more context on a field and its values. At the moment, the AI Assistant is lacking that context. For example, in one of our example datasets Elasticsearch shows up as a significant term and the user can check top field values to find out that Kibana is another common term of that field. We could fetch top terms for each significant field and make it part of the analysis results being passed on via the callback `onAnalysisComplete()`.
- [ ] The updated AI Assistant makes the prompts visible and stores them in a chat history. We should revisit the prompt texts to make sure they are written in a way we want them to be exposed like that to users.
- [ ] Consolidate code, at the moment there is a lot of code duplication for the contextual insight across usages.
- [ ] "Reset" should reset everything, including the contextual insight. https://github.com/elastic/kibana/pull/186509#discussion_r1647732552
### Telemetry
- [ ] Track the number of times users click the "Smart grouping" button.
- [ ] Track the number of times users click the "Cancel analysis" button

The workflows task list was brought over from this original meta issue #153753.

### Workflows
- [ ] There seems to be an edge case where a link from anomaly detection fails to select the correct spike: ![image](https://github.com/elastic/kibana/assets/230104/d09e768b-9e66-481c-ba58-4775f2fa268f)
- [ ] Store analysis results in local storage to be able to implement working browser back button and restore state. https://github.com/elastic/kibana/issues/146166
- [ ] Analysis management: Offer solution to save and restore analysis as saved objects
- [ ] Allow user to create an Anomaly Detection job from Log Rate Analysis.
- [ ] Offer option to add analysis result to a case
- [ ] Offer an option to run Log Rate Analysis automatically as part of an alert based on a `high_count` anomaly detection job.

elasticmachine commented 3 months ago

Pinging @elastic/ml-ui (:ml)