elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Fleet] Define UX design for information that is displayed when the user clicks on “Agent” to view its details #81872

Open ravikesarwani opened 4 years ago

ravikesarwani commented 4 years ago

Scenario: On the top-level Fleet page (where all Agents are listed), the user sees that a particular Agent is not “Healthy” and clicks the “Agent” to view its details and debug the issue.

The view is designed to detail all the information (or link) to help users debug an issue with the Agent.

The following details need to be captured in the design:

Status: The status of the Agent is broken into the following pieces

When any of the statuses is not healthy, we need to show information that helps the user debug the issue:

Historical view of Agent status (nice to have): The historical view of the Agent status is a nice-to-have for the initial release.

Agent and its sub processes (beats) logs: Logs help users debug issues with Agent.

Agent and its sub processes (beats) metrics: The metrics for the Agent and its sub processes help the user root-cause any performance-related issues. Key questions the user is trying to answer:

This metrics view helps the user understand:

Agent as a supervisor runs other processes (beats) to perform the main tasks.

From a user perspective, it's critical that the metrics can be viewed for the Agent as a whole (by default) and can also be filtered for a specific sub process. This is critical for debugging issues with a specific integration.

The logs and metrics should share the same filter and time picker to speed up the user's investigation of an Agent issue.

CC: @hbharding @mukeshelastic @mostlyjason @ph @nchaulet

ph commented 4 years ago

cc @jen-huang

hbharding commented 4 years ago

Link to wireframes (in progress, will review with team this week)

hbharding commented 3 years ago

Update

cc @ravikesarwani @mostlyjason @mukeshelastic @nchaulet @ph

View Figma file

During 7.11 and 7.12, we will make incremental changes to improve Agent Observability in Fleet. By 7.12, the goal is to provide detailed status information about an Elastic Agent's integrations and its inputs, and to give users the ability to view logs and metrics so they can diagnose and fix specific issues. Some of the more detailed information and functionality won't land until 7.12, but we will be making some UI changes in 7.11 in anticipation of this improved functionality.

I've broken this message into 2 parts for 7.11 and 7.12. There is a lot to unpack, so please ask questions or leave feedback if you have any.


7.11

In 7.11, we'd like to change how we display status information on the Agents table in Fleet. An Agent's status can be one of the following:

Note: if it's not possible to refactor the agent status to match the above, we can reuse our existing statuses but still fit them into the design that I am proposing in the screenshot below.

image

Some of this information was previously shown globally in the header area for this page next to the "Add agent" button. In 7.11, we'd like to change this so that the information is shown directly above the agent table using a colorful "status bar". The status bar shows a breakdown of agent statuses, and it will update its numbers and display based on whatever query or filters are applied above. This will help users understand the breakdown of health statuses for a filtered group of agents. One other small change is how we represent agent status. Previously, we were using the EuiHealth component in the table column, but instead we'd like to use EuiBadge, which allows more emphasis to be placed on the color. Screenshot below:
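
To make the "status bar follows the filters" behaviour concrete, here is a minimal sketch of how the breakdown could be computed from whatever rows the current query returns. The `AgentRow` shape, the `statusBreakdown` helper, and the host names are hypothetical and only for illustration; this is not the actual Fleet code, and it makes no assumption about the final list of statuses.

```typescript
// Hypothetical row shape: only the status field matters for this sketch.
interface AgentRow {
  hostname: string;
  agentStatus: string; // e.g. "healthy" or "unhealthy"; the full status set is not assumed here
}

// Count agents per status for the subset of agents currently visible in the table.
function statusBreakdown(visibleAgents: AgentRow[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const agent of visibleAgents) {
    counts[agent.agentStatus] = (counts[agent.agentStatus] ?? 0) + 1;
  }
  return counts;
}

// Usage: the bar is recomputed from the filtered rows, not from the global totals.
const allAgents: AgentRow[] = [
  { hostname: "host-a", agentStatus: "healthy" },
  { hostname: "host-b", agentStatus: "unhealthy" },
  { hostname: "host-c", agentStatus: "healthy" },
];
const filteredAgents = allAgents.filter((a) => a.hostname !== "host-c");
const breakdown = statusBreakdown(filteredAgents);
```

With the filter applied, `breakdown` only reflects the two remaining agents, which is the behaviour described above for the status bar.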

When a user clicks on a host name from the agents table, they are taken to the agent details page. This page will have a few updates in 7.11:

image


7.12

In 7.12, we intend to extend the UI so that we are able to report status information and metrics for individual integrations. If an integration is labeled as "unhealthy" (which means an error was found for one or more corresponding inputs in the agent logs), this information will extend to the agent's overall health. If all integrations are healthy, then the agent's overall status will be labeled as "healthy". In the expanded state for the list of integrations, we want to list the integration's inputs, and for each input, show the last message received (red if it's an error message), a sparkline indicating the input's event rate over the past hour, and an additional action link to view metrics about this input in an Elastic Agent dashboard. This dashboard does not exist yet and will need to be created for 7.12 (cc @ravikesarwani). The dashboard should include metrics similar to those we show in Stack Monitoring for Beats instances, and (if possible) it should include filter buttons to isolate metrics for a specific integration and input.
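
The roll-up rule described above (input errors make an integration unhealthy, and any unhealthy integration makes the agent unhealthy) could be sketched roughly as follows. All type and function names here are hypothetical, and the sketch assumes some earlier step has already decided whether an input has an error in the agent logs. It covers only the integration roll-up, not any other factor that may affect agent status.

```typescript
type Health = "healthy" | "unhealthy";

// Hypothetical shapes for illustration only.
interface InputReport {
  inputName: string;
  hasErrorInLogs: boolean; // assumption: computed elsewhere from the agent logs
}

interface IntegrationReport {
  integrationName: string;
  inputs: InputReport[];
}

// An integration is unhealthy if one or more of its inputs has an error.
function integrationHealth(integration: IntegrationReport): Health {
  return integration.inputs.some((input) => input.hasErrorInLogs)
    ? "unhealthy"
    : "healthy";
}

// The agent is healthy only if every integration is healthy.
// Assumption: an agent with no integrations is treated as healthy here.
function agentHealth(integrations: IntegrationReport[]): Health {
  return integrations.every((i) => integrationHealth(i) === "healthy")
    ? "healthy"
    : "unhealthy";
}
```

For example, an agent with one clean integration and one integration that has a single failing input would roll up to "unhealthy" under this sketch.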

image

image

mostlyjason commented 3 years ago

Thanks @hbharding!

hbharding commented 3 years ago

Small update: per https://github.com/elastic/kibana/issues/83330, we want to provide a way for the user to change an individual agent's logging level.

I've updated the Figma file and added a select input to change the agent logging level beneath the new logs stream component. This input will only appear for agents >= 7.11.

Kapture 2020-11-23 at 09 53 31

Users can also see the current agent logging level on the agent detail page, where we show agent metadata.

image

See screenshots for step-by-step details

---

**1. Default state with the currently applied logging level**

![image](https://user-images.githubusercontent.com/847805/99978365-da300080-2d73-11eb-9c31-c2a256d8137b.png)

**2. User selects a new logging level. A button appears that says "Apply changes"**

![image](https://user-images.githubusercontent.com/847805/99978446-f5027500-2d73-11eb-813b-c6726acdb27d.png)

**3. After user clicks "apply changes", the select input becomes disabled while the system waits for a response. The button changes to a loading state that says "Applying changes..."**

![image](https://user-images.githubusercontent.com/847805/99981763-cd151080-2d77-11eb-8b43-6e711141e2bb.png)

**4. After a response returns, the UI displays a toast indicating what changed. UI state returns to step 1**

![image](https://user-images.githubusercontent.com/847805/99980471-4875c280-2d76-11eb-8a66-6939b40210f6.png)
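
The four steps above can be read as a small state machine. Below is a rough sketch of that flow, not the actual implementation: the `LogLevelSelector` shape, the function names, the example level strings, and the toast wording are all hypothetical. It also only models a successful response, since the steps above do not describe the failure case.

```typescript
// Hypothetical UI state for the logging level select input.
interface LogLevelSelector {
  appliedLevel: string;  // level currently in effect on the agent
  selectedLevel: string; // level currently shown in the select input
  isApplying: boolean;   // true while waiting for a response
}

// Step 2: choosing a level. Ignored while applying, because the select is disabled then.
function selectLevel(s: LogLevelSelector, level: string): LogLevelSelector {
  return s.isApplying ? s : { ...s, selectedLevel: level };
}

// Step 3: the user clicks "Apply changes". Nothing happens if there is no pending change.
function startApply(s: LogLevelSelector): LogLevelSelector {
  return s.selectedLevel === s.appliedLevel ? s : { ...s, isApplying: true };
}

// Step 4: a response has returned; go back to the default state and produce a toast.
function finishApply(s: LogLevelSelector): { next: LogLevelSelector; toast: string } {
  return {
    next: { appliedLevel: s.selectedLevel, selectedLevel: s.selectedLevel, isApplying: false },
    toast: `Logging level changed to "${s.selectedLevel}"`, // hypothetical wording
  };
}

// Button text for each state; null means the button is hidden (step 1).
function buttonLabel(s: LogLevelSelector): string | null {
  if (s.isApplying) return "Applying changes...";
  return s.selectedLevel !== s.appliedLevel ? "Apply changes" : null;
}
```

Keeping the applied and selected levels as separate fields is what lets the button appear only when there is a pending change.
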
ravikesarwani commented 3 years ago

Here's an example of why "Events Rate" data is so critical for the Agent metrics dashboard. cc: @ph @nchaulet