elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Connectors view improvement #147846

Open shanisagiv1 opened 1 year ago

shanisagiv1 commented 1 year ago

Why? The connector page has been moved to a dedicated page to help users with connector management. On the connector page, users perform initial setups, connector tests, troubleshooting for active connectors, and audits.

What? To support and streamline the above use case, we want to add more data per connector. The long-term solution should be a new & enhanced flow for connector creation and management; the following are improvements for the current state.

How? Add new columns to the connector table:

As a next step, it would also be good to have usage metrics per connector - https://github.com/elastic/kibana/issues/159497

Screen Shot 2022-12-19 at 10 59 12
elasticmachine commented 1 year ago

Pinging @elastic/response-ops (Team:ResponseOps)

doakalexi commented 1 year ago

Connector logs view https://github.com/elastic/kibana/issues/147795 - Adding a Logs tab that allows users to see historical activity of connectors from the event log

pmuellr commented 1 year ago

"Last ~sent~ run action" - A timestamp of the last action that was sent; this will help users understand whether the connector works as expected after the initial config. (e.g., when the last action was a month ago, the user will know that it's not in use or doesn't work)

We could start storing this in the connector SO, or retrieve it with a query to the event log.

"Errors" - # of existing errors with a link to a sidebar with an error list

For a number like this, we need a range - could be number of hours back (from now), or last number of runs. The Connector / Logs page has a time picker that would be appropriate (it's needed for the same reason). If we decide to store the last run action date in the connector SO, in theory we could also store the results of the last 200 runs - maybe times and succeed/fail.
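As a rough illustration of the "store the last 200 runs in the connector SO" idea, here is a minimal sketch of a capped run history. The `RunResult` and `ConnectorRunHistory` shapes and the `recordRun` / `countFailures` helpers are hypothetical names for illustration only, not the actual connector saved-object schema.

```typescript
// Hypothetical shapes; NOT the real Kibana connector saved-object schema.
interface RunResult {
  timestamp: string; // ISO 8601 time of the run
  success: boolean; // true = succeeded, false = failed
}

interface ConnectorRunHistory {
  lastRunAt: string | null; // timestamp of the most recent run, if any
  runs: RunResult[]; // oldest first, capped at maxRuns entries
}

const MAX_RUNS = 200;

// Returns a new history with the run appended and the oldest entries dropped.
// maxRuns is assumed to be >= 1.
function recordRun(
  history: ConnectorRunHistory,
  run: RunResult,
  maxRuns: number = MAX_RUNS
): ConnectorRunHistory {
  const runs = [...history.runs, run].slice(-maxRuns);
  return { lastRunAt: run.timestamp, runs };
}

// Number of failed runs within the stored window.
function countFailures(history: ConnectorRunHistory): number {
  return history.runs.filter((r) => !r.success).length;
}
```

With something like this, both the "last run action" timestamp and an error count over a bounded window could be read straight from the stored object, at the cost of a write on every run.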

"Rules" - # of alerting rules that use the connector. clicking on the number will open a sidebar with the Rules Ids, names and links to the rules. [Nice to have]

For this, we'll need to do a query against the rule saved objects.


Since we need to do a query against the rule saved objects, it seems like we should pull the data for the other columns from the event log with a query (instead of storing it in the connector SO). We can then run the two queries in parallel. This will be some new HTTP route that does the work here - presumably internal for now?
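A minimal sketch of the "two queries in parallel" idea, assuming the two lookups are injected as async functions. `fetchEventLogSummary`, `fetchRuleCounts`, and the `ConnectorStats` row shape are hypothetical placeholders, not existing Kibana APIs.

```typescript
// Hypothetical row shape for the connectors table.
interface ConnectorStats {
  connectorId: string;
  lastRunAt: string | null;
  errorCount: number;
  ruleCount: number;
}

// Hypothetical result types for the two lookups, keyed by connector id.
type EventLogSummary = Map<string, { lastRunAt: string | null; errorCount: number }>;
type RuleCounts = Map<string, number>;

async function getConnectorStats(
  connectorIds: string[],
  fetchEventLogSummary: (ids: string[]) => Promise<EventLogSummary>,
  fetchRuleCounts: (ids: string[]) => Promise<RuleCounts>
): Promise<ConnectorStats[]> {
  // Start both lookups at once and wait for both to finish.
  const [summary, ruleCounts] = await Promise.all([
    fetchEventLogSummary(connectorIds),
    fetchRuleCounts(connectorIds),
  ]);

  // Merge into one row per connector, defaulting missing data to "no activity".
  return connectorIds.map((connectorId) => ({
    connectorId,
    lastRunAt: summary.get(connectorId)?.lastRunAt ?? null,
    errorCount: summary.get(connectorId)?.errorCount ?? 0,
    ruleCount: ruleCounts.get(connectorId) ?? 0,
  }));
}
```

Note that `Promise.all` rejects as soon as either lookup fails; a route that should still render partial data might prefer `Promise.allSettled`.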

pmuellr commented 1 year ago

Thinking about re-arranging the row menu:

mikecote commented 1 year ago

cc @mdefazio, realized you weren't tagged in this issue.

mdefazio commented 1 year ago

"Rules" - # of alerting rules that use the connector. clicking on the number will open a sidebar with the Rules Ids, names and links to the rules. [Nice to have]

Do we need to consider non-rule usage as well? Not sure off-hand how many other places use Connectors at this time (Notification policies for user comments?), but wasn't sure if we wanted to try and plan for this? I don't know if it's worthwhile or not honestly, just posting as a thought.

mdefazio commented 1 year ago

Errors... For a number like this, we need a range - could be number of hours back (from now), or last number of runs.

I think I need some help understanding this requirement a bit more.

Am I wanting to quickly know if a connector is currently failing? So this column would indicate if the last run's response was failure and then show something.

Am I wanting to know if this connector failed recently? This would not always be indicative that the connector is currently failing. Right? If I am viewing this page, currently failing connectors (or rather, failed on last run) should have the highest priority / a very clear indicator. Whereas a connector that failed 50 runs ago, is maybe not a big deal? Trying to understand how to handle this in the UI.

Am I wanting to know a historical count of failures? If so, what is the value of this? Maybe I had issues setting up my connector and it was constantly erroring. So I'm not sure what sorting on this column or a high value in this column would provide me.
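To make the distinction between these readings concrete, here is a small sketch that separates "failed on last run" from "failed somewhere in the recent window". The `ConnectorHealth` labels, the `classifyHealth` helper, and the default window of 50 runs are illustrative assumptions, not an agreed design.

```typescript
// Hypothetical status labels for a connector row.
type ConnectorHealth = "never_run" | "failing_now" | "failed_recently" | "healthy";

// runs: run outcomes, oldest first; true = success, false = failure.
// recentWindow: how many of the latest runs count as "recent" (assumed >= 1).
function classifyHealth(runs: boolean[], recentWindow: number = 50): ConnectorHealth {
  if (runs.length === 0) return "never_run";
  // Highest priority: the most recent run failed.
  if (!runs[runs.length - 1]) return "failing_now";
  // Lower priority: the last run succeeded, but an earlier recent run failed.
  const recent = runs.slice(-recentWindow);
  return recent.some((ok) => !ok) ? "failed_recently" : "healthy";
}
```

A purely historical failure count is deliberately absent from this sketch, since it does not say whether the connector works today.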

pmuellr commented 1 year ago

"Rules" - # of alerting rules that use the connector. clicking on the number will open a sidebar with the Rules Ids, names and links to the rules. [Nice to have]

Do we need to consider non-rule usage as well? Not sure off-hand how many other places use Connectors at this time (Notification policies for user comments?), but wasn't sure if we wanted to try and plan for this? I don't know if it's worthwhile or not honestly, just posting as a thought.

Yes, I would think so, which makes the sidebar more complicated: do all the things have "names"? It was already [Nice to have], so I think we should defer the sidebar anyway. So the column would be something like "objects referencing connector" and just a number?

Listing the referenced objects would be good for a "Connector Details" page (akin to the Rules Detail page), if we ever grow one. Currently we just have the connector list and connector editor.

Errors... For a number like this, we need a range - could be number of hours back (from now), or last number of runs.

I think I need some help understanding this requirement a bit more. ...

Ya, all that :-). We also haven't really talked about how the "Logs" tab aligns here. It currently has some aggregated stats for its current view (filters + time range):

image

When you talk about errors, you'll want to know out of how many executions - or successes, etc. So I think Succeeded/Warning/Failed numbers would all have to be in the table, or you produce a score, like Rules List does with "Success Ratio".
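A minimal sketch of turning Succeeded/Warning/Failed counts into a single score. How the Rules List actually computes "Success Ratio" is not stated here; this version assumes only `succeeded` runs count as successes, and the `OutcomeCounts`, `successRatio`, and `formatRatio` names are hypothetical.

```typescript
// Hypothetical per-connector counts over some time range or run window.
interface OutcomeCounts {
  succeeded: number;
  warning: number;
  failed: number;
}

// Fraction of runs that succeeded, or null when there were no runs at all.
// Assumption: warnings are NOT counted as successes.
function successRatio(counts: OutcomeCounts): number | null {
  const total = counts.succeeded + counts.warning + counts.failed;
  if (total === 0) return null;
  return counts.succeeded / total;
}

// Render the ratio for a table cell.
function formatRatio(ratio: number | null): string {
  return ratio === null ? "n/a" : `${Math.round(ratio * 100)}%`;
}
```

Returning `null` for zero runs keeps a never-used connector from looking like a 0% (always failing) one.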

Also, questions like "is it erroring now, or was it in the past?" can now be answered via the Connector Logs tab. Though we don't provide precise filtering, just a search bar with some filters on status (not connectors themselves).

In terms of the time range that would be agg'd over, the Rules List page generates stats based on the last 200 runs of a rule. That's actually stored IN the rule. Maybe we want to do the same for Connectors?

Otherwise, we can't really agg over the last 200 runs for each rule individually, in ES (hmmm ... maybe with EQL?), but we could do something like last 24hrs.
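A sketch of what a "last 24hrs" aggregation request body might look like: a time-range filter plus a `terms` aggregation per connector with a sub-aggregation on outcome. The `connector_id` field name and the `buildLast24hOutcomeQuery` helper are placeholders; the real event log mapping may differ and could need a different aggregation shape.

```typescript
// Builds a search body that counts run outcomes per connector over the last 24 hours.
// connectorIdField is a PLACEHOLDER field name, not the actual event log mapping.
function buildLast24hOutcomeQuery(
  connectorIdField: string = "connector_id",
  maxConnectors: number = 100
) {
  return {
    size: 0, // only aggregations are needed, no individual documents
    query: {
      bool: {
        // Date math: everything from 24 hours ago until now.
        filter: [{ range: { "@timestamp": { gte: "now-24h" } } }],
      },
    },
    aggs: {
      by_connector: {
        terms: { field: connectorIdField, size: maxConnectors },
        aggs: {
          // One bucket per outcome value within each connector bucket.
          by_outcome: { terms: { field: "event.outcome" } },
        },
      },
    },
  };
}
```

Unlike a "last 200 runs" window, a fixed time range gives each connector a different number of runs, so the counts are best shown alongside a total or as a ratio.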

Actually, the more I look at the Rules List page, guessing we want the Connectors List page to be very similar. Sortable columns, pick-your-own columns, filtering. I think this is the last alerting thing to spiffy up like that.

Or is that more than we need? Presumably customers don't have as many connectors as they do rules, so may not need all these gadgets.

pmuellr commented 1 year ago

Looks like we have a bunch of questions we need answers to, and potentially redesign for time-pickers, and maybe other "upgrades" similar to our other list-y UX's. I'm going to unassign myself and put this back into triage ...

mdefazio commented 1 year ago

@pmuellr Apologies, if I am conflating this issue with UX concerns. Happy to move these questions into a separate space to better focus the two.

Actually, the more I look at the Rules List page, guessing we want the Connectors List page to be very similar. Sortable columns, pick-your-own columns, filtering. I think this is the last alerting thing to spiffy up like that.

I'm guessing grouping would also be of high value here? Meaning, grouping the table by connector type.

Or is that more than we need? Presumably customers don't have as many connectors as they do rules, so may not need all these gadgets.

I think this is an important question. I know we've been down this road before, but the data grid experience is really optimized for large datasets. If we need column selection and multiple sorting/filtering options, then perhaps we prioritize doing this on the current table. I believe it's already been explored in some fashion.

pmuellr commented 1 year ago

Apologies, if I am conflating this issue with UX concerns. Happy to move these questions into a separate space to better focus the two.

Heh, not your fault. I was focusing on just the error count thing mainly, but not sure where this might fit in the bigger picture of re-doing the connector UX (which the top comment seems to hint at). I'll set up a chat for this ... :-)

mdefazio commented 1 year ago

Here's an updated mockup with the following changes. It should still be considered a draft, as I'm not 100% sure about some of these changes, but I'm looking for feedback regardless.

image

image

mdefazio commented 1 year ago

Hi, was curious if there was feedback on the previous mockup and where this stood in the queue. If it's something you all would like to work on soon, let me know and I'll try and make any edits before I'm out on vacation.

mikecote commented 1 year ago

Hi, was curious if there was feedback on the previous mockup and where this stood in the queue. If it's something you all would like to work on soon, let me know and I'll try and make any edits before I'm out on vacation.

@mdefazio we're going to revisit the priority of this issue for 8.10. Feel free to de-prioritize it among other things.

cnasikas commented 3 months ago

@shanisagiv1 Should we prioritize the work described in the issue?

shanisagiv1 commented 3 months ago

not atm