elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[APM] Show related errors on transactions in the Timeline view #29688

Closed formgeist closed 5 years ago

formgeist commented 5 years ago

Summary

Now that we've implemented related errors in the Transaction sample header, the next step is to show, for each transaction in the Timeline view, a related errors count and perhaps a highlighted state indicating that there are errors related to that transaction.

Design

The suggested design implementation consists of the following additions/changes:

Transaction header

Consolidate Errors within the second row. Decrease space for Result, so the new division is;

Timeline

[Image: 00 Timeline enhancements]

[Image: Kapture 2019-03-12 at 10 04 33]

elasticmachine commented 5 years ago

Pinging @elastic/apm-ui

formgeist commented 5 years ago

@zubeio/apm-ui For your consideration, I've made a design proposal for the next step in displaying the related errors on the individual transaction rows in the Timeline. I understand this means querying related errors for all transactions in the timeline; not sure how you feel about that.
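One way to keep that cost down is to fetch the errors for the whole trace once and count them per transaction on the client, rather than querying per row. The following is a minimal sketch under that assumption; the `ErrorDoc` shape, its `transactionId` field, and the `countErrorsByTransaction` helper are hypothetical and not taken from the Kibana codebase.

```typescript
// Hypothetical, simplified shape of an error document.
// Real APM error documents carry many more fields.
interface ErrorDoc {
  transactionId?: string; // assumed: id of the transaction the error occurred in
}

// Count errors per transaction id in a single pass, so each timeline row
// can look up its badge count without issuing its own query.
function countErrorsByTransaction(errors: ErrorDoc[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const error of errors) {
    const id = error.transactionId;
    if (id === undefined) continue; // error not tied to any transaction
    counts[id] = (counts[id] ?? 0) + 1;
  }
  return counts;
}

// Example: two errors on transaction "a", one on "b", one unrelated.
const errorCounts = countErrorsByTransaction([
  { transactionId: "a" },
  { transactionId: "b" },
  { transactionId: "a" },
  {},
]);
```

A row would then render a badge only when its transaction id has an entry in the resulting lookup.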

jasonrhodes commented 5 years ago

First quick question: Are we doing the "single line per item" design still?

jasonrhodes commented 5 years ago

lol wrong button, sorry :P

formgeist commented 5 years ago

@jasonrhodes I would very much like to do a single line row, but it's not a must-have for this feature to move ahead. I can make an example of how it might look without that design, if it helps.

jasonrhodes commented 5 years ago

> @jasonrhodes I would very much like to do a single line row, but it's not a must-have for this feature to move ahead. I can make an example of how it might look without that design, if it helps.

Until that ticket is better scoped/understood, it might be good to visualize what things will look like without those changes. Last we looked, it was going to be pretty hard to pull off.


I like the "badge" look for error counts, it works well imo.

My first thought is "there is so much information in this view". Even before this change: the HTTP status code, the transaction name, something in parens? (e.g. (3b), (20), etc, I forget what this is?), ms duration, icon, now error count ... is all of this still useful for "at a glance"? I feel like error count might be the most useful after the transaction name, but it's probably hard to give it emphasis right now...

Another option might be to leave icon / name / optional error badge and when you hover over any line, a popover shows all of the info for just that line in a "quick view" pattern (result, name, duration, link to errors, link to view more details which opens the flyout). Maybe none of this is necessary right now but it does seem like we may keep thinking of more things we want to show up here.

At the very least, maybe down the road we could consider a "Waterfall settings" dialog that lets users turn off certain features, and save their preferences in local storage or kibana saved objects or whatever makes sense.
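A minimal sketch of how such preferences might be persisted, assuming local storage is the chosen backend. The `WaterfallSettings` fields, the storage key, and the helper names are all hypothetical; `KeyValueStore` mirrors the `getItem`/`setItem` subset of the Web Storage API so that `window.localStorage` could be passed in a browser.

```typescript
// Hypothetical settings shape; the field names are illustrative only.
interface WaterfallSettings {
  showDuration: boolean;
  showStatusCode: boolean;
  showErrorCount: boolean;
}

// Subset of the Web Storage API: window.localStorage satisfies it in a
// browser, and an in-memory stub satisfies it elsewhere.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SETTINGS_KEY = "apm.waterfallSettings"; // assumed key name

const defaultSettings: WaterfallSettings = {
  showDuration: true,
  showStatusCode: true,
  showErrorCount: true,
};

function loadSettings(store: KeyValueStore): WaterfallSettings {
  const raw = store.getItem(SETTINGS_KEY);
  if (raw === null) return { ...defaultSettings };
  try {
    // Merge over the defaults so settings added later still get a value.
    return { ...defaultSettings, ...JSON.parse(raw) };
  } catch {
    return { ...defaultSettings }; // unparsable value: fall back to defaults
  }
}

function saveSettings(store: KeyValueStore, settings: WaterfallSettings): void {
  store.setItem(SETTINGS_KEY, JSON.stringify(settings));
}

// In-memory stand-in for localStorage, used to keep the sketch runnable.
function createMemoryStore(): KeyValueStore {
  const data: Record<string, string> = {};
  return {
    getItem: (key) => {
      const value = data[key];
      return value === undefined ? null : value;
    },
    setItem: (key, value) => {
      data[key] = value;
    },
  };
}

// Example: the user turns off the duration label, then the view reloads.
const store = createMemoryStore();
saveSettings(store, { ...loadSettings(store), showDuration: false });
const restored = loadSettings(store);
```

Kibana saved objects would need a different backend behind the same two functions; that part is not sketched here.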

formgeist commented 5 years ago

> Until that ticket is better scoped/understood, it might be good to visualize what things will look like without those changes. Last we looked, it was going to be pretty hard to pull off.

That's fair - I can mock an example of what it might look like based on its current state.

> My first thought is "there is so much information in this view". Even before this change: the HTTP status code, the transaction name, something in parens? (e.g. (3b), (20), etc, I forget what this is?), ms duration, icon, now error count ... is all of this still useful for "at a glance"? I feel like error count might be the most useful after the transaction name, but it's probably hard to give it emphasis right now...
>
> Another option might be to leave icon / name / optional error badge and when you hover over any line, a popover shows all of the info for just that line in a "quick view" pattern (result, name, duration, link to errors, link to view more details which opens the flyout). Maybe none of this is necessary right now but it does seem like we may keep thinking of more things we want to show up here.
>
> At the very least, maybe down the road we could consider a "Waterfall settings" dialog that lets users turn off certain features, and save their preferences in local storage or kibana saved objects or whatever makes sense.

The stuff in parentheses is just my reference labels for the distributed tracing example I was given, used to get the appropriate nesting on the left side.

I think you're right that we're displaying too much information up front. As you suggested, there are a number of ways to decrease the information overload. I imagine something like:

Let me mock up examples, and perhaps even supply an interactive prototype so we can test it out.

formgeist commented 5 years ago

Made some mocks and a prototype of some changes I imagine would make the information overload a little less intense in the Timeline.

[Image: 00 timeline enhancements]

[Image: kapture 2019-02-27 at 10 30 36]

jasonrhodes commented 5 years ago

Wow, this makes the errors and icons stand out so much better IMO! I really like the duration appearing on hover, too. Great stuff.

jasonrhodes commented 5 years ago

I wonder if the transaction icon should be contained in the HTTP status code error "pill" anymore. That error pill could move to where the error badge is so you get something like:

⇄ GET * [5xx] (5)

so the icon / name pair are always up front and all the error info is at the end? Mostly just a visual pattern thing but not a huge deal if there's a reason you like having that status code connected to the icon.

formgeist commented 5 years ago

Hmm, that's a good suggestion. I was in favor of keeping the result close to the transaction icon, but I can see your point. Perhaps I'm also worried that the badge and the error count facet will clash too much (especially colour-wise), so it's nicer to keep them separated. I can try it out and see how it looks. Thanks for the input 👍

formgeist commented 5 years ago

So that order would look something like this:

[Image: screenshot 2019-02-27 at 12 59 03]

formgeist commented 5 years ago

Worst span row, status code and error count colour combination possible, which I guess is good 😄

jasonrhodes commented 5 years ago

Yeah might be good to pull red out of the possible timeline duration bar colors? Then if you want you could just make http error code the same red as the error color, just "something went wrong red", or leave it the way it is, but it'd be nice to reserve red for errors probably.

formgeist commented 5 years ago

Yeah, I already considered pulling out the ones we use for 4xx and 5xx in TPM, and this maroon is also too close to the danger colour. Still not sure which ordering I prefer; let's see if we get some more feedback from the others.

formgeist commented 5 years ago

Updated the description with new mocks and a GIF that demos the show duration on hover.

formgeist commented 5 years ago

@makwarth Thoughts on this design proposal for showing related errors in the Timeline?

makwarth commented 5 years ago

@formgeist I think it looks good. This will be a great addition. Few comments:

formgeist commented 5 years ago

Thanks @makwarth - I agree that we can remove the vertical indication for each row. A few comments on your points:

> "(2) Related errors" relates to the active transaction sample, right? If yes, shouldn't this info be under the "Result" label? Since this is a RUM transaction, the result label value would be "N/A. 2 related errors." We could also create a new label "Errors".

Yes, that replaces the existing Errors: View 2 errors in the StickyProps header to make for a more consistent look and feel for the related errors component. RUM I guess is special because it would not return a transaction.result, so either we should reconsider if we always show Result for RUM or optionally for any language in the StickyProps. Am I missing something?

> Nitpick: I like the span error count badge, but is it odd that we have rectangular label for status code and round label for error count? Should they both be rectangular?

I think the difference makes them stand out from each other. Let me provide an example of making them rects like the badges and see which we prefer.

formgeist commented 5 years ago

Here's some revised designs based on the feedback:

[Image: 00 timeline enhancements]

[Image: 00 timeline enhancements copy]

makwarth commented 5 years ago

I definitely think the related errors should be in the transaction sample header

> we should reconsider if we always show Result for RUM

++. This is just something we never got around to re-evaluating as we were trying to keep the UI as generic as possible. However, it's obvious that especially RUM will differ in many ways from the backend services UI, so might as well get to it.

formgeist commented 5 years ago

OK, so let's move forward with the latest mocks (I'll update the description and convert to an implementation issue). In regards to the RUM amendments, I reckon that should be part of a bigger UI review where we can take more elements into account. I know @roncohen already started a proposal for changing the transactions list so it has some more logical navigation when you're analyzing RUM data https://github.com/elastic/kibana/issues/26544 - I'll start an issue to begin collecting these ideas and changes.

formgeist commented 5 years ago

Updated description with new design proposal and run-down of tasks involved. Moving this to implementation.