elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.65k stars 8.23k forks source link

[APM] Show errors on the timeline instead of under the transaction #45619

Closed roncohen closed 4 years ago

roncohen commented 5 years ago

Summary

Today we show errors with a marker on the transaction in which they happened. It would be useful to show them on the timeline at the correct time at which they happened instead. There are a couple of different options for design we should investigate. For example, do we know in which span the error happened? if we don't, how do we know at which vertical hight on the waterfall to show them?

Design solution

We want to show the errors as marks on the x-axis of the Timeline. We'll be able to leverage existing marks UI implementation to display a mark with a popover with details.

Marks implementation

Figma prototype

Kapture 2019-11-28 at 14 07 42

01 Trace sample with errors on x-axis

ErrorPopover-breakdown

Other related enhancements

elasticmachine commented 5 years ago

Pinging @elastic/apm-ui

formgeist commented 5 years ago

I've picked this up and created an early mock of what I might envision adding the errors to the x-axis would look like and what information we can show when hovering those indicators.

Trace sample with errors on x-axis

Thoughts?

formgeist commented 5 years ago

Based on some feedback in the Observability design sync yesterday, I've added the service name into the popover display for the errors. This will help when viewing a trace with multiple services in understanding in which service the error is happening.

Example_ Trace sample with errors on x-axis (with tooltip)

sorenlouv commented 5 years ago

What happens if several errors happen around the same time? Should they be grouped in a single pop-over to avoid them overlapping?

sorenlouv commented 5 years ago

nit: In the screenshot it shows "1 error" but two error overlays. In a real scenario it would show "2 errors"?

formgeist commented 5 years ago

nit: In the screenshot it shows "1 error" but two error overlays. In a real scenario it would show "2 errors"?

Ofc, please imagine it says "2 errors" ๐Ÿ˜…

What happens if several errors happen around the same time? Should they be grouped in a single pop-over to avoid them overlapping?

Yeah - I don't recall if we managed to implement this grouping for our marks implementation, but they should definitely align in behavior.

sorenlouv commented 5 years ago

I don't recall if we managed to implement this grouping for our marks implementation, but they should definitely align in behavior.

We had some logic that would adjust the position of marks so the didn't overlap. This was not a good solution since it would place marks out of place, and it was removed again. We talked about implement grouping but we never got around to that.

formgeist commented 5 years ago

OK, I think this design relates to the marks implementation, so perhaps we can solve it in this iteration for both cases. I'll make a design for it ๐Ÿ‘

formgeist commented 5 years ago

Made some additional changes based on the feedback. I thought it'd be a good idea to reduce the visuals slightly. I removed the culprit line, but added an actual timestamp for the x-axis value so it's clearer to see at which specific point in the timeline the exception happened. Grouped exceptions also available in the popover.

Kapture 2019-11-01 at 9 57 32

Example_ Trace sample with errors on x-axis (with tooltip)

formgeist commented 5 years ago

Here's an update design after a demo I did a few weeks back wherein we agreed to just display each error in its own, so no longer grouping them, and I've updated the mark to be aligned with our current mark dot but given a red colour.

Kapture 2019-11-20 at 13 12 14

Example_ 01 Trace sample with errors on x-axis

I personally think this is a good start and after we can make the last improvements once we have the errors on the x-axis.

sorenlouv commented 5 years ago

we agreed to just display each error in its own, so no longer grouping them

This is definitely simpler to implement (which I like) but won't this be slightly problematic? I can imagine errors often come in clusters (an error is caught, reported and re-thrown) so several errors can happen almost at the same time. The result being that there'll be several overlapping marks. If we are fine with that behaviour I'm good with this ๐Ÿ‘

formgeist commented 5 years ago

I totally agree, which is also why I had a design going for grouping the errors. I think the conversation around how the grouping would work and how many errors we'd eventually be able to display in a grouped way (at least in my initial designs) needed additional design when there were more than 3 errors available within a certain time period of the x-axis and possibly have an option to go to the errors overview (filtered by trace ID?) for each service that had errors thrown. So we decided to go for the simple now, and if we find that we have a lot of these consecutive errors grouping together and ruin the UX, we can improve. How does that sound?

sorenlouv commented 5 years ago

if we find that we have a lot of these consecutive errors grouping together and ruin the UX, we can improve. How does that sound?

The iterative way, I like it! ๐Ÿ‘

formgeist commented 4 years ago

Does anyone have any objections in me moving this into implementation? We can make adjustments as we have something to look at and demo.

roncohen commented 4 years ago

hey @formgeist. Thanks for pushing this forward. I think it's a good approach to skip grouping for now. Any particular reason to go with the round dot? Did you try a square marker? Maybe it's just me, but i think the red circle on the line looks a bit out of place. Round dots for time measurements and red squares for errors?

formgeist commented 4 years ago

@roncohen I tried differently treatments to the error marks, but decided to keep it circular like the other x-axis marks we have (currently only timings for RUM). I went back and made a mock of how it might look if the errors were treated with a square, and I can see the benefit in differentiation.

01 Trace sample with errors on x-axis

Do you agree?

roncohen commented 4 years ago

Maybe a square with round corners?

Just kidding. That looks good to me

On Thu, 28 Nov 2019 at 13.47, Casper Hรผbertz notifications@github.com wrote:

@roncohen https://github.com/roncohen I tried differently treatments to the error marks, but decided to keep it circular like the other x-axis marks we have (currently only timings for RUM). I went back and made a mock of how it might look if the errors were treated with a square, and I can see the benefit in differentiation.

[image: 01 Trace sample with errors on x-axis] https://user-images.githubusercontent.com/4104278/69807419-8a147180-11e5-11ea-9d49-a9ca6a77390b.png

Do you agree?

โ€” You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/elastic/kibana/issues/45619?email_source=notifications&email_token=AAAAF2HS46BGA5BHZHHCBFTQV64U5A5CNFSM4IWOD262YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFMQHLQ#issuecomment-559481774, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAAF2BKZ5UP2RLPUHMN7UDQV64U5ANCNFSM4IWOD26Q .

formgeist commented 4 years ago

I've updated the description with the new design and specifications and moved it to the implementation board ๐Ÿ‘