grafana / faro-web-sdk

The Grafana Faro Web SDK, part of the Grafana Faro project, is a highly configurable web SDK for real user monitoring (RUM) that instruments browser frontend applications to capture observability signals. Frontend telemetry can then be correlated with backend and infrastructure data for full-stack observability.
https://grafana.com/oss/faro/
Apache License 2.0

Improve how to work with multiple requests from the same page load #558

Open jcarres-mdsol opened 2 months ago

jcarres-mdsol commented 2 months ago

Description

I am not sure if this is a bug or a feature request.

Currently, visiting a URL creates one documentLoad trace. This trace has many spans for resource loads such as JavaScript and CSS files. This looks great.

Screenshot 2024-04-13 at 10 57 24 AM

But this same page then makes many calls to the backend; each call retrieves the information for a particular widget, of which we have 20 on the page. Each of these calls is its own trace, with a single span, because we have yet to instrument the backend. This is very difficult to work with. In Grafana/Tempo we need to click the traces one by one to see the URL; even if we use select(span.http.url), the UI does not have enough real estate to show us the relevant information. With so many requests in flight, it is difficult to tell which page each one comes from.

It is also difficult to know which request comes from which user. This is what it looks like:

Screenshot 2024-04-13 at 11 02 40 AM

Each of these contains 1 span. I imagine that if the backend were instrumented, each would also contain a number of spans from backend services.

Would it make more sense for all these calls to show up in the documentLoad trace? If not, and probably not, how can we connect them so we know they are part of the same page load?

codecapitano commented 2 months ago

@jcarres-mdsol just a heads-up

With the next Faro release we will remove the OTel document-load instrumentation from the default instrumentations in the web-tracing package.

This is to reduce the amount of data sent. Faro brings its own related instrumentation, so the data is redundant. It's easy to add the instrumentation back manually.

jcarres-mdsol commented 2 months ago

Well, in our case we really want the web tracing. Let me give you an example: the two spans on top come from RUM (a different library). The point here is that I can see the click in the browser, then see its effect in my backend, and thanks to the trace I get to see the DB modifications that eventually happened, or whatever other level of detail I need.

Screenshot 2024-04-16 at 9 19 23 AM

To me this is always going to be the view I'll use to check individual actions in the browser. I'll never use the logs because I can't see them in context.

OTOH, at this point the logs are necessary because Loki can compute arbitrary metrics from them. So when I want to see an aggregation (how many errors per page, for instance) I need to work with logs. To me, if/when Tempo is able to do these arbitrary aggregations from spans as well, there will be no point in the logs.

Also, I think TTFB, INP, etc. could at some point be events on a span. But again, right now Tempo does not even allow searching for events in spans, much less aggregating on them, so the current log-based solution will stay for a long time.

codecapitano commented 2 months ago

Hey @jcarres-mdsol this is very good and useful feedback. Thank you. I'll share this with the rest of the team as well.

Cheers, Marco

codecapitano commented 2 months ago

To make sure this does not lead to misunderstandings: we will of course keep the web-tracing package, but remove the redundant DocumentLoadInstrumentation.

It's very easy to add it back, and to add even more OTel instrumentations, via the options object of the TracingInstrumentation class.

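import { TracingInstrumentation } from '@grafana/faro-web-tracing';
import { DocumentLoadInstrumentation } from '@opentelemetry/instrumentation-document-load';

// Entry in the `instrumentations` array passed to initializeFaro
new TracingInstrumentation({
  instrumentations: [
    new DocumentLoadInstrumentation(),
  ],
}),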
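For context, here is a rough sketch of where that entry could sit in a full setup. The collector URL and app name are placeholders, and it assumes that passing `instrumentations` replaces the web-tracing defaults, which is why `getDefaultOTELInstrumentations()` is spread back in; please check this against the Faro version you use.

```typescript
import { getWebInstrumentations, initializeFaro } from '@grafana/faro-web-sdk';
import {
  getDefaultOTELInstrumentations,
  TracingInstrumentation,
} from '@grafana/faro-web-tracing';
import { DocumentLoadInstrumentation } from '@opentelemetry/instrumentation-document-load';

initializeFaro({
  // Placeholder values: replace with your own collector endpoint and app info.
  url: 'https://collector.example.com/collect',
  app: { name: 'my-frontend', version: '1.0.0' },
  instrumentations: [
    // Faro's own default web instrumentations (errors, web vitals, ...).
    ...getWebInstrumentations(),
    new TracingInstrumentation({
      instrumentations: [
        // Assumption: a custom list replaces the defaults, so keep them explicitly.
        ...getDefaultOTELInstrumentations(),
        // Re-add the document-load spans that are no longer on by default.
        new DocumentLoadInstrumentation(),
      ],
    }),
  ],
});
```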

Hint: If you do not use Grafana Cloud and always use the otlp-http-transform, you can also remove (or simply not load) any Faro-specific default instrumentation you don't need. This reduces both the bundle size and the data sent by Faro.

jcarres-mdsol commented 2 months ago

I think we may have lost the thread of this issue as I intended it.

Let's say I log into my system and I'm in a home page with multiple widgets. Let's say they use React and call some API to render.

This will create multiple fetch log entries and multiple traces. Each fetch gets its own trace ID.

Each of these traces will then contain 1 browser span plus whatever happens in the backend. This is good for seeing some browser info in the context of the backend call, but it is of limited value for understanding how the page renders. We end up with a bunch of separate traces, and it is impossible to see the whole picture from them.
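To make the problem concrete, here is a minimal sketch of one way the separate traces could be tied together: generate an ID once per page load and stamp it on every fetch span as an attribute. The attribute names and helper functions below are hypothetical, not part of Faro or OpenTelemetry. The assumption is that the returned hook could be passed as the `applyCustomAttributesOnSpan` option of the OTel fetch instrumentation, after which something like `{ span.app.page_load.id = "<id>" }` in TraceQL should list all traces from one page load.

```typescript
// Minimal stand-in for the part of an OpenTelemetry span used here.
interface SpanLike {
  setAttribute(key: string, value: string): unknown;
}

// Hypothetical attribute names; pick whatever fits your conventions.
const PAGE_LOAD_ID_KEY = 'app.page_load.id';
const PAGE_PATH_KEY = 'app.page.path';

// Simple random ID, meant to be generated once per page load.
function newPageLoadId(): string {
  return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
}

// Returns a hook that stamps every span it receives with the same page-load ID.
function createPageLoadTagger(pageLoadId: string, pagePath: string) {
  return (span: SpanLike): void => {
    span.setAttribute(PAGE_LOAD_ID_KEY, pageLoadId);
    span.setAttribute(PAGE_PATH_KEY, pagePath);
  };
}

// Usage with a fake span, to show the effect without any SDK loaded.
const recorded: Record<string, string> = {};
const fakeSpan: SpanLike = {
  setAttribute(key, value) {
    recorded[key] = value;
  },
};

const tagSpan = createPageLoadTagger(newPageLoadId(), '/home');
tagSpan(fakeSpan);
```

This does not merge the fetches into the documentLoad trace; it only gives every trace from the same page load a shared attribute to search on.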

Some alternative solutions: