w3c / charter-webperf

Web Performance Group charter
https://w3c.github.io/charter-webperf/

Improve RUM observability and monitoring of single page apps #27

Closed: igrigorik closed this issue 6 years ago

igrigorik commented 7 years ago

(discussed at TPAC 2016; rough notes and summary of discussed ideas below)

SPAs make automated monitoring tricky: with full page navigations/reloads, automated solutions have clear markers for the start/finish of major user interactions, whereas within a single-page app it's often very hard to automatically identify when such transitions occur. The emerging pattern is to listen for any pushState changes and to custom-instrument popular frameworks to hook into major lifecycle events.


Question: is there anything we (webperf group) could or should do to assist with the above?

In no particular order, some ideas and points that were discussed:

  1. Browsers could automatically emit a PerformanceEntry into the timeline on pushState and other (?) "meaningful" transitions.
    • It's not clear how much value there is here, as the browser is not in a significantly better position to identify start/end of "meaningful" user interactions.
  2. Browsers recently started surfacing User Timing marks in their developer tools. Anecdotally, this has led to a noticeable uptick in its use, which is a good indication that aligning incentives between local devtools and RUM monitoring pays good dividends.
    • Perhaps we can define a user timing mark format -- a {well-known-prefix}-{tbd-format} convention for SPAs -- that would allow app and framework authors to decorate their applications with meaningful transitions (see the sketch after this list). In turn, devtools could use such marks to decorate the local timeline to assist with local development, and RUM solutions could harvest the same data in production. Chrome's traceviewer did this last year; we should investigate what they did.
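To make the convention idea concrete, here is a minimal sketch of what application code might emit. The spa- prefix, the phase suffixes, and router.navigate are all hypothetical placeholders, not an agreed format or a real framework API:

// Hypothetical convention: {well-known-prefix}-{transition-name}:{phase}
performance.mark('spa-route-change:start');

// router.navigate stands in for whatever hook a given framework exposes.
router.navigate('/cart').then(function() {
  performance.mark('spa-route-change:end');
  performance.measure('spa-route-change',
      'spa-route-change:start', 'spa-route-change:end');
});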
igrigorik commented 7 years ago

@paulirish @nicjansma @bluesmoon fyi, would love to hear your thoughts. It'd also be really great to pull in some framework developers to see if there is interest and/or previous or existing work that we should consider.

nicjansma commented 7 years ago

(1) Some sort of PerformanceEntry for pushState could be useful. To monitor pushState and replaceState today, we have to monkey patch them, which isn't ideal.
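For reference, a minimal sketch of that monkey-patching approach as one might write it today (the spa:navigation event name is made up for illustration):

// Wrap history.pushState/replaceState so a monitoring script can observe
// SPA "soft navigations" -- the pattern RUM tools rely on today.
['pushState', 'replaceState'].forEach(function(method) {
  var native = window.history[method];
  window.history[method] = function() {
    var result = native.apply(this, arguments);
    window.dispatchEvent(new CustomEvent('spa:navigation', {
      detail: {method: method, url: location.href}
    }));
    return result;
  };
});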

In fact, a PerformanceEntry helps in another way -- when third-party monitoring solutions load "late" (e.g. after the thing they want to monitor), they miss the events if they're not logged elsewhere. Having a PerformanceEntry means we can review history to check on things that already happened.

(2) For boomerang.js, we listen to SPA framework events, pushState/replaceState, and XHRs to guess when an "important" user interaction begins so we can do additional monitoring for soft navigations. I like the idea of a standard/convention for mark names that frameworks could opt in to, to help us with this. Or, the monitoring solutions themselves could fire them if the framework doesn't support it.

(Though listening for UserTiming events will really require global PerformanceObserver support so we're not polling performance.getEntriesByName)
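For reference, a sketch of observing marks via PerformanceObserver, with feature detection and a buffer-based fallback. The mark name is the hypothetical one from the sketch above; this is not boomerang's actual code:

if (window.PerformanceObserver) {
  // Entries are delivered as they are recorded -- no polling required.
  var observer = new PerformanceObserver(function(list) {
    list.getEntries().forEach(function(entry) {
      console.log(entry.entryType, entry.name, entry.startTime, entry.duration);
    });
  });
  observer.observe({entryTypes: ['mark', 'measure']});
} else if (window.performance && performance.getEntriesByName) {
  // Fallback: read (or poll) the User Timing buffer directly.
  var marks = performance.getEntriesByName('spa-route-change:start');
}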

bluesmoon commented 7 years ago

I wonder if we need some kind of "ResourceGroup" definition to separate out resources requested as part of parallel SPA interactions. For example, if two separate sections of the page need to be updated and the update happens in parallel, it's very hard right now for us to determine which resources belong to which subsection.

It would be good if we could give framework authors the ability to mark resources as belonging to a certain group when they are requested. Not sure how this would be done with resources requested via injecting innerHTML into the page though.
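Absent platform support, one convention-based approximation is to have the framework tag each request with the group that issued it and then join those tags against ResourceTiming entries by URL. A rough sketch with hypothetical helper names, which (as noted) still misses resources injected via innerHTML:

var resourceGroups = {};

// Hypothetical helper a framework could route its fetches through.
function fetchForGroup(group, url, options) {
  (resourceGroups[group] = resourceGroups[group] || []).push(url);
  return window.fetch(url, options);
}

// Later: pull the ResourceTiming entries that belong to a given group.
// (Assumes absolute URLs so they match entry.name exactly.)
function entriesForGroup(group) {
  var urls = resourceGroups[group] || [];
  return performance.getEntriesByType('resource').filter(function(entry) {
    return urls.indexOf(entry.name) !== -1;
  });
}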

paulirish commented 7 years ago

I'm a huge fan of Entries added for pushState() and the popstate event. The latter is observable, so it might not be necessary, but the former is key information and otherwise unobservable.

Chrome DevTools is so ready to add support for #2. Once we have a modicum of consensus on naming, we can ship an experiment to try things out.

cc @mjackson for insight as a framework author (mjackson/history and ReactTraining/react-router). How do the two proposals up top sound to you?

igrigorik commented 7 years ago

(1) Some sort of PerformanceEntry for pushState could be useful. To monitor pushState and replaceState today, we have to monkey patch them, which isn't ideal.

@nicjansma is the code public / do you have any pointers?

PerformanceEntry helps in another way -- when third-party monitoring solutions load "late" (e.g. after the thing they want to monitor), they miss the events if they're not logged elsewhere. Having a PerformanceEntry means we can review history to check on things that already happened.

That's not true as of today. Assuming such events are emitted via PerformanceObserver, you have to register the observer to get the notifications. We've had a few discussions in the past about ~buffering such events before onload, but we don't have a resolution on that.

(Though listening for UserTiming events will really require global PerformanceObserver support so we're not polling performance.getEntriesByName)

You can avoid polling if you feature detect PerformanceObserver -- correct?


It would be good if we could give framework authors the ability to mark resources as belonging to a certain group when they are requested. Not sure how this would be done with resources requested via injecting innerHTML into the page though.

That's an interesting idea. However, I can't think of any obvious way to enable this on a per-resource level. The crude version would be to divide the timeline into separate epochs, but that doesn't address overlapping cases and is arguably achievable with a mark convention.


I'm a huge fan of Entries added for pushState() and the popstate event. The latter is observable, so it might not be necessary, but the former is key information and otherwise unobservable.

@paulirish I'm not too familiar with all the various events, but what about hashchange? Stepping back, I'd love to see someone put together a library that captures all of these events -- I think it would help clarify what we can and cannot capture, and whether perf timeline integration is necessary.

Chrome DevTools is so ready to add support for #2. Once we have a modicum of consensus on naming, we can ship an experiment to try things out.

Cool!

Re-reading through the catapult docs, the convention they settled on is pretty simple:

group-name:sub-task-name/optional-base64-args

I'm not sure how I feel about the base64 args yet (or whether we need them), but the group:task pattern seems straightforward and useful. With that in mind, focusing on the latter: if we use that convention to group metrics, do we need to also establish additional conventions for the group names? E.g. a spa-{} prefix -- what would that give us?
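Concretely, a mark following that catapult-style convention might look like this (the group and task names are illustrative only):

// group-name:sub-task-name/optional-base64-args
var args = btoa(JSON.stringify({route: '/cart', trigger: 'click'}));
performance.mark('spa:route-change/' + args);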

mjackson commented 7 years ago

Thanks for the ping, @paulirish. I'll do my best to be useful :)

It's not clear how much value there is here, as the browser is not in a significantly better position to identify start/end of "meaningful" user interactions.

I agree with you here, @igrigorik. pushState/replaceState and popstate are too low-level to be able to derive any meaningful user interactions from them.

Perhaps we can define a user timing mark format

That sounds like a reasonable approach. Can you give me some examples of names that you might expect to see? That would help me better understand what you're suggesting. Maybe something like these?

There are a host of other things that people traditionally need to do when building stuff on the history API. It usually boils down to one of:

It may be interesting to consider additional marks for these categories of tasks.

philipwalton commented 7 years ago

PerformanceEntry helps in another way -- when third-party monitoring solutions load "late" (e.g. after the thing they want to monitor), they miss the events if they're not logged elsewhere. Having a PerformanceEntry means we can review history to check on things that already happened.

@igrigorik I could be misinterpreting what @nicjansma meant in his comment, but I think this issue is relevant to third-party analytics solutions that want to help SPAs (or any site, really) track User Timing metrics. I'm specifically referring to the issue brought up in https://github.com/w3c/user-timing/issues/3 that relates to adding additional metadata to performance entries.

Let me offer an example to help make the point. Suppose I'm writing a third-party analytics library that allows developers to register start and end mark names and automatically measure and report the duration between them to an analytics back end. In this example I asynchronously load my analytics library and once it's loaded I tell it to measure and report the duration between pagefetch:start and pagefetch:end marks.

The problem is if this library takes a while to load, it's possible the user will have made multiple page fetches and now my analytics library can't know which fetch duration is associated with which page request (and thus which page path).

In the discussion of https://github.com/w3c/user-timing/issues/3 you suggest that it'd be easy for libraries to wrap the mark and measure methods to add this metadata. This is true for first-party libraries but not for third party libraries loaded asynchronously, and we wouldn't want developers to resort to loading their analytics libraries upfront just to handle this use case.

I think for any sort of generic, platform-based marks on pushState, it's an absolute must to at least know the page path associated with each mark, and if we're adding that metadata, I think reopening the discussion on adding generic object metadata to performance entries makes sense.
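In other words, something along these lines, where the second argument is the hypothetical metadata extension under discussion rather than a shipped API:

// Attach the page path so a late-loading analytics script can attribute
// each pagefetch duration to the right soft navigation.
performance.mark('pagefetch:start', {data: {path: location.pathname}});
// ... fetch and render the new view ...
performance.mark('pagefetch:end', {data: {path: location.pathname}});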

nicjansma commented 7 years ago

(1) Some sort of PerformanceEntry for pushState could be useful. To monitor pushState and replaceState today, we have to monkey patch them, which isn't ideal.

@nicjansma is the code public / do you have any pointers?

Boomerang measures React and other SPAs today via this plugin that hooks into window.history calls.

PerformanceEntry helps in another way -- when third-party monitoring solutions load "late" (e.g. after the thing they want to monitor), they miss the events if they're not logged elsewhere. Having a PerformanceEntry means we can review history to check on things that already happened.

That's not true as of today. Assuming such events are emitted via PerformanceObserver, you have to register the observer to get the notifications. We've had a few discussions in the past about ~buffering such events before onload, but we don't have a resolution on that.

You're right, I forgot that UserTiming and ResourceTiming are special cases in that they have buffers that live until cleared or full.

I agree that our discussion on how long to keep PerformanceObserver entries -- until at least onload? -- would be useful to continue. While it's not guaranteed RUM solutions will load by onload (especially if they're using async loader patterns), we could assume that if the site is interested enough they could make sure that either the RUM script or the loader script could turn on the right observers by onload.
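For example, because those buffers persist, a script that loads late can still read back what happened before it was ready:

// Marks and resource fetches recorded before this script loaded are still
// in their buffers, so they can be read retroactively rather than observed.
var earlyMarks = performance.getEntriesByType('mark');
var earlyResources = performance.getEntriesByType('resource');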

You can avoid polling if you feature detect PerformanceObserver -- correct?

Absolutely, yes -- just pointing out we're looking forward to global PerformanceObserver support :)

@philipwalton: I think for any sort of generic, platform-based marks on pushState, it's an absolute must to at least know the page path associated with each mark, and if we're adding that metadata, I think reopening the discussion on adding generic object metadata to performance entries makes sense.

Great point!

igrigorik commented 7 years ago

@philipwalton: I think for any sort of generic, platform-based marks on pushState, it's an absolute must to at least know the page path associated with each mark, and if we're adding that metadata, I think reopening the discussion on adding generic object metadata to performance entries makes sense. Great point!

Yep, this all makes sense. That said, speccing and implementing support for such a mechanism is at least multiple quarters out. In the meantime, is there a simple(r) convention-based solution that we could agree on and prototype? E.g., while the base64-args approach (see https://github.com/w3c/charter-webperf/issues/27#issuecomment-252992669) is not pretty, it could accommodate this.


@mjackson I was hoping you could help define some of the groups/metrics :-)

Is "navigation" the right group name? Are there other group names we care about?

mjackson commented 7 years ago

How about these?

I also like the group name transition instead of navigation, though it's already being used in CSS. But I do think the word transition may better convey the idea that there's some time involved in "transitioning to" a new page.

You already most likely have groups for gathering metrics around loading resources, but it may be interesting to somehow correlate them with specific navigation events.

philipwalton commented 7 years ago

Yep, this all makes sense. That said, speccing and implementing support for such a mechanism is at least multiple quarters out. In the meantime, is there a simple(r) convention-based solution that we could agree on and prototype?

It's not terribly hard to polyfill a possible future API. Here's an example that extends the performance.mark() and performance.measure() methods to accept a data object that it stores on the PerformanceMark/PerformanceMeasure instance. It also adds a reference to the startEntry and endEntry on PerformanceMeasure objects so you can access the original mark data from a measure.

(function() {
  // Bail if a future native implementation already exposes a `data` property.
  if (window.PerformanceMark && 'data' in window.PerformanceMark.prototype) return;

  function noop() {}
  // Returns the most recently added entry with the given name (or undefined).
  function getEntry(name) {
    var entries = window.performance.getEntriesByName(name) || [];
    return entries.slice(-1)[0];
  }

  // Stub out the API surface in browsers without User Timing support.
  window.performance = window.performance || {};
  window.performance.mark = window.performance.mark || noop;
  window.performance.measure = window.performance.measure || noop;
  window.performance.getEntriesByName = window.performance.getEntriesByName || noop;

  // Wrap performance.mark() to accept and store an extra `data` argument.
  var nativePerformanceMark = window.performance.mark;
  window.performance.mark = function(name, data) {
    nativePerformanceMark.call(window.performance, name);
    var mark = getEntry(name);
    if (mark) mark.data = data;
  };

  // Wrap performance.measure() to store `data` plus references to the start
  // and end marks, so consumers can reach the original mark data from a measure.
  var nativePerformanceMeasure = window.performance.measure;
  window.performance.measure = function(name, startName, endName, data) {
    nativePerformanceMeasure.call(window.performance, name, startName, endName);
    var measure = getEntry(name);
    if (measure) {
      measure.startEntry = getEntry(startName);
      measure.endEntry = getEntry(endName);
      measure.data = data;
    }
  };
}());

And this minifies to an amount I'd be comfortable inlining in the <head> of my HTML templates:

!function(n){function r(){}function a(r){return n[t][i](r).slice(-1)[0]}var t="performance",e="mark",c="measure",i="getEntriesByName",o="PerformanceMark",f="data";if(!(n[o]&&f in n[o].prototype)){n[t]=n[t]||{},n[t][e]=n[t][e]||r,n[t][c]=n[t][c]||r,n[t][i]=n[t][i]||r;var u=n[t][e];n[t][e]=function(r,e){u.call(n[t],r),a(r)[f]=e};var l=n[t][c];n[t][c]=function(r,e,c,i){l.call(n[t],r,e,c);var o=a(r);o.startEntry=a(e),o.endEntry=a(c),o[f]=i}}}(window);
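With the polyfill in place, usage might look like this (mark names and metadata are illustrative):

// Record marks with attached metadata, then measure between them.
performance.mark('pagefetch:start', {path: location.pathname});
// ... once the fetch completes ...
performance.mark('pagefetch:end', {path: location.pathname});
performance.measure('pagefetch', 'pagefetch:start', 'pagefetch:end');

// The measure exposes the original mark data via startEntry/endEntry.
var measure = performance.getEntriesByName('pagefetch').slice(-1)[0];
console.log(measure.duration, measure.startEntry.data.path);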

I've recently been thinking about writing an autotrack plugin to automatically send relevant performance entries to GA, and I think these additions would make that doable. Without this it doesn't really work because users can't specify relevant dimensions when making marks.

igrigorik commented 7 years ago

@philipwalton neat. One gotcha I can see with the above is that the perf timeline does not guarantee that the entry queued into the global buffer is the same entry that's exposed to all the PerfObservers. That said, in practice the above does work today in Chrome and FF Nightly -- observers receive the entry with the data payload on it.

Stepping back, roughly the requirements we have so far are:

  1. Ability to specify a task name
  2. Ability to specify a group for each task -- i.e., multiple tasks can belong to a single group.
  3. Ability to provide a payload -- e.g. arguments, debug data, etc.

UT already provides 1, so we're solving for 2 and 3.

performance.mark(name, {group: X, data: {...}})

^ pushing these into a dictionary might be a better solution, as it's explicit about what you're assigning and you're not forced to rely on parameter order / pass in null values for stuff you don't care about. But then this also leads me to https://github.com/w3c/charter-webperf/issues/28... if we solve that, perhaps this entire discussion is moot.


I also like the group name transition instead of navigation, though it's already being used in CSS. But I do think the word transition may better convey the idea that there's some time involved in "transitioning to" a new page.

Yep, I agree.

igrigorik commented 6 years ago

With User Timing L2 going to CR, I think we're now in position to start work on UT L3 and Custom entries that can help address this problem space.

Let's continue this discussion on https://github.com/w3c/user-timing/issues/40. Closing.