GoogleChrome / web-vitals

Essential metrics for a healthy site.
https://web.dev/vitals
Apache License 2.0
7.55k stars 413 forks

v5 browser support tradeoff is problematic for RUM scripts #545

Open sgomes opened 4 days ago

sgomes commented 4 days ago

Hey folks! 👋

We're currently building the next version of our RUM script on top of web-vitals, in order to ensure that our handling of CWV metrics is as consistent as possible with CrUX's. CWV metrics aren't the only thing our script collects, but are obviously an important part of it.

We have two main priorities with our RUM script: keeping it as small as possible, to avoid interfering with page performance; and having it work in as many browsers as possible, so that at the very least we have an awareness of the prevalence of older browsers, even if the data we collect on those is limited.

In order to do this, we transpile to older syntax, which works quite well with web-vitals v4. If we upgrade to the v5 RC, however, our bundle size increases by approximately one quarter. The main cause of the increase appears to be transpilation helpers for newer syntax such as for ... of and object spreads, which was introduced in #525 to take advantage of the reduced browser support, even though the prior code worked correctly.
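To make the helper cost concrete, here's a hypothetical snippet (not actual web-vitals code) showing the same iteration written both ways; when targeting ES5, TypeScript and Babel emit iterator-protocol helpers (e.g. __values/__read) for the former, while the latter needs none:

```javascript
// Newer syntax: downleveling this to ES5 pulls in iterator helpers,
// inflating the transpiled bundle.
function namesModern(entries) {
  const out = [];
  for (const entry of entries) {
    out.push(entry.name);
  }
  return out;
}

// ES5-friendly equivalent: transpiles with no helpers at all.
function namesEs5(entries) {
  var out = [];
  entries.forEach(function (entry) {
    out.push(entry.name);
  });
  return out;
}
```

Both produce identical results; the difference only shows up in the size of the transpiled output.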

I don't think our script is alone in trying to gather data from as many browsers as possible; having browsers in which your script breaks means having no awareness of those visits, which is an issue when trying to get a global picture of your audience. Furthermore, the size benefits of switching to the reduced browser support are fairly limited: per Bundlephobia, v4.2.3 is about 2.6KB, whereas the v5 RC is 2.2KB; indeed smaller, but only by around 400 bytes or 15% (at least some of which may be due to other changes anyway, such as dropping FID). Don't get me wrong: it's a nice saving! But unfortunately, dependents that need ES5 are left with a 1KB increase, rather than a 0.4KB decrease. And as far as I'm aware, things could get significantly worse still if syntax like async / await ever makes its way into the codebase.

As such, since we don't want to lose visibility on older browsers, our only options are to accept the bundle size increase (and the risk of it becoming even larger over time); or alternatively to fork and maintain a parallel web-vitals internally.

Would it be possible to reconsider the browser support decision?

philipwalton commented 2 days ago

Hey Sergio! 👋

Can you talk a bit more about what specific performance data your script is collecting from ES5-only browsers? None of the Core Web Vitals metrics are supported in any ES5-only browser, and the only metric exposed by the web-vitals library that is supported is TTFB, which we see as primarily valuable as a diagnostic of LCP.

I don't think our script is alone in trying to gather data from as many browsers as possible; having browsers in which your script breaks means having no awareness of those visits, which is an issue when trying to get a global picture of your audience.

I get the motivation to collect usage data from as many visitors as possible. I don't think that's the same use case as wanting to collect performance-specific data, in order to improve the experience for those users. I don't believe it's critical (nor is it possible) to do that for all users, which is why even in the initial release of web-vitals we recommended that folks load it using <script type="module">, to avoid requiring older browsers to download a script that couldn't really benefit those users.

I understand that if you want to bundle the web-vitals library into another analytics script you're serving—and you want that script to be able to run in ES5-only browsers—then that will require the entire script to be downleveled to ES5 syntax, which can incur a size penalty.

However, I'm sure you know that there are ways for site owners to mitigate that cost by conditionally serving the ES5 syntax only to ES5 browsers—and that's the approach we recommend.

You could also potentially dynamically import the web-vitals script in a try/catch and only invoke the onXXX() functions if the load succeeds. That way you're reducing the total script load for those users.
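A minimal sketch of that idea (the module URL is a placeholder, and note that the wrapper itself uses import() and async/await, so it only helps in browsers that can already parse those):

```javascript
// Attempt to load web-vitals dynamically; resolve to true on success,
// false on failure, so callers can skip the onXXX() wiring gracefully.
async function loadVitals(moduleUrl) {
  try {
    const vitals = await import(moduleUrl);
    if (typeof vitals.onLCP === 'function') {
      vitals.onLCP(function (metric) { console.log(metric); });
    }
    return true;
  } catch (err) {
    // Load failed (network error, unsupported browser, etc.); the rest
    // of the page keeps working without CWV reporting.
    return false;
  }
}
```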

Would it be possible to reconsider the browser support decision?

My position (as I've written about recently) is that it should be up to site owners to make browser support decisions, rather than JavaScript library authors. I fully acknowledge your specific desire to support ES5 browsers, but this is not something that all site owners need, and we do not want to impose additional, unnecessary code on users who do not need it.

Of course there are trade-offs here, but we believe it's in the best interest of the web ecosystem to keep pushing forward, and we don't think we're pushing too early.

sgomes commented 2 days ago

Thank you for the considerate reply, @philipwalton! 🙇

Can you talk a bit more about what specific performance data your script is collecting from ES5-only browsers? None of the Core Web Vitals metrics are supported in any ES5-only browser, and the only metric exposed by the web-vitals library that is supported is TTFB, which we see as primarily valuable as a diagnostic of LCP.

Of course, I'd be happy to provide some more details!

Our goal is to have a single, small script that collects various kinds of performance data, together with other data such as UA string and URL path, as well as other service-level info, in order to enable aggregating performance data in various useful ways.

The data includes:

ES5-only browsers are generally only expected to report the Navigation Timing data, which is useful on its own for network and server performance analysis; but even if they didn't, just having an awareness of these page visits, as well as the URL, origin, and system that they're hitting, is incredibly helpful in getting an idea of what we don't know. Having a conspicuous hole in the performance data is valuable in and of itself, so we need a mechanism to make that hole as visible and measurable as possible.

I get the motivation to collect usage data from as many visitors as possible. I don't think that's the same use case as wanting to collect performance-specific data, in order to improve the experience for those users.

I wasn't too clear above, so I apologise for the misunderstanding. The data we're collecting is all performance-focused, consisting of either performance metrics or other markers that help us aggregate that performance data in different ways. This system doesn't handle general analytics aspects at all.

It's certainly possible to split the data collection into two scripts (or one script and one server-side logger), but that means either dropping the service-level data and losing the ability to aggregate in that way; or alternatively, duplicating that data collection between the two systems and finding mechanisms for cross-referencing the data across both systems to avoid duplicates.

Having a single, standalone script is considerably simpler and has been a fairly robust approach so far, which is why we're inclined to continue pursuing it.

I don't believe it's critical (nor is it possible) to do that for all users

The main goal isn't to gather all performance data for all users; we know that's impossible, so we try to get just as much as is possible. It's better to know TTFB, Navigation Timing info, and whether the request hit the edge cache, than to know nothing at all — or worse, to be unaware of the visit.

And for that, the main thing we need to ensure is that the script does not fail during parsing. As I'm sure you're aware, it's impossible to defend against unsupported syntax, since parsing happens before execution, which means that, short of using eval and potentially causing CSP violations, there's no way for a script to guard against its own syntax being unsupported.
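To illustrate: the only in-script escape hatch is to hide the newer syntax from the parser inside a string, via the Function constructor, accepting that CSP policies without 'unsafe-eval' will block it. A minimal sketch:

```javascript
// A try/catch around inline code cannot catch a SyntaxError in the same
// file: the whole script is parsed (and rejected) before anything runs.
// Deferring parsing via the Function constructor works, but is subject
// to the same CSP restrictions as eval.
function supportsArrowFunctions() {
  try {
    new Function('var f = function () { return 1; }; f = () => 1;');
    return true; // the string parsed successfully
  } catch (err) {
    return false; // SyntaxError in pre-ES2015 engines
  }
}
```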

which is why even in the initial release of web-vitals we recommended that folks load it using <script type="module">, to avoid requiring older browsers to download a script that couldn't really benefit those users.

There are a few issues with using this approach for differential script loading.

First of all, module scripts come with additional restrictions, particularly around cross-origin concerns. Using a module script would mean moving to serving the scripts from our own origin, or alternatively setting up CORS headers for the common origin that currently serves them. Not an insurmountable challenge by any means, but a difficult one to justify on an otherwise working system.

But more importantly, the module / nomodule approach to differential builds is fixed to a particular point in time, which grows increasingly irrelevant as time goes by. It would certainly solve the ES5 issue, yes, but as more and more supported syntax makes its way into the Baseline Widely Available profile and inevitably web-vitals, so too would more and more syntax need to be transpiled to match the module support line in the sand — which I believe is already approximately 4 years old?

This means that while it would mostly solve our issue now, we would likely end up with similar issues soon enough, given that our transpilation would by necessity be fixed to an immutable profile, while web-vitals' would be a moving target in a constantly evolving platform.

I understand that if you want to bundle the web-vitals library into another analytics script you're serving—and you want that script to be able to run in ES5-only browsers—then that will require the entire script to be downleveled to ES5 syntax, which can incur a size penalty.

Again, I wouldn't call it an analytics script; it's entirely performance-focused. Sorry for the confusion. Other than that, yes, that is a correct summary.

However, I'm sure you know that there are ways for site owners to mitigate that cost by conditionally serving the ES5 syntax only to ES5 browsers—and that's the approach we recommend.

Unfortunately, in our experience there aren't any good primitives on the web for client-side differential script loading.

I'm sure we agree that the server-side UA-based approaches are fundamentally flawed, so I won't go into detail there.

As for client-side, the module / nomodule approach has the issues I mentioned above.

Other approaches are theoretically possible, but extremely high-maintenance in practice. We could, for example, have a build that matches web-vitals's browser support (Baseline Widely Available), and another one for browsers older than that. In theory, we'd have a preliminary script that checks whether the browser supports Baseline Widely Available, and depending on the result of that check dynamically loads the appropriate script.

However, what exactly would that check be? We can't check for syntax because of the aforementioned reasons (unsupported syntax fails during parsing, and eval often runs afoul of CSP), so we'd need to look for web features that correlate to syntax support instead (e.g. whether something like Array.prototype.flatMap is present). This would mean handcrafting new checks whenever new syntax makes its way into the Baseline Widely Available (or at least into web-vitals), bearing in mind that different browsers adopt new syntax at different times and in different orders.
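As a sketch of what such a hand-crafted check might look like (the specific APIs probed here are illustrative assumptions, and their correlation with syntax support is approximate at best):

```javascript
// Approximate "modern syntax" detection by probing APIs that shipped
// around the same time as the syntax of interest. Each probe is a
// hand-picked proxy, not a guarantee, and the list has to be maintained
// by hand as the baseline moves.
function probablySupportsModernSyntax() {
  return (
    typeof Array.prototype.flatMap === 'function' &&       // ES2019-era
    typeof Object.fromEntries === 'function' &&            // ES2019-era
    typeof String.prototype.replaceAll === 'function'      // ES2021-era
  );
}
```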

You could also potentially dynamically import the web-vitals script in a try/catch and only invoke the onXXX() functions if the load succeeds. That way you're reducing the total script load for those users.

There are two mechanisms that you could be referring to here: either real, browser-level dynamic import; or a more classic dynamic import that's implemented through some sort of bundler-provided or custom-coded helper. I'll cover both, since I don't know which one you were referring to.

For the former (real browser import), a try/catch could work as a mechanism for identifying browsers without dynamic import support, since in theory they'll complain about import being undefined (even though import is a syntax construct, not an actual function, but of course they don't know that). I'd have to double-check, but I'd expect it to work, yes. However, it wouldn't work as the actual loading mechanism, because of the cross-origin concerns I mentioned before. Additionally, a separate mechanism (such as transpilation) would be needed to handle any syntax that falls between dynamic import() support and Baseline Widely Available anyway — a gulf that, again, will only grow over time.
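A parse-safe probe for dynamic import() support would have to go through the Function constructor (with the same CSP caveat as eval); a sketch:

```javascript
// import() is syntax, so referencing it directly makes an old browser
// fail at parse time, before any try/catch can run. Wrapping it in a
// Function body defers parsing to the constructor call, turning the
// SyntaxError into a boolean.
function supportsDynamicImport() {
  try {
    new Function('return import("data:text/javascript,")');
    return true;
  } catch (err) {
    return false;
  }
}
```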

As for the latter ("fake" dynamic import via helpers), it would be worth testing. It would have to be callback-based, of course, not a static try-catch, but assuming that the script's error handler is fired and makes its way to a callback somehow (even in a cross-origin context), it could serve as a signal to try loading the other script instead. I'm concerned about the potential fragility, and the general concept of having intentionally failing scripts as a valid codepath, but it's worth a try if there are no better options.
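The callback-based fallback flow could look roughly like this, with the DOM plumbing factored out into an assumed loadScript(src, onSuccess, onFailure) helper, since whether a parse failure actually reaches the script element's error handler varies by browser:

```javascript
// Try the modern build first; if its failure callback fires, fall back
// to the legacy build. The "intentionally failing script" is a normal
// codepath here, which is the fragility concern noted above.
function loadWithFallback(loadScript, modernSrc, legacySrc, done) {
  loadScript(modernSrc, function () {
    done(modernSrc);
  }, function () {
    loadScript(legacySrc, function () {
      done(legacySrc);
    }, function () {
      done(null); // both builds failed to load
    });
  });
}
```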

My position (as I've written about recently) is that it should be up to site owners to make browser support decisions, rather than JavaScript library authors. I fully acknowledge your specific desire to support ES5 browsers, but this is not something that all site owners need

Of course there are trade-offs here, but we believe it's in the best interest of the web ecosystem to keep pushing forward, and we don't think we're pushing too early.

I would agree in principle if we had a differential loading mechanism to match this, but the unfortunate reality is that we don't.

Even in a same-origin scenario, module / nomodule is only useful to decide between two sides of an immutable, arbitrary line in the sand that only grows more distant and irrelevant as time goes by. So while the argument can be made for module / nomodule being sufficient for ES5 vs ES2015, that reasoning cannot be extended to any later thresholds, particularly moving ones.

"Baseline Widely Available" is a build-time concept that to my knowledge has no feasible runtime differential loading mechanism other than "try loading the modern script and if it fails, load the other one" 😕

we do not want to impose additional, unnecessary code on users who do not need it.

Supporting easy transpilation to ES5 does not mean having additional, unnecessary code; just a different coding approach in the library. The only reason why the v5 RC requires additional transpilation helpers when compared to v4 is that existing, working code was replaced with newer syntax (such as for ... of) that essentially does the same thing.

It should be possible to keep developing web-vitals in a similar way as before, i.e., in a way that would result in minimal helpers being needed when transpiling — ES5 is part of Baseline Widely Available, after all!

This is the philosophy we follow in our script: if we develop with ES5 in mind, then there won't be a need for differential builds in the first place, because the transpiled ES5 version itself ends up small (around 4KB gzipped, in our case). Older browsers don't make use of much of those 4KB, but that's okay; we're more concerned with optimising performance in newer browsers than older ones. And while ES2015+ would allow for terser syntax in places, the bundle size difference from changing transpilation targets isn't that big, so there's not much of a downside to running the transpiled ES5 on a modern browser.

Believe me, I do understand the desire to adopt simpler, terser, and in many ways clearer syntax, and not to be stuck on a lowest common denominator forever! However, using newer syntax is not an inevitability, and in some cases it makes sense to put in the effort so that transpilation is minimal (e.g. const or let), rather than verbose (for ... of), or incredibly verbose (async / await).

In our RUM script, that choice seems obvious for all of the aforementioned reasons. We imagine we're not alone in this, and that it would be beneficial for web-vitals to easily enable these use cases without requiring differential or conditional loading architectures, but perhaps we're extrapolating our needs too much? 🙂

sgomes commented 1 day ago

Follow-up: as an experiment, I pulled web-vitals v5 into our source tree, and modified it to avoid needing large transpilation helpers. At least in our case (we don't use all of web-vitals, e.g. attribution), and at least for now, the only changes that were needed were moving from for ... of back to .forEach, and from object spreads back to Object.assign.

With those changes in place, the bundle size is effectively identical to using v4, at only 40 bytes larger. There's also a difference in that this way we're using the TS sources instead of the dist JS, but I don't think that has an impact.
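For reference, the two rewrites in question look roughly like this (illustrative examples, not the actual web-vitals diff):

```javascript
// 1) for...of replaced with forEach: no iterator helpers when targeting ES5.
function totalDuration(entries) {
  var total = 0;
  entries.forEach(function (entry) {
    total += entry.duration;
  });
  return total;
}

// 2) Object spread replaced with Object.assign: no __assign/__spreadArray
// helpers (Object.assign itself is ES2015, but its polyfill is tiny
// compared to the emitted spread helpers).
function withDefaults(opts) {
  // Equivalent of: return { reportAllChanges: false, ...opts };
  return Object.assign({ reportAllChanges: false }, opts);
}
```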