GoogleChrome / lighthouse

Automated auditing, performance metrics, and best practices for the web.
https://developer.chrome.com/docs/lighthouse/overview/
Apache License 2.0

Provide additional details on how SpeedIndex is calculated #9204

Open demianrenzulli opened 5 years ago

demianrenzulli commented 5 years ago

Feature request summary

Currently, the information LH provides around Speed Index is not very actionable. The following tests illustrate some of the reasons:

Example: Testing https://www.autotrader.com.

Test 1: Remote LH on Web.dev (report):

At the time of writing, the report showed a Speed Index of 13s.

In the Filmstrip view, the site goes from a blank page to almost visually complete, which suggests that Speed Index should be close to FCP (1.6s at the time of writing), but, as seen, that's not the case.

As a result, the developer doesn't know exactly which part of the page to optimize to improve this metric.

Test 2: LH over WPT:

This LH report was run at the same time as the Web.dev one and shows a Speed Index of 3s (3x less).

In this case, it's not clear why Speed Index has this value (or why it's so different from the test run simultaneously on web.dev/measure).

Test 3: Traditional WPT:

The WPT report associated with the previous LH report shows a Speed Index of 14.9s (close to the one in the initial LH report from Web.dev).

The Filmstrip view shows visual completeness going from 57% to 100% due to the permission prompt, which might explain this value.

What is the motivation or use case for changing this?

Developers usually look for ways to improve individual metrics in order to increase their overall performance score and deliver better user experiences. The only information about Speed Index is in this article, but from it alone it's hard to know how to optimize the metric.

How is this beneficial to Lighthouse?

Adding more clarity would make the Lighthouse report more actionable on one of the main metrics taken into account when generating the overall performance score. Also, the community often uses Filmstrip views as a tool to diagnose performance problems (this tweet is a good example of that).

patrickhulce commented 5 years ago

Thanks for filing, @demianrenzulli! Great observations, and thank you for the links with examples!

Indeed, Speed Index is a thorny metric to compute consistently in all environments. There are a few ongoing efforts to investigate improvements and simplifications, but hopefully this explanation will help clear things up.

  1. Lighthouse on web.dev uses simulated throttling, which loads the page quickly and logically replays the page under different connection/device conditions. Because screenshots tend to get smooshed together on faster loads, we rely on weighted re-layouts to account for shifts beyond FCP. The weighted-average approach in Speed Index here aligns with the JS execution that triggers the permission prompt you observed in WPT case 3, which is why those numbers are fairly similar.
  2. Lighthouse on WPT uses applied throttling and computes Speed Index from the screenshots provided by Chrome. The Chrome-provided screenshots do not include the permission prompt, so Speed Index is complete at ~FCP.
  3. WPT computes Speed Index using a screen recording of the phone, which does capture the permission modal and the corresponding late JS execution; that's what pushes it so far beyond case 2 and makes it align more closely with case 1. The toy sketch below illustrates how the set of captured frames changes the number.
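To make the effect of the capture method concrete, here's a toy sketch of the computation. This is not Lighthouse's or WPT's actual implementation, and the frame timings and completeness fractions are invented for illustration (loosely echoing the ~1.6s FCP and 57% visual completeness mentioned above):

```ts
// Speed Index is the area above the visual-completeness curve over time,
// approximated here as a step function between captured frames.
interface Frame {
  timestampMs: number;   // when the frame was captured
  completeness: number;  // fraction of the final visual state reached, 0..1
}

function computeSpeedIndex(frames: Frame[], loadEndMs: number): number {
  let speedIndex = 0;
  for (let i = 0; i < frames.length; i++) {
    const start = frames[i].timestampMs;
    const end = i + 1 < frames.length ? frames[i + 1].timestampMs : loadEndMs;
    speedIndex += (1 - frames[i].completeness) * (end - start);
  }
  return speedIndex; // in ms, since completeness is a unitless weight
}

// Capture that never sees the late permission prompt (like case 2):
// the page looks done right after FCP, so Speed Index lands near FCP.
const withoutPrompt: Frame[] = [
  {timestampMs: 0, completeness: 0},
  {timestampMs: 1600, completeness: 1},
];
console.log(computeSpeedIndex(withoutPrompt, 15000)); // 1600

// Capture that does see the prompt keeping the page visually incomplete
// until late in the load (like cases 1 and 3): the same page scores far worse.
const withPrompt: Frame[] = [
  {timestampMs: 0, completeness: 0},
  {timestampMs: 1600, completeness: 0.57},
  {timestampMs: 15000, completeness: 1},
];
console.log(computeSpeedIndex(withPrompt, 15000)); // 7362
```

The computation is identical in both runs; the only difference is which frames the capture method sees, which is why the environments disagree so much.
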
demianrenzulli commented 5 years ago

Hi @patrickhulce thanks for the explanation!

So, from your comments, I understand that in cases like this the location permission prompt might be the main cause of the 3x variation in Speed Index between the different tools.

I've also noted that, when running LH from the DevTools Audits tab, Speed Index was close to FCP in most cases. In those tests, the permission prompt was shown as part of the main browser window (outside of the simulated mobile viewport), so I assume it's something similar to what happens with Lighthouse on WPT (test 2).

Some follow-up questions:

- Are there any exceptions that the tool could make in similar scenarios? In particular, since companies usually request the location permission at page load time, it can hardly be delayed. In other cases (like the notification permission prompt), developers can implement things like the double-permission pattern and only show it under certain circumstances.
- Would it be possible to, at least, inform the user that the cause might be related to these kinds of page components?

I'm afraid that, in cases like these, it might look like LH is not taking into account some particular business cases when producing a score.

Thanks in advance.

patrickhulce commented 5 years ago

> Are there any exceptions that the tool could make in similar scenarios?

The definition of Speed Index is the definition of Speed Index at this point, and it's somewhat of a bug that the permission prompt in case 2 simply can't be captured through that particular method of screenshot capture. That being said, we are examining alternative metrics that don't suffer from these particular Speed Index shortcomings.

> Would it be possible to, at least, inform the user that the cause might be related to these kinds of page components?

Are you saying we should inform them of the specific visual change that is hurting their score? This is a great idea! Ideally we would, but it's difficult in practice. For example, the problem in case 1 is that we know from JavaScript execution that the impactful visual change happened, but the screenshots we have access to don't reflect it, so what would we show the user? Maybe just the timestamp of these late impactful changes?

> since companies usually request the location permission at page load time

This is an anti-pattern (we have an audit flagging it, in fact 😉), and we're unlikely to create exceptions in performance metrics to support user-hostile page behavior. Despite the Speed Index inconsistencies, I would say the root issue here is being captured appropriately. The page has a lot of JS execution hogging the main thread (~29 seconds spent in JS), and it takes a very long time until the logic that interrupts the user's experience is triggered. Late user-impacting visual changes are exactly what Speed Index attempts to punish, and in 2 of the 3 cases here it's WAI (working as intended).

> LH is not taking into account some particular business cases when producing a score

FWIW, the goal of the performance score is to capture user-perceived performance. If some current business cases deliver a poorly performing user experience by design (as in the case of an unrequested and very late permission prompt), then it's expected that the score will be lower as a result.

demianrenzulli commented 5 years ago

Thanks @patrickhulce!

I wasn't aware of that audit around location permissions. If that's actually flagged as a bad practice, then I guess it makes more sense that Speed Index gets impacted by something like that, as it would by any other major above-the-fold (ATF) visual change.

I think this explains both the variance in metrics across different tools and the reasons behind the scores.

Thanks again for taking the time to look at this.

demianrenzulli commented 5 years ago

Hi @patrickhulce, hope you're well.

I'm not sure if it makes sense to re-open this issue just for this, but I wanted to leave a comment anyway:

I've seen cases where developers try to map the Speed Index value obtained in LH to the Filmstrip view that appears in DevTools when running a "Performance" recording or, more directly, when clicking "View Trace" from the LH report.

It seems the filmstrip that LH takes into account for this value is not the same one, but there's some confusion around this.

Is there any way LH could express more clearly in the UI how this Speed Index is being calculated?

Thanks.

connorjclark commented 5 years ago

Speed Index is the visual incompleteness of a page integrated over time. This gives a great description (and is linked to from the page LH links to): https://sites.google.com/a/webpagetest.org/docs/using-webpagetest/metrics/speed-index

It seems the confusion is around mapping the Speed Index to a point on the timeline (which, btw, doesn't make sense given the definition above) - is that correct? It's unfortunate that we show s as the unit... isn't it supposed to be s^2? I'm like 90% sure it should be squared since the measure is the area of a curve :P

I think we could do a better job here. Today you must click three (!) things to discover wth "Speed Index" is: 1) toggle the metric descriptions, 2) click "Learn more", 3) click the correct link to really learn more.

demianrenzulli commented 5 years ago

Thanks for jumping in, @connorjclark,

I think your comment describes the perception of the user that I was trying to explain before.

I have the feeling that most of the confusion might come from how the information is presented:

Here's the definition that appears in the "Overview" section of the LH docs:

"Speed Index is a page load performance metric that shows you how quickly the contents of a page are visibly populated."

For those who actually manage to arrive at that link (through the metric descriptions, or by searching on Google, for example), this information may seem like enough: it leads the developer to think that the calculation is based only on the amount of time it takes for the visible content to be painted (regardless of how the metric is actually calculated). Then, since there's a timeline immediately after the scores, mapping this definition onto the screenshots can be a natural next step for some users (even though no timestamps are shown there, and the definition isn't intended to be read that way).

It is true that in the "More information" section of the docs there are links explaining how the metric is actually calculated, but, in my opinion, this reads more like an "appendix" for those who want to "know more" and dig deeper into implementation details.

Probably making a shorter path from metric to definition, and providing more detail in that initial summary, could help.

patrickhulce commented 5 years ago

Just to add my two cents, I think the root problem here is the complexity of the Speed Index metric definition itself. I mean, even some very brilliant engineers who read all the documentation there is on Speed Index, and thought they understood it very well, actually implemented it slightly incorrectly in a way that leads to radically different results, and no one noticed for quite some time because the definition is so complicated.

It is excellent at summarizing a very difficult and complicated quality of the page, and it correlates extremely well with user perception of page speed, but it truly requires an entire page, minimum, to explain how it is computed. "How quickly the contents of a page are visibly populated" is a pretty darn accurate short summary, and I don't think folks will gain that much more value by learning the math behind it. Maybe we could tweak it a little to make it clearer that everything above the fold impacts the metric?

> I think we could do a better job here. Today you must click three (!) things

> Probably making a shorter path from metric to definition, and providing more detail in that initial summary, could help.

Given the above, and our proposed future emphasis on newer, conceptually simpler paint metrics like LCP, I would be hesitant to dedicate too much space in the metrics section of the report just to explaining how Speed Index works.

My preference would be to bring this down to two clicks by moving a real explanation into the "Learn more" doc and tweaking the help text in the report to whatever folks here think is more descriptive, so users know to make all their visual content reach its final state as quickly as possible.

connorjclark commented 5 years ago

That (the docs action) sounds good to me too, @patrickhulce.

Any idea about the unit thing I mentioned?

patrickhulce commented 5 years ago

Oh, right!

> It's unfortunate that we show s as the unit... isn't it supposed to be s^2? I'm like 90% sure it should be squared since the measure is the area of a curve :P

Truish, but definitely not s^2. It's the area under a curve where % VIC is integrated with respect to time, which gives % VIC x seconds, an essentially useless unit as far as normal people are concerned. Conceptually, though, if you flip the integral around, it's just the weighted-average time at which all content reached its final state. This is commonly expressed in seconds because the % VIC is understood to just be the weight. (It's not defined this way, because % VIC is not a true mathematical function over the domain of time in every case, since % VIC can regress, but in the common case they are equivalent expressions.)

% VIC = Percent visually incomplete = 1 - % VC = 1 minus percent visually complete
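
To make that equivalence concrete, here's a minimal sketch using the same invented step-function numbers as the earlier example (none of this is Lighthouse's actual code):

```ts
// For a monotonically increasing visual-completeness step function,
// the area above the curve (the integral of % VIC over time) equals
// the weighted-average time at which the page gained completeness.
const steps = [
  {timeMs: 1600, deltaVC: 0.57},  // 0% -> 57% visually complete at 1.6s
  {timeMs: 15000, deltaVC: 0.43}, // 57% -> 100% visually complete at 15s
];

// Weighted-average form: sum of t_i * (gain in visual completeness at t_i).
// The weights sum to 1, so the result naturally reads as a time, not time^2.
const weightedAverage = steps.reduce((sum, s) => sum + s.timeMs * s.deltaVC, 0);
console.log(weightedAverage); // 7362 ms -- the same value the area-above-the-curve
                              // computation in the earlier sketch produces
```

As noted above, the two forms only coincide when visual completeness never regresses; when it does, the area-above-the-curve definition is the one that holds.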