GoogleChrome / lighthouse

Automated auditing, performance metrics, and best practices for the web.
https://developer.chrome.com/docs/lighthouse/overview/
Apache License 2.0

Unexpected changes in main thread time reported around March 8th #15896

Open AntonioGargaro opened 6 months ago

AntonioGargaro commented 6 months ago

Summary

Hey all, I have been investigating the numbers reported by the PageSpeed Insights API (and therefore Lighthouse) for my company's SDK. A third-party website that tracks metrics such as main thread time over time shows a massive spike in main thread time for our SDK.

Based on the below, has a change been released to PageSpeed Insights that could explain these differences? I believe v11.6.0 would have been deployed around that date, so could this contribute to these changes in reporting?

I appreciate Econify is a third-party platform, but we have confirmed that they use PageSpeed Insights API under the hood.
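For anyone who wants to check the raw numbers directly rather than through a third-party dashboard, here is a minimal sketch of pulling the relevant fields out of a PageSpeed Insights v5 response. The helper names `fetch_psi` and `summarize_psi` are made up for illustration, and the field names assume the usual Lighthouse result shape (`lighthouseResult.audits[...]`), so treat this as a starting point rather than a verified tool.

```python
import json
import urllib.parse
import urllib.request

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def fetch_psi(url, strategy="mobile"):
    """Fetch a PSI v5 report over the network (hypothetical helper, no API key passed)."""
    query = urllib.parse.urlencode({"url": url, "strategy": strategy})
    with urllib.request.urlopen(f"{PSI_ENDPOINT}?{query}") as resp:
        return json.load(resp)


def summarize_psi(report):
    """Pick out the versions and main-thread metrics from a PSI response dict."""
    lhr = report["lighthouseResult"]
    audits = lhr.get("audits", {})
    return {
        # Which Lighthouse release produced the report.
        "lighthouse_version": lhr.get("lighthouseVersion"),
        # User agent of the Chrome build that ran the audit.
        "user_agent": lhr.get("environment", {}).get("hostUserAgent"),
        # Main thread work and Total Blocking Time, both in milliseconds.
        "main_thread_ms": audits.get("mainthread-work-breakdown", {}).get("numericValue"),
        "tbt_ms": audits.get("total-blocking-time", {}).get("numericValue"),
    }
```

Calling `summarize_psi(fetch_psi("https://nypost.com"))` on different days would show whether the Lighthouse version, the Chrome build, or both changed between two measurements.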

Screenshot 2024-03-27 at 10 04 39 https://www.econify.com/performance/vendor/permutive-app

Interestingly enough, the main thread time varies substantially between a few other providers too, where some improve drastically and others appear to worsen.

Screenshot 2024-03-27 at 10 09 37 https://www.econify.com/performance/vendor/contextweb?date=Mar+26%2C+2024&device=mobile&type=article&range=1m

Screenshot 2024-03-27 at 10 13 04 https://www.econify.com/performance/vendor/google-analytics

More vendors:

AntonioGargaro commented 6 months ago

Following up on this investigation, I have profiled https://nypost.com with Lighthouse. The report seems to attribute one of the longest-running tasks to our script, which doesn't make sense to me. I have uploaded the assets here for local inspection, and I have added a video to the drive walking through what I'm seeing.

From a commercial perspective at Permutive, this is causing upset with our customers around the performance of our script, which we haven't been able to identify yet internally with our metrics or profiling of publisher sites. I hope we can identify a change somewhere that may be affecting the attribution of blocking time to our script erroneously, or at the very least, validate how to inspect these profiles and reports correctly!

AntonioGargaro commented 6 months ago

Noting that this Chromium report also seems to describe stacked tasks which were not present before Chromium v122.

https://issues.chromium.org/issues/329678173

A theory based on this is that other scripts' evaluation time is being attributed to our SDK instead of their own evaluation, which may explain why we see such a drop in main-thread time for them and an increase in ours.
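To make the theory concrete, here is a toy sketch (not how Lighthouse or Chrome actually attribute time). It compares two ways of crediting main-thread time when one task is recorded inside another: crediting everything to the outermost task, versus crediting each task only its own self time. The script names `sdk.js` and `other.js`, the timings, and the function `attribute_time` are all invented for illustration, and the sketch assumes at most one level of nesting.

```python
from collections import defaultdict


def attribute_time(tasks, nested_aware):
    """Sum main-thread time per script.

    tasks: list of (script, start_ms, end_ms) tuples.
    nested_aware=True  -> each task gets its own duration minus nested tasks.
    nested_aware=False -> only outermost tasks count, at their full duration.
    Assumes at most one level of nesting (an illustrative simplification).
    """
    totals = defaultdict(int)
    for i, (script, start, end) in enumerate(tasks):
        others = [t for j, t in enumerate(tasks) if j != i]
        # Tasks recorded entirely inside this one.
        children = [(s, e) for _, s, e in others if start <= s and e <= end]
        # True when this task sits entirely inside another task.
        is_nested = any(s <= start and end <= e for _, s, e in others)
        if nested_aware:
            totals[script] += (end - start) - sum(e - s for s, e in children)
        elif not is_nested:
            totals[script] += end - start
    return dict(totals)


# Hypothetical trace: another script's 80 ms of work recorded inside an SDK task.
tasks = [("sdk.js", 0, 100), ("other.js", 10, 90)]
print(attribute_time(tasks, nested_aware=True))   # {'sdk.js': 20, 'other.js': 80}
print(attribute_time(tasks, nested_aware=False))  # {'sdk.js': 100}
```

In the second case the other script's time disappears from its own total and shows up under the SDK, which matches the pattern of some vendors improving while others worsen.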

benschwarz commented 6 months ago

We have run experiments behind the scenes at Calibre and were able to observe Total Blocking Time (TBT) increases between Chrome versions. It seems that Lighthouse with Chrome 122 (& 123) reliably reports higher TBT than previously observed on 120 or 121.

Here’s what we saw:

adamraine commented 6 months ago

Looking into this. I don't think we updated the Lighthouse version around March 8 (PSI is currently on 11.5.0). So I think it's more likely to be a performance regression in Chrome as @benschwarz's investigation seems to indicate.

I'm going to try bisecting this issue in Chrome. If y'all could provide several specific URLs that showed a clear performance regression, that would be super helpful in investigating this problem further.

benschwarz commented 6 months ago

@adamraine Looking at our historic metrics, I didn't see any notable change from LH 11.4.0 to 11.6.0; the change appeared to be purely Chrome-based.

AntonioGargaro commented 6 months ago

@adamraine The reference URL we have been using for NY Post is https://nypost.com/2017/05/10/walt-disneys-original-disneyland-map-could-sell-for-1m/ which is the same URL Econify is reporting increases on.

We also noticed this jump in TBT in Calibre for https://www.businessinsider.com.

Screenshot 2024-03-28 at 09 41 59

I believe this TBT increase is likely caused by the regression in Chrome, where it is nesting macrotasks under other macrotasks. That would explain why TBT shows the most obvious increase: what were previously small tasks are now being measured as long tasks.
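A quick back-of-the-envelope sketch of why this inflates TBT: blocking time only counts the portion of each task beyond 50 ms, so several short tasks contribute nothing, while the same work reported as one long task does. The task durations below are invented, and this simplified function ignores the FCP-to-TTI window that Lighthouse applies.

```python
BLOCKING_THRESHOLD_MS = 50  # portion of a task beyond this counts as blocking


def total_blocking_time(task_durations_ms):
    """Simplified TBT: sum of each task's time in excess of 50 ms."""
    return sum(max(0, d - BLOCKING_THRESHOLD_MS) for d in task_durations_ms)


# Three separate 40 ms tasks: none exceeds the threshold.
separate = [40, 40, 40]
# The same work reported as a single stacked 120 ms task.
merged = [sum(separate)]

print(total_blocking_time(separate))  # 0
print(total_blocking_time(merged))    # 70
```

The total work is identical in both cases (120 ms), yet the reported blocking time goes from 0 ms to 70 ms purely because of how the tasks are grouped.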

joshdifabio commented 6 months ago

Hi folks, thanks for your efforts on this.

I just want to reiterate Toni's point that the root cause of this issue appears to be the following regression in Chromium: https://issues.chromium.org/issues/329678173.

Furthermore, the Chromium bug appears to have been incorrectly triaged as not being a regression, which appears to have lessened its priority. If others agree, then perhaps some further encouragement on the Chromium bug that this is a regression would be valuable.

In terms of replicating and bisecting the issue in Chrome, I think the thing to look for is stacked macrotasks in the Chrome performance profiler (i.e. multiple concurrent grey rows), examples of which can be seen in the OP of the Chromium bug. I suspect that any build exhibiting stacked macrotasks in the Chrome performance profiler will exhibit the spurious main thread measurements.

benschwarz commented 6 months ago

> Furthermore, the Chromium bug appears to have been incorrectly triaged as not being a regression, which appears to have lessened its priority. If others agree, then perhaps some further encouragement on the Chromium bug that this is a regression would be valuable.

Yes, I agree. Having spent several days investigating on my side, I believe the Chromium bug to be a clear-cut regression. In testing before the issue (Chrome 120, 121) and after (Chrome 122, 123), we've seen a clear rise in TBT measurements (and importantly, not TTI). Tasks are up to 2x longer in a lot of the cases I've observed, which aligns with the report of stacked macrotasks.

I've shared some of my findings with the Lighthouse team privately and have also posted on the Chromium issue.

AntonioGargaro commented 5 months ago

Hey @adamraine, I've noticed the fix has been released. Do you know when this will make it into PageSpeed Insights?

connorjclark commented 5 months ago

12.0 should be in PSI sometime early next week.