GoogleChrome / web-vitals

Essential metrics for a healthy site.
https://web.dev/vitals
Apache License 2.0
7.47k stars 410 forks source link

LoAF missed for some INP events #485

Closed brendankenny closed 3 months ago

brendankenny commented 3 months ago

I ran across this issue while playing with the demo for the Measuring INP codelab. After trying to find a minimal repro, it still depends on a number of very specific circumstances, but

so it still seems worth fixing.

tl;dr

LoAFs can sometimes arrive from the PerfObserver before the interaction events they contain. If a requestIdleCallback fires between the PerfObserver callbacks, a LoAF can be discarded and won't be able to be paired with the interaction events when they come in later.

From out-of-band discussion, a reasonable fix would be to not discard any LoAFs that have a startTime after any existing event's processingEnd (or startTime + duration?). I'll open a PR to address this.


This turned out to be relatively simple, but I spent a decent amount of time debugging because it didn't appear to be simple at first, so I'm going to write down way too much about it for posterity. Feel free to skip the rest :)

To reproduce

The demo does two things:

Things that are necessary but I don't totally understand:

Behavior

Usually two INP events will be logged to console (reportAllChanges is set to true). On my machine:

I can very reliably reproduce the second case, which is why I noticed this in the first place.

Looking into what's happening, it appears that

Example run of bug This can be difficult to debug because recording via the performance panel and even console logging seem to affect when the `whenIdle` callbacks fire. I ended up using a global array to push log messages to and then logged everything at once at the end. Here's a trace with the different PerformanceObserver callbacks added as performance marks and the LoAF as a measure: ![annotated DevTools trace showing order of bug](https://github.com/GoogleChrome/web-vitals/assets/316891/490d043c-bbe5-4c6b-8338-e3e56f59a700) https://trace.cafe/t/Q3usrDo9C7 Console output: - new events (at 1930.9): | name | startTime | processingStart | processingEnd | end | duration | -- | -- | -- | -- | -- | -- | `'pointerdown'` | 1813.5 | 1834.6 | 1834.6 | 1837.5 | 24 - new LoAFs (at 2354.3): | startTime | end | duration | -- | -- | -- | 1935.3 | 2347.3 | 412 - LoAFs discarded (at 2356.9): | startTime | end | duration | -- | -- | -- | 1935.3 | 2347.3 | 412 - web-vitals: INP of 24ms - new events (at 2371.8): | name | startTime | processingStart | processingEnd | end | duration | -- | -- | -- | -- | -- | -- | `'mouseover'` | 1909.4 | 1941.1 | 1941.1 | 2357.4 | 448 | `'mousedown'` | 1909.4 | 1942.1 | 1942.1 | 2357.4 | 448 | `'mouseup'` | 1909.4 | 1943 | 1943 | 2357.4 | 448 | `'click'` | 1909.4 | 1943 | 1943.9 | 2357.4 | 448 - web-vitals: INP of 448ms
mmocny commented 3 months ago

Update: we discovered that touch emulation in DevTools on desktop requires some extra IPC hops between Renderer and Browser, and that this can lead to events being processed in a different LoAF frame than if it were real input processing (essentially: the browser adds an extra yield() point which then often schedules after-next-paint).

We wanted to check if that could be related to this issue -- did you have a chance?

mmocny commented 3 months ago

Re-reading the initial description, I thought I should point out:

(After the feedback we got from the folks working on v4 + attribution, we are working towards cleaning up the event timing stream to remove the out-of-order reporting complexity)

mmocny commented 3 months ago

I was finally able to reproduce your test page by NOT turning on 4x throttling (and now its fairly reliable).

I did it many times with 4x throttling enabled and kept getting the other case you mentioned, with 2x events each with a 1 LoAF. (I also at least once was a single event entry). My machine is pretty slow already so I guess too much throttling is as bad as none at all ? ;)

philipwalton commented 3 months ago

(After the feedback we got from the folks working on v4 + attribution, we are working towards cleaning up the event timing stream to remove the out-of-order reporting complexity)

🎉

philipwalton commented 3 months ago
  • Sometimes there will be two identical INPs logged. What's actually happening is two different INPs are reported from the unattributed onINP, but the whenIdle on the callback in the attributed onINP is causing the first report to not be made (and logged to console) until after the metric has been updated for the second report callback. Since it's the same object being updated and passed to the two callbacks, the same values are logged twice. In this case the attribution will have a LoAF, though, so everything is good (I don't know if it's worth doing anything about the slightly confusing apparent double logging...maybe debounce when reportAllChanges is true? Abort the idle task when a new report comes in? Or is that not "report all changes" anymore?)

I thought about this a bit, and I think that's not "report all changes" anymore (as you say).

One way to handle this would be to not report until after all entries in a given PerformanceObserver callback were processed, and that would work well for the INP case where you're observing as the user is interacting. But it would NOT work well for the buffered case where the library is loaded late and processes a bunch of entries at the same time (like is common with LCP). In those cases you'd lose the changes, so I think it's best to be consistent.

tunetheweb commented 3 months ago

I thought about this a bit, and I think that's not "report all changes" anymore (as you say).

Hmmm… not entirely sure about that. For a start it’s not "report all changes" anyway, it’s “report all metric changes”. I.e. it doesn’t report candidates that are smaller than the current metric (a point of confusion often raised).

To me this isn’t a change either since it’s reporting the same value. So it’s kinda pointless and confusing. I’ve seen this issue with web-vitals extension and been meaning to figure it out. I presumed it was an issue with the extension.

So ideally we wouldn’t do this. Maybe by noting when reporting a value that’s already been reported (and losing the “change”). Or by cloning the metric at observation time if it’s liable to change.

Them again I think reportAllChanges is kinda niche anyway, so wouldn’t want to introduce too much more complexity

philipwalton commented 3 months ago

Hmmm… not entirely sure about that. For a start it’s not "report all changes" anyway, it’s “report all metric changes”. I.e. it doesn’t report candidates that are smaller than the current metric (a point of confusion often raised).

My understanding of the situation is that there IS a metric change, but by the time the attribution callback is run (since it's behind an idle callback) the values on the respective metric objects have updated and now appear to be the same when they're reported.

My comment above was meant to say that if there actually is a metric value change, then (to be consistent with how this work for other metrics) we should fire two callbacks, even if they're part of the same interaction.

My hope is that we can eventually remove the idle callback once Michal and team implement the buffering behavior he describes, and so then this would no longer be an issue.

ekremney commented 3 months ago

could be related: https://github.com/GoogleChrome/web-vitals/issues/491

philipwalton commented 3 months ago

My hope is that we can eventually remove the idle callback once Michal and team implement the buffering behavior he describes, and so then this would no longer be an issue.

Another option would be to move the whenIdle() call into the main onINP() function so that attribution logic all still happens sync. This should remove any double reporting. @tunetheweb WDYT?