Clarify use-cases and motivation against rAF polling

igrigorik commented 8 years ago

rAF polling (RAFP) proposal by @pp-koch:

The basic idea is very simple: repeat requestAnimationFrame calls during one second, count how many times it’s called, and use the result to draw conclusions on the current performance of your website in your user’s browser. If the FPS (frames per second) rate starts to go down the browser is having more problems with executing your CSS or JavaScript, and it might be advisable to turn off, say, animations, or that one complicated DOM script that rewrites a table all the time.
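A minimal sketch of the counting loop described above, not the actual RAFP script. The names `createFpsSampler`, `onSample` and `windowMs` are hypothetical; the only browser API used is `requestAnimationFrame`, which passes a millisecond timestamp to its callback.

```javascript
// Counts rAF callbacks over a sampling window and reports frames per second.
// `onSample` and `windowMs` are illustrative names, not part of any proposal.
function createFpsSampler(onSample, windowMs = 1000) {
  let windowStart = null;
  let frames = 0;
  return function onFrame(timestamp) {
    if (windowStart === null) {
      // First callback only marks the start of the window.
      windowStart = timestamp;
      return;
    }
    frames += 1;
    const elapsed = timestamp - windowStart;
    if (elapsed >= windowMs) {
      onSample((frames * 1000) / elapsed);
      windowStart = timestamp;
      frames = 0;
    }
  };
}

// Browser wiring; skipped where requestAnimationFrame is unavailable.
if (typeof requestAnimationFrame === 'function') {
  const onFrame = createFpsSampler((fps) => console.log(`~${fps.toFixed(1)} fps`));
  const loop = (timestamp) => {
    onFrame(timestamp);
    requestAnimationFrame(loop);
  };
  requestAnimationFrame(loop);
}
```

Keeping the counting logic separate from the rAF loop makes it possible to feed it synthetic timestamps when experimenting.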

Related discussion:

https://twitter.com/igrigorik/status/714875557847040001

https://twitter.com/aerotwist/status/714876538387369984

PixelsCommander commented 8 years ago

Used this in https://github.com/PixelsCommander/PerFix a few years ago and dropped it in the end. The test results were too sensitive to the kind of animation being performed (JS/CSS, canvas/DOM, etc.), and it is easy to understand why...

igrigorik commented 8 years ago

A quick summary of the points raised in the above exchange (because twitter threading is full of fail):

- @paullewis and @jakearchibald both highlighted that rAF polling only captures the main thread; it does not account for compositor-driven frames (e.g. animations, scrolling, etc).
  - Earlier drafts of FT had a compositor interface and metrics, but they were removed in https://github.com/w3c/frame-timing/pull/48#issue-107599309 -- see second bullet point for an explanation behind the change.
- Assuming compositor events are out of the equation, why do we need Frame Timing?
  - (+) rAF polling gets us pretty far and is already available to all applications.
  - (-) Every interested party/script has to set up its own polling loop, instead of listening to notifications.
    - FT notifications can be delivered during idle time and be moved out of the critical path.
  - (-) Even if the rAF polling function is "fast", if used continuously it prevents requestIdleCallback from scheduling long idle blocks (i.e. blocks that exceed a single frame); similarly, the lack of long idle blocks prevents the browser from scheduling other long-running tasks (e.g. longer GCs and other background or speculative work).
    - If polling is toggled on/off, then you'll miss cases where the frame budget has been exceeded.
  - (-) Polling will report false positives in cases where the UA is using a variable update frequency, e.g. due to power, visibility status, etc.
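To illustrate the false-positive point, here is a rough sketch that judges frames against a fixed budget. The 60 Hz default, the 1.5x tolerance, and the helper names are assumptions made for the example. With those assumptions, a UA that deliberately updates at 30 Hz has every frame flagged as slow even though nothing is wrong.

```javascript
// Counts frames whose interval exceeds a fixed budget (60 Hz assumed).
// A frame counts as slow when its interval exceeds 1.5x the budget,
// which leaves some room for timer jitter.
function countSlowFrames(timestamps, budgetMs = 1000 / 60) {
  let slow = 0;
  for (let i = 1; i < timestamps.length; i += 1) {
    if (timestamps[i] - timestamps[i - 1] > budgetMs * 1.5) slow += 1;
  }
  return slow;
}

// Builds evenly spaced timestamps for a UA updating at `hz`.
function steadyFrames(hz, count) {
  return Array.from({ length: count }, (_, i) => (i * 1000) / hz);
}

const slowAt60 = countSlowFrames(steadyFrames(60, 61));
const slowAt30 = countSlowFrames(steadyFrames(30, 31));
console.log({ slowAt60, slowAt30 });
```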

Other thoughts, or comments?

pp-koch commented 8 years ago

The biggest advantage to me is that RAF is very easy to explain. People would get the point instinctively. If, later on, we find that something else is even better suited, we can just swap it in. People will be used to such detections by then, and just accept it. (We might be stuck with FPS as a unit, but that's not a really serious problem. It's good enough.)

Please explain why every script would have to set up its own polling. I envisioned my script as a generic one that just keeps track of current performance, and I currently do not see the need for more than one polling script. I added a simple note-keeping function so that users can keep track of what's going on - I hope that's going to be enough.

Good point about requestIdleCallback - but since it doesn't appear to be terribly well supported at the moment, I don't see it as a huge problem. Good point about power, screens turning off, etc. What happens seems to be browser-dependent. Needs more study - or maybe a logic branch that rejects extreme FPS changes.
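One way the "logic branch that rejects extreme FPS changes" might look, as a sketch only. The `maxRatio` threshold and the function name are hypothetical, and the approach has an obvious trade-off: a genuine sudden slowdown is rejected just like a throttled tab.

```javascript
// Returns a filter that accepts an FPS sample only if it stays within
// `maxRatio` of the last accepted sample. The threshold is a hypothetical
// knob, not a recommended value.
function createOutlierFilter(maxRatio = 2) {
  let lastAccepted = null;
  return function accept(fps) {
    if (fps <= 0) return false;
    if (lastAccepted !== null) {
      const ratio = fps > lastAccepted ? fps / lastAccepted : lastAccepted / fps;
      // An abrupt jump is treated as throttling rather than as jank.
      if (ratio > maxRatio) return false;
    }
    lastAccepted = fps;
    return true;
  };
}

const accept = createOutlierFilter();
console.log([60, 55, 1, 50].map((fps) => accept(fps)));
```

A real implementation would probably need to re-baseline after several consecutive rejections; otherwise a device that has permanently slowed down would never be reported.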

About the complicated stuff about the compositor: My animation test didn't appreciably hamper the FPS rate, so they're likely to be right. But if it doesn't hamper perceived performance, everything's fine anyway. If compositor threads, for whatever reason, start to hamper perceived performance, something else (a complicated script, most likely) starts to slow down, and that would be picked up by RAFP (I hope). So I don't see the compositor argument as very compelling. It's likely caused by too much focus on RAF's original purpose instead of what I'm actually trying to do with it.

It could be that I'm wrong in dismissing the compositor issues, but in that case I'd like to see a test case where my script clearly yields the wrong results so that I can better understand the issues.

And we just have to try stuff. If it turns out we can achieve the goal better through a slightly different method, let's do it. It just seems to me that, right now, in the current crop of browsers, RAF offers the best possibility of starting some serious, and easily explained, performance measuring.

spite commented 8 years ago

Here's a chrome extension WIP to inject a performance monitor based on rStats: https://github.com/spite/rstats-extension, if you want to try it on different pages and see how different events affect the values in rAF. A couple of recorded examples here and here

It tracks simple values out of rAF: time between calls, framerate, and time in call (which is at the minimum measurable, because it's running on an empty rAF loop).

It's based on the dev branch, so it's missing many features from rStats, which include threshold values for warnings. It'd be easy to hook those to an event handler. It can also track memory values where performance.memory is available.

For canvas-based animations, using rAF is a useful approach to identify bottlenecks. It's true that sometimes the browser is waiting for the GPU to end execution, but since there's no way of measuring that -yet-, this is probably the best we have for the time being, and across all browsers.

igrigorik commented 8 years ago

> Please explain why every script would have to set up its own polling. I envisioned my script as a generic one that just keeps track of current performance, and I currently do not see the need for more than one polling script.

As in, your own code only needs one instance. But other dependencies and widgets (social, analytics, ads, etc.) can't assume that the interface is available and, if this becomes a desired feature, would have to set up their own instance of such a mechanism. Once that's in play, we have a whole lot of rAFs firing to collect timestamps.

> Good point about requestIdleCallback - but since it doesn't appear to be terribly well supported at the moment, I don't see it as a huge problem. Good point about power, screens turning off, etc.

I don't think that's a sound argument for promoting an anti-pattern. I suspect we'll see rIC find its way into more browsers very soon.

> And we just have to try stuff. If it turns out we can achieve the goal better through a slightly different method, let's do it. It just seems to me that, right now, in the current crop of browsers, RAF offers the best possibility of starting some serious, and easily explained, performance measuring.

Absolutely, I'm not trying to discourage experimentation. If anything, I'm trying to clarify in my own mind where and how FT fits into this picture. And, actually, what would be really interesting is to investigate if and how we can polyfill FT: fall back to rAF polling where there is no better option, but use the native implementation where available, to avoid the pitfalls we described above.
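A sketch of what the feature detection for such a polyfill could look like. It assumes a native implementation would surface through `PerformanceObserver.supportedEntryTypes` with a `'frame'` entry type, as in the Frame Timing draft; that is an assumption for illustration and does not imply any browser ships it. The function name and return values are hypothetical.

```javascript
// Picks a frame-data source for a hypothetical FT polyfill.
// The 'frame' entry type is taken from the Frame Timing draft and is an
// assumption here; no shipping support is implied.
function pickFrameSource(env) {
  const Observer = env.PerformanceObserver;
  const types = Observer && Observer.supportedEntryTypes;
  if (Array.isArray(types) && types.includes('frame')) return 'native';
  // Fall back to rAF polling, with the caveats discussed in this thread.
  if (typeof env.requestAnimationFrame === 'function') return 'raf-polling';
  return 'none';
}

console.log(pickFrameSource(globalThis));
```

Passing the environment in as an argument keeps the detection testable outside a browser.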

@spite nice, thanks for the pointer! I'd love to figure out what we need to expose to enable a reasonable FT polyfill.

spite commented 8 years ago

It feels like it's a behaviour too deep into Blink/Gecko/etc. to be adequately polyfilled. Frame timing as values exposed by the rendering engine: that's great. Everything else is guesswork that is at best accurate under certain conditions: mostly when doing WebGL animations in which a canvas covers the whole screen and there are no other DOM elements.

Even with that, there's no guarantee of hitting a constant 60FPS (due to the nature of how frames are synced in the browser), and there are blocking processes outside the main thread, which can be debugged and explored with WTF, about:tracing, or similar tools.

We've been using widgets like stats.js or rStats to track performance and identify potential problems. But all the attempts, by many different people, to either ballpark the "power" of a device with rAF/setTimeout/etc. or to choose a different rendering path that disables things have had mixed results, and none has generalised into a drop-in solution. It's mostly used in combination with DevTools like Timeline or Audits.

FT can be polyfilled, but there are a few issues, in my opinion:

- the accumulation of rAF callbacks adds overhead when starting each frame;
- the cost of JS calls and measurement precision: here's a case we discussed with @paulirish that shows the offset between the time registered by the Timeline and the time registered by JS code: https://twitter.com/thespite/status/710578385144160256. At 16.66/11.11ms, some of this overhead represents a huge chunk of the frame budget.
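For reference, the 16.66 ms and 11.11 ms figures are the per-frame budgets at 60 Hz and 90 Hz refresh rates. The sketch below shows the arithmetic; the 0.5 ms overhead is a made-up figure used only to show how the share is computed, not a measurement from the linked thread.

```javascript
// Frame budget in milliseconds for a display refreshing at `hz`.
function frameBudgetMs(hz) {
  return 1000 / hz;
}

// Share of the frame budget consumed by `overheadMs` of measurement work.
function overheadShare(overheadMs, hz) {
  return overheadMs / frameBudgetMs(hz);
}

// 0.5 ms is a hypothetical overhead chosen purely for illustration.
for (const hz of [60, 90]) {
  const budget = frameBudgetMs(hz).toFixed(2);
  const share = (overheadShare(0.5, hz) * 100).toFixed(1);
  console.log(`${hz} Hz: budget ${budget} ms, 0.5 ms overhead = ${share}% of budget`);
}
```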

igrigorik commented 8 years ago

@spite that Twitter thread seems to indicate that the difference is ~0.05ms; what am I missing?

Do you have a list, or pointers, to concrete cases where rAF estimates led to "mixed results"?

jakearchibald commented 8 years ago

Taking a step back, I think the things we want to capture are:

- Time from specific action to related rendering update
- Rendering update average & standard deviation vs maximum update frequency of a specific item over a specific period

In terms of a real example: say the user clicks a hamburger icon. I'm interested in:

- Time from the click event to the next render update of the menu element - which would be the first frame of the animation
- Frame rate of the slide-in animation of the menu
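A rough approximation of the first measurement using only what is available today, as a sketch. It assumes `event.timeStamp` and the rAF timestamp share a time origin (true where both are high-resolution timestamps), and `#menu-button` is a hypothetical selector. It measures the delay until the next main-thread frame starts, not the moment the menu's pixels reach the screen, and it is not tied to a specific element.

```javascript
// Measures the delay from an input event to the next animation frame.
// `raf` is injected so the function can also run outside a browser.
function measureInputToFrame(event, raf, onResult) {
  raf((frameTime) => onResult(frameTime - event.timeStamp));
}

if (typeof document !== 'undefined' && typeof requestAnimationFrame === 'function') {
  // '#menu-button' is a hypothetical selector used for illustration.
  const button = document.querySelector('#menu-button');
  if (button) {
    button.addEventListener('click', (event) => {
      measureInputToFrame(event, (cb) => requestAnimationFrame(cb), (ms) => {
        if (ms > 100) console.warn(`slow response: ${ms.toFixed(1)} ms`);
      });
    });
  }
}
```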

As a developer I want to know:

- Is the first frame of the menu animation more than 100ms after the click event? That indicates a slow response. Note that I care specifically about the menu; an unrelated graphical update (like the next frame of a gif on the page) isn't interesting here.
- Is the slide-in of the menu happening at an acceptable frame rate? 30fps is fine on a 30Hz device. 30fps on a 60Hz device might not be fine. 50fps on a 100Hz device would also be fine. A standard deviation should give me an indication of jank (I think). Once again I'm interested in the menu specifically - if this animation prevents a gif updating, I'm fine with that.
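The average, standard deviation, and comparison against the display's maximum update frequency could be computed along these lines. This is a sketch: the function and field names are hypothetical, `displayHz` has to be supplied by the caller because the thread names no API for reading it, and the timestamps are whatever frame times happen to be available (for example from rAF, with the limitations already discussed).

```javascript
// Summarises frame pacing from a list of frame timestamps (ms).
// Returns null when fewer than two timestamps are supplied.
function framePacing(timestamps, displayHz) {
  const intervals = [];
  for (let i = 1; i < timestamps.length; i += 1) {
    intervals.push(timestamps[i] - timestamps[i - 1]);
  }
  if (intervals.length === 0) return null;
  const mean = intervals.reduce((sum, d) => sum + d, 0) / intervals.length;
  // Population variance of the frame intervals.
  const variance =
    intervals.reduce((sum, d) => sum + (d - mean) ** 2, 0) / intervals.length;
  const fps = 1000 / mean;
  return {
    fps,
    stdDevMs: Math.sqrt(variance),
    // Fraction of the display's maximum update rate that was achieved.
    shareOfMax: fps / displayHz,
  };
}

console.log(framePacing([0, 20, 40, 60, 80], 50));
console.log(framePacing([0, 10, 40, 50, 80], 50));
```

Both sample inputs average the same frame rate, but the second alternates short and long intervals, which shows up only in the standard deviation.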

The ultimate goal here is knowing when pixels become available to the user's eyeballs in relation to another event (eg input or previous frame).

One thing I haven't captured here is being able to detect scroll jank, not sure what's the best thing to do there.

igrigorik commented 8 years ago

> Taking a step back, I think the things we want to capture are:
>
> - Time from specific action to related rendering update
> - Rendering update average & standard deviation vs maximum update frequency of a specific item over a specific period

Correct me if I'm wrong, but today we don't have this capability (tracing when pixels were painted for a particular component) even within local profiling tools? It's not clear to me whether this is actually technically feasible in a cross-platform way and across all the browsers.

> ... The ultimate goal here is knowing when pixels become available to the user's eyeballs in relation to another event (eg input or previous frame).

Sadly, I don't think we can answer this via RUM. Not for technical reasons, but due to various privacy and security restrictions that we need to account for when exposing this data at runtime (as opposed to a local profiling environment) -- e.g. practically speaking, we can't give you direct feedback on how much time passed between event -> element paint because this opens up all kinds of gnarly attacks. It is for this reason that the current draft of FT exposes only slow frames and doesn't tell you what caused them.

What we can expose is the fact that the user is seeing slow frames.

jakearchibald commented 8 years ago

> Correct me if I'm wrong, but today we don't have this capability (tracing when pixels were painted for a particular component) even within local profiling tools?

Hm, maybe. Paging @paullewis.

> practically speaking, we can't give you direct feedback on how much time passed between event -> element paint because this opens up all kinds of gnarly attacks

Aren't these attacks already open?

igrigorik commented 8 years ago

> Aren't these attacks already open?

It's not just a question of whether it's possible, but also of how accurately and quickly it can be executed. Explicit feedback on when a pixel/section was painted can improve the accuracy and speed of many attacks... which can be a blocker. That said, this is a nuanced area, so we have to unpack each use case+metric and examine it as we go along.

rachelnabors commented 8 years ago

I thought there WAS a polyfill for this already. Just talked with a conference attendee who had a desperate need for this API but had not thought to do RAF polling. An API specifically intended to indicate a browser's potential--or lack thereof--to animate elements would be welcomed by laymen outside of framework development.

WICG / frame-timing

Clarify use-cases and motivation against rAF polling #53