servo / webrender

A GPU-based renderer for the web
https://doc.servo.org/webrender/
Mozilla Public License 2.0

Lift FastTransform to the type level #2869

Open kvark opened 6 years ago

kvark commented 6 years ago

Don't we already know at compile time whether the transformations are taken relative to the reference frame (and thus can be just an offset), or whether they are absolute (and have to be full 4x4 matrices)?
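
For context, a minimal sketch of the run-time shape in question, using simplified stand-in types rather than the actual FastTransform definition in util.rs: the offset-vs-matrix decision is carried in an enum and re-checked wherever the transform is used.

    // Simplified, hypothetical stand-ins for WebRender's euclid-based types.
    type Offset2D = (f32, f32);
    type Matrix4x4 = [[f32; 4]; 4];

    // Roughly the current run-time shape: every node's transform can be either
    // the cheap offset case or the expensive matrix case, so code that uses it
    // branches on the variant each time.
    enum FastTransform {
        Offset(Offset2D),
        Transform {
            transform: Matrix4x4,
            inverse: Option<Matrix4x4>,
        },
    }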

mrobinson commented 6 years ago

@kvark I don't think it's possible to know this at compile time and to preserve the optimization.

Take a look at: https://github.com/servo/webrender/blob/311c20eaa156a4e0816e2d9db8469d3e30b67c25/webrender/src/util.rs#L476-L483

We use this to convert resolved LayoutTransforms into FastTransforms.

kvark commented 6 years ago

@mrobinson I assume you mean this piece of logic:

        // Resolve the (possibly animated) transform property into a concrete matrix.
        let source_transform = scene_properties.resolve_layout_transform(&info.source_transform);
        // Build a FastTransform from the reference frame's origin in its parent,
        // then compose in the resolved transform and the perspective.
        info.resolved_transform =
            LayoutFastTransform::with_vector(info.origin_in_parent_reference_frame)
            .pre_mul(&source_transform.into())
            .pre_mul(&info.source_perspective);

I believe we have many fewer reference frames than nodes. My proposal is to keep the full transform (and its inverse) only on reference frame nodes, and have every other spatial node carry just an offset relative to its reference frame, with that distinction expressed in the types.

This would make our control flow more deterministic (not switching between offsets and matrices during something like an animation playing), debugging easier (one less layer to go through), types stronger, etc. The cost, it seems, is slightly less efficient handling of axis-aligned reference frames. Am I missing some caveat here?
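
A minimal sketch of what lifting this to the type level could look like, with hypothetical names; the point is that the representation is fixed per node kind rather than re-decided at run time.

    type Offset2D = (f32, f32);
    type Matrix4x4 = [[f32; 4]; 4];

    // Hypothetical: only reference frame nodes pay for a full matrix and its
    // cached inverse.
    struct ReferenceFrameTransform {
        to_world: Matrix4x4,
        from_world: Option<Matrix4x4>,
    }

    // Hypothetical: scroll/sticky nodes always store just an offset relative
    // to their enclosing reference frame, so there is no offset-vs-matrix
    // branch for them.
    struct RelativeTransform {
        reference_frame_index: usize,
        offset: Offset2D,
    }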

mrobinson commented 6 years ago

Since transforms are propagated down the tree, each spatial node would be holding at least one very expensive transform and inverse to get back to screen space. In addition, reference frame transforms are still likely to be 2d translations. As a demonstration, you can see that a large number of pages are using the 2d translation functions on the IE web platform feature stats page [1].

It is true that compatible coordinate systems now no longer include axis-aligned rotations, so it would be possible to store ClipNode::coordinate_system_relative_transform as a vector.

  1. https://developer.microsoft.com/en-us/microsoft-edge/platform/usage/css/transform/

kvark commented 6 years ago

each spatial node would be holding at least one very expensive transform and inverse to get back to screen space

Not necessarily each spatial node, but only the reference frame nodes.

As a demonstration, you can see that a large number of pages are using the 2d translation functions on the IE web platform feature stats page [1].

Thank you for the link! I wonder how common this is. I'd be fine with this sacrifice if we aren't bottlenecked on the inverse transforms there, given the code clarity it brings to the table.

gw3583 commented 6 years ago

I like the idea of this.

I think the number of reference frames is typically small enough that even though this might be a bit more expensive, the code clarity is worth it. It's possible it may even work out faster due to fewer branches.

I think this was previously something that showed up in profiles because we were doing it for every clip node, even though they don't introduce any new positioning information? With the transform palette changes, I benchmarked a number of sites - the typical number of reference frames on most pages is 5 - 15. On some very complex pages I was seeing ~30 reference frames, but this is rare and probably still small enough to make this change worthwhile (similar pages would have > 250 clip nodes, which made this optimization very helpful when we were dealing with matrices for all of those nodes).

mrobinson commented 6 years ago

Not necessarily each spatial node, but only the reference frame nodes.

Each spatial node has a transform that is calculated based on its parent node, so the first reference frame "infects" the tree. Are you imagining that non-reference frames would hold both the reference frame transformation/inverse and some offset?

Also, if each node holds either a real transform or one of the fast ones, doesn't that mean we still have to branch to get to the real transformation?

kvark commented 6 years ago

@mrobinson

Each spatial node has a transform that is calculated based on its parent node, so the first reference frame "infects" the tree.

Oh I see. It sounds like this is already the case in the current implementation, but only the non-2D transformations infect the tree.

Are you imagining that non-reference frames would hold both the reference frame transformation/inverse and some offset?

Yes, that's the logical step to take. We could have a much smaller number of inverse transforms then, at the cost of adding (or subtracting, when applying the reverse) the offset to the reference frame each time we need to go from local to world positions. So, naturally, my question would be: roughly how many times do we apply a node transformation (and its inverse) for an average node? In an extreme 1:1 case, for example, it's clear that keeping an offset to the reference frame would be beneficial.
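
As a rough sketch of that local-to-world path (hypothetical names, 2D points only): the node adds its accumulated offset, and the single expensive matrix apply happens once per enclosing reference frame; the inverse path subtracts the offset instead.

    type Point2D = (f32, f32);
    type Matrix4x4 = [[f32; 4]; 4]; // row-major, translation in the last column

    struct ReferenceFrame {
        to_world: Matrix4x4,
    }

    struct SpatialNode {
        // Accumulated 2D offset relative to the owning reference frame.
        offset: Point2D,
        reference_frame_index: usize,
    }

    // Hypothetical helper: apply a 4x4 transform to a 2D point, assuming w = 1
    // and ignoring the perspective divide for brevity.
    fn transform_point(m: &Matrix4x4, p: Point2D) -> Point2D {
        (
            m[0][0] * p.0 + m[0][1] * p.1 + m[0][3],
            m[1][0] * p.0 + m[1][1] * p.1 + m[1][3],
        )
    }

    // Local -> world: a cheap offset add per node, one matrix apply per
    // reference frame.
    fn local_to_world(frames: &[ReferenceFrame], node: &SpatialNode, p: Point2D) -> Point2D {
        let in_frame = (p.0 + node.offset.0, p.1 + node.offset.1);
        transform_point(&frames[node.reference_frame_index].to_world, in_frame)
    }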

Also, if each node holds either a real transform or one of the fast ones, doesn't that mean we still have to branch to get to the real transformation?

My understanding was that we already have this info at compile time: different node types for scrolling/sticky and reference frames, so there should be no need for a run-time branch.

mrobinson commented 6 years ago

My understanding was that we already have this info at compile time: different node types for scrolling/sticky and reference frames, so there should be no need for a run-time branch.

At some point every node is treated the same way in order to fill out the TransformPalette, so we would need to branch at that point to get the real transformation.
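
A sketch of that branch, with hypothetical names rather than the actual TransformPalette code (composition with the enclosing reference frame is omitted for brevity): when the palette is filled, every node has to be turned into a uniform matrix, so the offset case has to be promoted at that point.

    type Offset2D = (f32, f32);
    type Matrix4x4 = [[f32; 4]; 4]; // row-major, translation in the last column

    // Hypothetical per-node data under the proposed split.
    enum NodeKind {
        ReferenceFrame { to_world: Matrix4x4 },
        ScrollOrSticky { offset: Offset2D },
    }

    // Build a translation matrix from a 2D offset.
    fn translation(offset: Offset2D) -> Matrix4x4 {
        let mut m = [[0.0; 4]; 4];
        for i in 0..4 {
            m[i][i] = 1.0;
        }
        m[0][3] = offset.0;
        m[1][3] = offset.1;
        m
    }

    // Filling the palette is the one place where every node is treated the
    // same way: branch once here per node to produce a uniform matrix.
    fn fill_palette(nodes: &[NodeKind]) -> Vec<Matrix4x4> {
        nodes
            .iter()
            .map(|node| match node {
                NodeKind::ReferenceFrame { to_world } => *to_world,
                NodeKind::ScrollOrSticky { offset } => translation(*offset),
            })
            .collect()
    }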

kvark commented 6 years ago

@mrobinson ah, ok. So this would be a once-per-node branch then, as opposed to a branch on every use of the node.