Proposal for V2: occlusion detection

szager-chromium commented 6 years ago

I propose that the IntersectionObserver spec be extended to allow for detection of occlusion by other content.

Here's an early draft of the proposed spec change:

http://szager-chromium.github.io/IntersectionObserver/

Special attention should be paid to this section, which describes the heuristics for occlusion detection:

http://szager-chromium.github.io/IntersectionObserver/#calculate-visibility-algo

The intention with that language is to make it possible to implement the feature efficiently, and to make it maximally useful for the anticipated primary use cases. Here's a small slide deck explaining the motivation and anticipated use cases for this feature:

https://docs.google.com/presentation/d/13-M2eqNKnClEPXiEQK2iwvnk3njqssj4OzDuZCSe_jQ/edit?usp=sharing

chrishtr commented 6 years ago

In terms of efficient implementation, we at Chrome Rendering think it should be relatively straightforward to implement this spec. We're going to start prototyping shortly.

Here is some pseudocode:

bool TestVisibility(element) {
  if (root viewport transform of element is not identity or translation)
    return false;

  if (any filters or non-1.0 opacity is applied on any ancestor of element)
    return false;

  Rect root_viewport_rect = RootViewportVisualRect(element);

  if (HitTest(root_viewport_rect) != [element])
    return false;
  return true;
}

Rect RootViewportVisualRect(element) {
  Return the bounding box of: border box rect of |element|, transformed and clipped, in the root frame's viewport space. In other words, a rect that encloses all pixels which are drawn into by the element.
}

ElementList HitTest(root_viewport_rect) {
  Return list of elements that are the hit test results for a hit test at any integer point within the bounds of root_viewport_rect
}

A rect-based HitTest and RootViewportVisualRect are already implemented in Chrome, and should hopefully already be present in other browsers as well. Edge even has a rect-based hit testing API exposed to script (msElementsFromRect) already.

In Chrome, the rect-based hit testing is already needed for gesture disambiguation on Android, for example. RootViewportVisualRect is used for raster invalidation.
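
As a rough illustration of the pseudocode above, here is a minimal TypeScript sketch over a toy scene. It is not Chromium's implementation: the `PaintedBox` model, the back-to-front `paintOrder` array, and all function names are assumptions made for the example, and every rect is assumed to already be in root viewport space.

```typescript
// Toy model only: these types and names are hypothetical, not browser internals.
interface Rect { x: number; y: number; width: number; height: number; }

interface PaintedBox {
  id: string;
  rect: Rect; // assumed to already be in root viewport space
  transformIsIdentityOrTranslation: boolean;
  ancestorHasFilterOrOpacity: boolean;
}

function containsPoint(r: Rect, px: number, py: number): boolean {
  return px >= r.x && px < r.x + r.width && py >= r.y && py < r.y + r.height;
}

// Point hit test: paintOrder is back-to-front, so scan from the end.
function hitTestPoint(paintOrder: PaintedBox[], px: number, py: number): string | null {
  for (let i = paintOrder.length - 1; i >= 0; i--) {
    if (containsPoint(paintOrder[i].rect, px, py)) return paintOrder[i].id;
  }
  return null;
}

// Rect hit test: ids of every box that is topmost at some integer point in rect.
function hitTestRect(paintOrder: PaintedBox[], rect: Rect): string[] {
  const hits: string[] = [];
  for (let y = Math.ceil(rect.y); y < rect.y + rect.height; y++) {
    for (let x = Math.ceil(rect.x); x < rect.x + rect.width; x++) {
      const id = hitTestPoint(paintOrder, x, y);
      if (id !== null && hits.indexOf(id) === -1) hits.push(id);
    }
  }
  return hits;
}

// Mirrors TestVisibility: any doubt yields false.
function testVisibility(paintOrder: PaintedBox[], target: PaintedBox): boolean {
  if (!target.transformIsIdentityOrTranslation) return false;
  if (target.ancestorHasFilterOrOpacity) return false;
  const hits = hitTestRect(paintOrder, target.rect);
  return hits.length === 1 && hits[0] === target.id;
}
```

Sampling every integer point is only for clarity; a real engine would presumably intersect rects directly rather than iterate over pixels.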

ojanvafai commented 6 years ago

Bikeshedding: s/isVisible/isGuaranteedVisible/ to make it more clear what it's actually measuring.

dbaron commented 6 years ago

So I don't think Gecko has rect-based hit-testing code, so we'd need to implement it in order to implement this feature. That makes it seem like a pretty substantial feature from our end. (What other web features does your rect-based hit testing support?)

It would be interesting to hear about the use cases for this. Is this something that sites (ads?) are widely simulating today using JS APIs, in ways that perform poorly? Or is it rarely simulated but more requested, or just requested?

Some of the use cases seem to be about checking visibility for various security use cases. If that's a use case, then it creates a new class of security vulnerability: false positive reports in this API. How confident are you that that space is easy to secure? It seems like there are a lot of web features that can occlude or clip things.

szager-chromium commented 6 years ago

We use rect-based hit testing to figure out the target of touch events.

It's not necessary to implement rect-based hit testing, but it's a convenient path to implementation if it's present. My original implementation idea -- and we may yet return to it -- is to first walk the graphics layers and look for overlap with graphics layers that paint on top of the target element; and then to walk the list of painted objects in the target graphics layer, starting from the target element, and check for overlaps.
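
The two-step walk described above could look roughly like the following TypeScript sketch. The `GraphicsLayer` and `PaintedObject` shapes are hypothetical stand-ins, all rects are assumed to share one coordinate space, and both lists are assumed to be in back-to-front paint order.

```typescript
// Hypothetical data model for illustration; not Chromium's actual layer tree.
interface Rect { x: number; y: number; width: number; height: number; }
interface PaintedObject { id: string; rect: Rect; }
interface GraphicsLayer {
  bounds: Rect;
  objects: PaintedObject[]; // back-to-front paint order within the layer
}

function intersects(a: Rect, b: Rect): boolean {
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

// layers is back-to-front; the target is layers[layerIndex].objects[objectIndex].
function isPossiblyOccluded(
  layers: GraphicsLayer[],
  layerIndex: number,
  objectIndex: number,
): boolean {
  const target = layers[layerIndex].objects[objectIndex];

  // Step 1: any layer painted on top whose bounds overlap the target.
  // Using whole-layer bounds is conservative: it can report occlusion
  // that is not really there, but never misses a real overlap.
  for (let i = layerIndex + 1; i < layers.length; i++) {
    if (intersects(layers[i].bounds, target.rect)) return true;
  }

  // Step 2: objects painted after the target within its own layer.
  const siblings = layers[layerIndex].objects;
  for (let j = objectIndex + 1; j < siblings.length; j++) {
    if (intersects(siblings[j].rect, target.rect)) return true;
  }
  return false;
}
```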

The primary motivation is to eliminate common patterns of fraud and abuse on the web, and to enable trust relationships between embedded third-party iframes and their host documents. Specifically, it gives the iframe a strong guarantee that its content is visible on screen, and has not been painted over or altered in any way by the embedding document.

Some of the likely use cases are shown in the slide deck, including:

Click-jacking. Any iframe that receives a click can never be sure that it was visible to the user when it was clicked. Think of embedded 'like' or 'share' or 'sign in with' or 'pay with' widgets.
Ad stacking, a form of ad fraud where multiple ads are displayed stacked on top of each other. Only the top-most ad is visible, but all of the advertisers are charged for the impression.

Given the motivation, the idea is that a conforming implementation may sometimes give false negatives (i.e., report a target as "possibly occluded" when it's actually visible), but must never give a false positive (i.e., report a target as "definitely visible" when it's actually occluded). We're pretty sure that's achievable in chromium, but I don't know enough about other browser implementations to know how difficult it would be for them.
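
For the click-jacking case, an embedded widget might consume the signal along these lines. This is a minimal sketch, not a recommended design: `ClickGuard` is a hypothetical helper, and the entry shape only mirrors the `isIntersecting` and `isVisible` fields it reads. Anything short of an explicit `isVisible === true` is treated as "possibly occluded", which matches the no-false-positives bias described above.

```typescript
// Only the fields this sketch reads; real entries carry more.
interface VisibilityEntry {
  isIntersecting: boolean;
  isVisible?: boolean; // absent where the V2 feature is not supported
}

// Hypothetical helper: remembers whether the latest report guaranteed visibility.
class ClickGuard {
  private guaranteedVisible = false;

  update(entry: VisibilityEntry): void {
    // A missing or false isVisible is treated the same: no guarantee.
    this.guaranteedVisible = entry.isIntersecting && entry.isVisible === true;
  }

  shouldAcceptClick(): boolean {
    return this.guaranteedVisible;
  }
}

// Browser wiring, shown as a comment so the sketch stays runnable outside a page.
// It assumes the `trackVisibility` and `delay` observer options from the V2 draft:
//
//   const guard = new ClickGuard();
//   const observer = new IntersectionObserver(
//     (entries) => entries.forEach((e) => guard.update(e)),
//     { trackVisibility: true, delay: 100 },
//   );
//   observer.observe(payButton);
//   payButton.addEventListener("click", (ev) => {
//     if (!guard.shouldAcceptClick()) ev.preventDefault();
//   });
```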

szager-chromium commented 6 years ago

I would also add that, to my knowledge, there is no existing functionality that comes close to what V2 would offer. This is in contrast to V1, which mostly offers information that can be had by other means. I believe V2 would be an entirely novel capability.

chrishtr commented 6 years ago

> It would be interesting to hear about the use cases for this. Is this something that sites (ads?) are widely simulating today using JS APIs, in ways that perform poorly? Or is it rarely simulated but more requested, or just requested?

We have spoken with engineers who work on ad tracking systems. It is clear that all large ad tracking libraries today try to detect occlusion and other click jacking techniques. In addition, there is almost certainly an arms race between them and those who try to manipulate the system.

For these reasons, even with IOv1, ad trackers still need to run a setTimeout polling operation to try to detect occlusion, and in the setTimeout callback they run a lot of expensive code. One example we have seen is that they use the elementFromPoint API at the center point of an ad to simulate a hit test. (This is both expensive and inaccurate, because it can't cross cross-origin frame boundaries.)
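
The center-point polling described above can be sketched as follows. This is an assumed reconstruction of the general pattern, not code from any particular ad library; `looksUnoccludedAtCenter` and the injected `elementFromPoint` function are hypothetical names, used so the logic can run outside a browser.

```typescript
interface ClientRect { left: number; top: number; width: number; height: number; }

// Injected so the sketch is testable; in a page this would wrap document.elementFromPoint.
type ElementFromPoint<T> = (x: number, y: number) => T | null;

function centerOf(rect: ClientRect): { x: number; y: number } {
  return { x: rect.left + rect.width / 2, y: rect.top + rect.height / 2 };
}

// True if the ad itself is the topmost element at its own center point.
// A real check would likely also accept descendants of the ad element.
function looksUnoccludedAtCenter<T>(
  ad: T,
  rect: ClientRect,
  elementFromPoint: ElementFromPoint<T>,
): boolean {
  const c = centerOf(rect);
  return elementFromPoint(c.x, c.y) === ad;
}

// Browser wiring (comment only; the 250 ms interval is an arbitrary assumption):
//
//   function poll() {
//     const ok = looksUnoccludedAtCenter(
//       adElement,
//       adElement.getBoundingClientRect(),
//       (x, y) => document.elementFromPoint(x, y),
//     );
//     report(ok);
//     setTimeout(poll, 250);
//   }
```

Besides the cross-origin limitation, a single sample point says nothing about the rest of the ad: an overlay covering only a corner goes unnoticed.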

Ad tracking systems that perform such a hit test will of course only get faster if they use the built-in hit testing we plan to implement in IOv2, which we can highly optimize (as szager mentioned) and run at times that don't impact UX, or even on different threads or processes if necessary.

dakami commented 6 years ago

I can confirm with absolute confidence that the clickjacking problem is a mess, there's no existing way to solve it, and people are deploying terrible code all the time in an attempt to try. I ended up getting pretty involved in the subject, building an early PoC to show something could be done graphically to address the issue (code at https://github.com/dakami/ironframe , slides at https://dankaminsky.com/2015/08/09/defcon-23-lets-end-clickjacking/ ).

This is very cool.

jakearchibald commented 5 years ago

For others following this issue: the spec is currently at http://w3c.github.io/IntersectionObserver/v2/

szager-chromium commented 5 years ago

The V2 spec proposal shipped in chromium earlier this year:

https://chromestatus.com/features/5878481493688320

zcorpan commented 1 year ago

AFAIK there still isn't a spec for hit testing (see https://w3c.github.io/csswg-drafts/css-ui/#issue-bdab65a4), although https://w3c.github.io/csswg-drafts/cssom-view/#dom-document-elementsfrompoint exists which depends on hit testing and is implemented across the board https://caniuse.com/mdn-api_document_elementsfrompoint

zcorpan commented 1 year ago

Hmm however elementsFromPoint only does point-based hit testing, not rect-based hit testing.

chrishtr commented 1 year ago

> AFAIK there still isn't a spec for hit testing (see https://w3c.github.io/csswg-drafts/css-ui/#issue-bdab65a4), although https://w3c.github.io/csswg-drafts/cssom-view/#dom-document-elementsfrompoint exists which depends on hit testing and is implemented across the board https://caniuse.com/mdn-api_document_elementsfrompoint

You are right that there is no definition of hit testing yet in a specification. Likewise, there is not yet a precise definition in a spec of ink overflow (see w3c/csswg-drafts#8649). Previous discussions of IntersectionObserver occlusion testing have blocked on this problem, because making it precisely defined impacts interoperability.

If we can agree on the usefulness of occlusion detection and the definition in the "v2" IntersectionObserver spec, then we can go and define hit testing and visual overflow elsewhere as needed in order to support specifying occlusion.

> Hmm however elementsFromPoint only does point-based hit testing, not rect-based hit testing.

Yes. However, even before this feature, Chromium already needed to do rect-based hit testing for the purpose of mobile hit testing, where user touches are not precise enough to use a single point. The definition of the rect-based hit test is a simple generalization of a point-based hit test.

szager-chromium commented 1 year ago

Note that the proposed spec doesn't mention hit testing; it says: "If the implementation cannot guarantee that the target is completely unoccluded by other page content, [set isVisible to] false."

We found it convenient to use the existing hit testing code, with some modifications, to implement this functionality in chromium. But occlusion detection requires somewhat different behavior from the hit-testing used for elementsFromPoint and event targeting, so even if there were a spec for hit testing, it would need exceptions.

chrishtr commented 1 year ago

The WG held a meeting to discuss this issue this week.

Outcomes:

Continue to discuss the occlusion feature; reception has been generally positive so far.
Implementation notes:
- Should make sure that the delay feature doesn't leave sites in a persistent stale state without up-to-date notifications
- delay should be broken out into a feature that is of independent value

Notes


Stefan: Designed this feature and talked to a lot of people about security concerns and use cases. Have some strong evidence from Google partners that it works well for security purposes. Used on about 5% of page loads in Chrome. Feedback received from Google-internal teams is quite positive about the feature in practice.
Stefan: one use case is a sign-in widget in an iframe on an untrusted site, without having a popup window or other types of interstitial.
Emilio: is the use case to guarantee that nothing at all is on top of the element?
Stefan: nothing painted on top at all, including blurs and non-zero opacity. Does not block a malicious site from taking events, but they can’t draw on top.
Stefan: there are some cases where the iframe is actually visible but for some reasons the site uses an innocuous type of rounded corner etc.
Stefan: spec is biased towards guaranteeing no false positives (i.e. “reported visible but is actually occluded”). There are some corner case implementation-dependent things where browser interop might not happen at present, such as precisely defining blur radius of a box shadow.
Chris: is blur radius or filter radius the only non-interoperable issue?
Stefan: drop shadows also.
Chris: if those are the only cases, then I think it’s doable to specify exactly.
Emilio: choice of fonts can vary by OS or browser, so the outcome may be different on the same page for different browsers.
Chris: if sites really want to get the same output, they can use a web font / override ascents & descents. 
Emilio: have not heard demand for this feature outside of Google teams. Probably not opposed to implementing this, but is there more demand from elsewhere?
Stefan: occlusion testing is a genuinely novel feature that cannot be computed in any other ways (as opposed to v1, which is polyfillable)
Stefan: hit testing is expensive compared with other parts of IO, and so it’s performance-positive to avoid work through integration with that feature.
Chris: we’ve also been able to tune the implementation over time to throttle cases where IO was using up too much CPU time in the aggregate.
Chris: Emilio and Simon, does all this sound compelling?
Emilio: would be good to demonstrate more developer demand for it.
Stefan: have heard from external developers who want the feature, will pass along the feedback.
Stefan: there is a “delay” feature for how often measurements happen, and if occlusion testing is on we ignore values less than 100ms. Think it would be good to add the delay feature independent of occlusion testing, since that allows developers to tune their own performance.
Emilio: what about cases where the delay leaves the page in a persistent error state?
Chris: let’s split out delay into its own separate feature request from occlusion testing.
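
To make the delay discussion concrete, here is a small TypeScript sketch of the two points raised in the notes: a 100 ms floor when occlusion tracking is on, and a throttle that never discards the newest state, so observers cannot stay stale once the delay has elapsed. All names (`effectiveDelayMs`, `DelayedNotifier`) are hypothetical, and clamping is only one reading of "ignore values less than 100ms"; an implementation could instead reject such values.

```typescript
const MIN_VISIBILITY_DELAY_MS = 100;

// Assumption: too-small delays are clamped rather than rejected.
function effectiveDelayMs(trackVisibility: boolean, requestedDelayMs: number): number {
  const requested = Math.max(0, requestedDelayMs);
  return trackVisibility ? Math.max(MIN_VISIBILITY_DELAY_MS, requested) : requested;
}

// Throttles notifications to one per delayMs, but always keeps the newest
// state pending so it is delivered as soon as the delay allows.
class DelayedNotifier<T> {
  private readonly delayMs: number;
  private lastSentAt: number = Number.NEGATIVE_INFINITY;
  private pending: { state: T } | null = null;

  constructor(delayMs: number) {
    this.delayMs = delayMs;
  }

  // Record a new state; returns it if it may be delivered right away.
  push(state: T, now: number): T | null {
    this.pending = { state };
    return this.flush(now);
  }

  // Call periodically (e.g. once per frame) to deliver a held-back state.
  flush(now: number): T | null {
    if (this.pending === null || now - this.lastSentAt < this.delayMs) return null;
    const { state } = this.pending;
    this.pending = null;
    this.lastSentAt = now;
    return state;
  }
}
```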

szager-chromium commented 10 months ago

Updates...

This feature is currently being used on ~6% of page loads in Chrome, and by ~8% of top websites, which is I think a pretty strong indicator of web developer interest, especially considering the feature is only available in chromium-based browsers. It was proposed as an Interop 2024 goal (but not adopted).

The biggest implementation concern raised at prior meetings was the challenge of making this feature interoperable without a detailed specification of ink overflow. Hoping to make headway on that issue...

szager-chromium commented 6 months ago

I've attempted to address the interop concerns with a few PRs that specify the extent of ink overflow:

https://github.com/w3c/csswg-drafts/pull/9823 https://github.com/w3c/csswg-drafts/pull/9824 https://github.com/w3c/csswg-drafts/pull/9842 https://github.com/w3c/csswg-drafts/pull/10085

The first two PRs have been merged. There is currently discussion on csswg-drafts#8649 about whether the last two should be accepted; we hope to get resolution at next week's CSSWG meeting.

I'd like to ask for feedback from the participants on this issue as to whether those PRs adequately allay the interop concerns around relying on ink overflow to determine occlusion. I've created a PR that adds the V2 feature set to the spec, and I'd like to move forward with it if there is agreement.

szager-chromium commented 4 months ago

Members of the WebApps WG met on May 9 to discuss this proposal. No significant objections were raised, and @emilio agreed to review the spec PR and work towards getting a formal standards position from Mozilla. A few specific topics that were discussed:

ink overflow: the only real area of concern is ink overflow from text and text decorations; all other sources of ink overflow are trivially interoperable. Because there is already significant variation in text rendering between implementations, it's difficult to make absolute guarantees about interoperability. However, the potential for non-interoperable behavior in practice seems to be quite small, and this is supported by our experience with chromium's implementation of the feature.
testing: We need a robust set of WPT to ensure consistent behavior, especially with respect to ink overflow. There is already some coverage, including some tests that target ink overflow, but we anticipate adding more.
implementation details: We discussed chromium's implementation of the feature, which relies on the same hit-testing code used to target input events. Chromium supports hit testing a rectangle rather than a single point; returning an ordered list of hit targets rather than a single one; and using ink overflow rectangles rather than layout rectangle for determining overlap. @emilio thought it would be feasible to do something similar in gecko.
process-isolated iframes: in chromium's implementation, if any part of a process-isolated iframe is occluded, then all content inside the iframe is treated as potentially occluded. This was a practical decision in the chromium implementation, and falls within the proposed spec's allowance for false negatives. We all thought it likely that other implementations will use the same approach, for the same practical reasons; but should other implementations behave differently, it could be a source of non-interoperability. There is an existing test for occlusion of a cross-origin iframe, but more coverage here is a good idea.
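
The process-isolated iframe rule above reduces to a small decision function. The sketch below is only an illustration of that rule under assumed names (`FrameState`, `reportIsVisible`); it does not describe how chromium actually structures the check.

```typescript
// Hypothetical summary of what the embedding process knows about a frame.
interface FrameState {
  processIsolated: boolean;
  anyPartOccluded: boolean; // true if any part of the iframe is covered
}

// targetUnoccludedLocally: result of the occlusion check inside the frame itself.
function reportIsVisible(frame: FrameState, targetUnoccludedLocally: boolean): boolean {
  // For a process-isolated frame, partial occlusion of the frame marks
  // everything inside it as potentially occluded (an allowed false negative).
  if (frame.processIsolated && frame.anyPartOccluded) return false;
  return targetUnoccludedLocally;
}
```

The trade-off is that a target in an untouched corner of a partially covered iframe is reported as not visible, which errs in the direction the proposed spec allows.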

w3c / IntersectionObserver

Proposal for V2: occlusion detection #295