Canvas hit regions proposal

annevk commented 6 years ago

Here is a sketch and some possible commented out v2 additions. I've tried to come up with the most minimal API that's still extensible for future needs. At this point it seems better to have something than go many more years without. Unfortunately since the discussion around canvas is scattered all over the place I haven't been able to find that much background information and mostly used #1030 to inform myself.

partial interface HTMLCanvasElement {
  [SameObject] readonly attribute HitRegionSet regions;
};

interface HitRegionSet {
  void add(optional AddHitRegion options); // return HitRegion in the future
  void clear();
  void delete(Element target);
  // void delete(HitRegion target);
  // iterable<HitRegion>; IDL only allows this for objects with indexed getters?
};

// [Constructor(HTMLCanvasElement canvas, optional AddHitRegion options)]
// interface HitRegion {
//   [SameObject] readonly attribute HTMLCanvasElement canvas;
//   [SameObject] readonly attribute Element target;
//   Path2d getArea();
// };

dictionary AddHitRegion {
  required Element target;
  required Path2D area;
};

Putting it on the canvas element directly makes the API suitable for 2D and WebGL, but the drawback is that it will be slightly more work for developers as we cannot take advantage of the current path and transformation as the initial proposal did.

The idea is basically that the hit regions are added in order, the last one added becoming topmost. Any UI events for the region go the region's element and do their usual thing from there (e.g., bubbling). This is basically a modification of the non-standardized hit testing algorithm. (I don't want to block this on getting that defined. We'll just use some words, add tests, and hope for the best.)

For the cursor I was thinking to use https://drafts.csswg.org/cssom/#resolved-value on the region's element (i.e., getComputedStyle()).

cc @whatwg/canvas @whatwg/a11y @smaug----

asurkov commented 6 years ago

What is a region's element for? If HitRegion implemented EventTarget interface, then would you need to have a region's element?

junov commented 6 years ago

The element is for accessibility purposes. You can have form controls, for example, in the canvas's fallback content that are associated with hit regions. This allows people with a wide range of disabilities to interact with the the web app (something that canvases are notoriously bad at).

marcysutton commented 6 years ago

What happens if the hit region paths overlap? Or does it not matter for user interactions because the events bubble?

AmeliaBR commented 6 years ago

Assorted thoughts...

Following on the discussion in #1030, I do agree that this keeping track of hit regions and handling event propagation should be a part of the <canvas> element and its DOM children, and not be specific to the context used to draw on it.

However, your proposal is still very specific to Context2D. To make it really useful for WebGL, however, you would need methods to convert from 3D objects to Path2D hit regions (and sort them by z-order) based on the current rendering. I don't think something like that currently exists, and I worry that it could be a performance blocker to create (correct me if I'm wrong on either point).

Would it instead be possible to define the AddHitRegions dictionary in such a way that -- while the set of regions and their associations with DOM elements is a core Canvas feature -- the geometry and z-ordering is defined with an abstract interface that could have multiple implementations? That could leave open the possibility of a WebGL (or other future context) implementation. (But ideally, the basic 2D implementation to be available with WebGL drawing contexts until an integrated 3D method is implemented, so there would need to be some way of specifying which to use...)

For the 2D definition of region geometry, if it's not going to automatically use the current transformation context, it would be helpful to have an option to pass in a transformation matrix (instead of having to generate a copy of the path with the transformation applied).

As you've defined it in the simpler form (without the commented-out bits), the hit region set would seem to be a transient thing that you'd clear and replace completely whenever the layout changed, possibly on every frame.

To create a more persistent object model, in addition to defining the HitRegion object (as defined in the commented-out bits above), you would want methods to re-order the set, and to update the path/transform on the region.

I like the idea of using CSS cursor property for the associated element. In general, I like the idea of tying interactivity to real DOM elements. It will hopefully encourage more use of accessible DOM children of <canvas> to represent the interactive components & text content of a canvas drawing.

But I don't know if implementations currently calculate CSS styles for these elements. Maybe they should. As an author, I can see all sorts of useful cases for using those elements to maintain styles that would coordinate with the rest of the page--whether that's heading fonts or whether that's CSS variables--which you could query & use in your drawing code. And maybe update those styles with interactive pseudoclasses like :focus or :checked. But then, I'd also expect things like :hover and :active to work if that element is associated with a hit region.

Which is to say: I like the idea. But it's probably worth thinking through how far you want to take the idea. It's hard to add just a little bit of CSS functionality and stop there.

It might be useful to have a method to look up a region based on its associated element.

Somewhere, there needs to be defined how the resulting Event objects should be initialized when it comes to x/y positions: relative to the Canvas, or relative to the individual hit region's bounding boxes? In SVG, it is currently poorly specified and inconsistent between browsers. Ideally, the behavior for describing events on canvas hit regions would be spec'd in a consistent way to events on SVG shapes.

Would you be able to define multiple regions associated with the same control element? (That would make getters/delete methods based on the element rather complicated.) I suppose you could always just create a merged path...

Is there any interest in trying to define this interface(s) in a way that also applies to HTML image maps? There are certainly similarities in the idea of having a set of unrendered interactive elements that are associated with pointer events over specific geometries on a rendered graphic. But maybe image maps are just a legacy feature that needs to die now that we have better solutions...

All complexities aside, I really hope that this moves forward.

AmeliaBR commented 6 years ago

@marcysutton

What happens if the hit region paths overlap?

From @annevk's post:

The idea is basically that the hit regions are added in order, the last one added becoming topmost.

I assume from that, the idea is that overlapping regions work the same as overlapping rendered elements: the last/top one grabs the pointer event. (Which is why I mentioned that it would be useful to have methods for re-ordering regions other than clearing everything and re-creating the stack.) Bubbling would be separate, based on the DOM tree.

annevk commented 6 years ago

Thanks @AmeliaBR! Some thoughts:

The path bit probably needs more thought. Associating a transform with the path might be needed for a successful v0 and indeed future versions should make it easier to deal with 3D. Instead of path we could call it pixels or some such as eventually that's what we want. A collection of pixels for which this is the hit region. In any event, even with the current setup it would be possible for a future version to support more options here.
I don't think a single element having multiple hit regions is a problem. That seems fine and shouldn't affect their methods or some such. (Though this does mean that events are all relative to the canvas element's origin, which I think is fine and the simplest approach. I don't think we need to care about SVG here as SVG has an actual object model and therefore different considerations. And image maps are indeed best left to their own devices. I don't see any simplification there though they can likely be polyfilled in terms of this API.)
The canvas element already has a couple of one-off CSS integrations. I suspect using cursor here will therefore not be as much of a slippery slope as you might think. (And note that getComputedStyle() works for any element, even out of a tree.)
And finally, I think you're definitely right that a future version will need more capabilities, including reordering and more precise insertion. What I'm trying to find is the most minimal proposal everyone is comfortable shipping soon so developers can start creating more accessible canvas-based experiences and give us feedback on the pain points. When an initial version has too many features it's highly likely they will not be implemented all, or poorly, or it simply won't come off the ground.

@asurkov the main reason for having a backing element is indeed accessibility and obtaining it in an easy way. We already have all the ARIA machinery and assistive technology integration in place for elements. Replicating that to non-elements would end up being quite a bit more complex and delay this kind of feature a lot or perhaps worse, prevent it from getting off the ground.

asurkov commented 6 years ago

@annevk I see the point: if targeting to a minimal API, then it is perfectly fine to go with Element to address accessibility needs. However, I would like to bring AOM virtual node concept [1] to your attention that may have a good fit for subsequent interactions of the API.

Roughly the idea is the author defines an accessibility object instead dealing with ARIA and DOM, like

hit_reg_set.add(path, { role: 'button', name: 'Enroll!' });

[1] https://wicg.github.io/aom/explainer.html#phase-3-virtual-accessibility-nodes

annevk commented 6 years ago

Cool!

It sounds like we should rename both path and element then to be more future proof. I suggest area to describe a set of pixels (still restricted to Path2D for now) and control to describe something that can convey meaning to AT and is capable of receiving events (restricted to Element for now; can be made to accept AccessibleNode in the future). (target might be a more neutral name as it does not necessarily have to be a control.)

(My above suggestion about needing a transform for the set of pixels can be discarded as Path2D's addPath() already accounts for that.)

smaug---- commented 6 years ago

Looks quite reasonable to me as the v1. add vs. delete mismatch is perhaps something to think. If one can add same element several times (but with different area), why delete doesn't work the same way.

perhaps add and delete could have tiny bit different params void add(HitRegion region); void delete(HitRegionForDelete region);

dictionary HitRegion { required Element target; required Path2D area; };

dictionary HitRegionForDelete { required Element target; Path2D area; };

The delete would delete all the regions mapped to target if area isn't passed.

asurkov commented 6 years ago

Would it be easier if add() returned ID of an inserted region and delete() would take that ID to remove the region? ID could be also used to retrieve the regions back. Otherwise HitRegion object itself could serve as ID, similar to EventListener in addEventListener/removeEventListener.

AmeliaBR commented 6 years ago

If there is a serious possibility of defining HitRegion objects in the future, it seems problematic to overly complicate the add/delete APIs in other ways. In my opinion the sanest options are:

Make the control element a unique key for each hit region; if you add a new region with the same control it either replaces or merges or errors out (to be decided).
Define the HitRegion interface, even for v1, and use it for delete methods.
Replace the delete method in v1 with a clearRegionsFor(element) that deletes all hit regions associated with that control. (A single-region delete method could still be added later if & when a HitRegion interface is added.)

annevk commented 6 years ago

My thinking was that it would be acceptable for delete() to remove multiple regions at once (all those sharing a single target). A future revision could give more detailed control.

minorninth commented 6 years ago

Are there any real-world apps that you think could use this API if it existed?

I like this idea in theory, if there was a good example of a canvas app that needs it. I'm just not sure if this solves the problem for the majority of canvas apps I'm aware of.

For example, Google Spreadsheets currently uses canvas to render the main grid, for performance reasons. To make use of hit regions, it seems like they'd have to clear and recreate hundreds of hit regions every time the users scrolls. It's not clear that there's any benefit in rewriting the code to work this way. Presumably there's already existing code that takes mouse coordinates and computes the hit cell, and I don't see how maintaining all of those hit regions would be easier.

I like that the API is generalized to support WebGL. An obvious candidate for that would be Google Maps. However, I don't think we could provide any sort of API for projecting 3-D objects into 2-D space. WebGL is a very low-level API, it doesn't necessarily "know" the geometry of the 3-D world that's being rendered. Note that game engines already provide hit testing support. I'm not sure what value a hit region API would add - any possible benefit would be outweighed by the complexity of turning complex 3-D objects into 2-D paths and sorting them into the right Z-order. It's also not hard to imagine a scenario where this simplification would return wrong results - a true 3-D ray cast would be the only way to always get the correct object.

For accessibility I think AOM (https://github.com/WICG/aom) is a more general-purpose solution - in particular the virtual node hierarchy proposed in phase 3. It generalizes to custom-drawn views that don't happen to use canvas at all - like the SVG-based layout used by Google Slides or Apple's iWork online.

I think I'd be more convinced by hit regions if I saw a good example of a real-world web app where having the browser implement the hit test would actually make the code significantly simpler and cleaner. If there are some good examples like this, then this APi seems good and I definitely like that accessibility is made easier by this approach.

fr0 commented 5 years ago

Are there any real-world apps that you think could use this API if it existed?

There are a lot more uses for this than just accessibility. For example, if an app wants to show a tooltip based on which canvas "element" the user is hovering their mouse over, the current solution involves keeping track of the location of elements in JavaScript as they're added to the canvas, and then using that to do the hit testing. If the canvas already knows where rendered elements are, then the app could just query that via the hit regions API instead of maintaining its own list.

llgcode commented 4 years ago

For lines, we need to define a precision. So that a rectangle (the cursor position) hit a line, It could be fine to customize the rectangle size, I mean the "sensibility" of intersection detection.

travisleithead commented 3 years ago

I think I'd be more convinced by hit regions if I saw a good example of a real-world web app where having the browser implement the hit test would actually make the code significantly simpler and cleaner. If there are some good examples like this, then this APi seems good and I definitely like that accessibility is made easier by this approach.

My 2c: The Edge team recently reviewed this proposal with our Excel team to see if it might be interesting to them and something we could pick-up. While they were interested, the main reason for having this would be to increase performance of the existing JS code that tracks the hit testing regions (with code simplification and hygiene as side benefits). It wasn't abundantly clear to us that this API would be faster than a vanilla JS implementation. If there are use cases that highlight its benefit, we'd love to explore that in more depth. I believe the cases where this is in use today in Excel is for canvas-based tooltips, cursor feedback when hovering over links (and likely a few other things I've forgotten). Basically code running on each 'pointermove'.

andyearnshaw commented 2 years ago

For accessibility I think AOM (https://github.com/WICG/aom) is a more general-purpose solution - in particular the virtual node hierarchy proposed in phase 3. It generalizes to custom-drawn views that don't happen to use canvas at all - like the SVG-based layout used by Google Slides or Apple's iWork online.

This appears to be no longer the case. AOM phase 3 is no longer planning virtual accessibility nodes due to privacy concerns. The main concern is that they can be used to detect and profile users with accessibility needs. As far as I can tell, canvas hit regions would not expose accessibility tools any more than a standard HTML element could.

whatwg / html

Canvas hit regions proposal #3407