Add support for determining which elements are focusable and tabable

straker commented 7 years ago

Proposal

Add support for determining which elements are focusable and tabable. Probably the most convenient API for developers is through a query selector since we'll need to generate a list of elements. A tree walker node filter would also be acceptable.

Why?

For accessibility reasons this is desperately needed. When implementing an accessible modal it is recommended to auto focus the first focusable element when the modal is opened, as well as trap focus inside the modal.

When a modal dialog opens focus goes to the first focusable item in the dialog. Determining the first focusable item must take into account elements which receive focus by default (form fields and links) as well as items which may have a tabindex attribute with a positive value. If there is no focusable item in the dialog, focus is placed on the dialog container element.

Tab - Focus must be held within the dialog until it is cancelled or submitted. As the user presses tab to move within items in the dialog, pressing tab with focus on the last focusable item in the dialog will move focus back to the first focusable item in the dialog.
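
The wrap-around rule itself is simple once an ordered list of focusable items exists. Here is a minimal sketch, assuming `items` already holds the dialog's focusable elements in tab order (obtaining that list is exactly what this issue is about). `nextTrapTarget` is a hypothetical helper name, not an existing API.

```javascript
// Hypothetical helper: `items` is assumed to be the dialog's focusable
// elements, already sorted in tab order.
function nextTrapTarget(items, current, backwards) {
  if (items.length === 0) {
    // Nothing focusable: the caller should keep focus on the dialog container.
    return null;
  }
  const index = items.indexOf(current);
  if (index === -1) {
    // Focus is on the container (or elsewhere): enter at the matching end.
    return backwards ? items[items.length - 1] : items[0];
  }
  const step = backwards ? -1 : 1;
  // Modulo arithmetic wraps last -> first and first -> last.
  return items[(index + step + items.length) % items.length];
}
```

In a `keydown` handler for Tab, one would call `event.preventDefault()` and focus the element returned for `document.activeElement`, passing `event.shiftKey` as `backwards`.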

However, there is no way to determine what elements are focusable. The current "best" way to do this is using this answer from Stack Overflow, which tries to build up a query selector of known focusable elements (even Polymer uses this approach). However, there is a huge flaw to this approach.

(There's also allyjs, but downloading a 20 kB minified & gzipped library just to manage focus is a bit overkill. It also uses a known list of focusable elements, so in the end it's not any different than the Stack Overflow answer.)

The flaw is that with custom elements, a known list of focusable elements is no longer possible. Take for example this simple custom element.

<template id="search-element-template">
  <input type="search">
</template>

<script>
(function() {
  var doc = (document._currentScript || document.currentScript).ownerDocument;
  var template = doc.querySelector('#search-element-template');

  customElements.define('search-element', class extends HTMLElement {
    constructor() {
      super();

      this.attachShadow({
        mode: 'open'
      });
    }

    connectedCallback() {
      const temp = document.importNode(template.content, true);
      this.shadowRoot.appendChild(temp);
    }
  });
})();
</script>

With the input field, the custom element is now focusable (document.activeElement will return the search-element when focus is on the input). This means any custom element could be focusable, making it impossible to use a whitelist of known native elements to determine focusability and tabability.

If we can't reliably use a known list of selectors, the only other way to know what is focusable is to actually test every element in the DOM to see if it moves the document.activeElement.

let focusable = [];
let els = document.body.querySelectorAll('*');
for (let i = 0; i < els.length; i++) {
  els[i].focus();
  if (els[i] === document.activeElement) {
    focusable.push(els[i]);
  }
}

This is of course a terrible idea: it moves focus around the page as a side effect, and it gets slower the more elements your site has.

Regardless, custom elements again make this difficult since calling .focus() on them doesn't do anything. You could check whether the element has a shadowRoot and then traverse its DOM for focusable elements, but using attachShadow({mode: 'closed'}) makes that impossible. The only way for custom elements to show that they are focusable is to use the little-known delegatesFocus property.
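
To make the shadow root limitation concrete, here is a rough sketch of the traversal described above. `collectElements` is a hypothetical helper name. It only assumes that each node exposes `children` and `shadowRoot`, as DOM elements do.

```javascript
// Sketch: gather every element under `root`, descending into open shadow
// roots along the way. `collectElements` is a hypothetical helper.
function collectElements(root, out = []) {
  for (const child of root.children) {
    out.push(child);
    // `shadowRoot` is null for closed roots and for elements without a
    // shadow root, so closed trees are silently skipped here.
    if (child.shadowRoot) {
      collectElements(child.shadowRoot, out);
    }
    collectElements(child, out);
  }
  return out;
}
```

Anything inside a closed shadow root never shows up in the result, so a focusable input hidden in one stays invisible to this kind of scan.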

In the end, developers have no good way to make an accessible modal without the consumer of the modal marking all focusable elements (or at least the first and last focusable elements), and ensuring all focusable custom elements use the delegatesFocus property.

domenic commented 7 years ago

In general this seems pretty reasonable to me, although using a selector makes this a request for the CSSWG instead of the HTML Standard. But we can certainly have the discussion here.

@tabatkins, @TakayoshiKochi, what do you think?

TakayoshiKochi commented 7 years ago

This is an excellent problem statement. Probably we can find lots of issues to be solved starting from this.

One of the complications is that an element can be focusable but not tabable (i.e. tabindex=-1). Another is that even a normally focusable element may not be tabable (e.g. style="display:none"). Also, whether an element is tabable sometimes depends on platform convention (e.g. <a href=...> on Mac Safari).

If we had a fictitious pseudo-class :tabable that matched elements which are both focusable and tabable, finding the first element that should take focus from the NodeList returned by document.querySelectorAll(':tabable') might not be so easy. One reason is that tabindex can reorder the tabbing order, and something like flexbox can make the visual order of focusable elements quite complex.

One implementation concern is that Blink has (and maybe others have) an internal function to determine whether an element is focusable, but it depends on style/layout being complete, so exposing the function to the web may result in entangled dependencies.

straker commented 7 years ago

I think we might be able to simplify it a bit. If we did have a fictitious pseudo-class (or node filter, for that matter), it would always start at a root node and look for nodes under it that are focusable and tabable. Therefore, we can ignore all nodes outside of the root since we're only concerned with the root's descendants, even if they would theoretically come before the root in focus order.

I would expect the returned NodeList to be in focus order, so that grabbing the first/last index from the list would give the first and last focusable and tabable elements under the root. So visual order / true DOM order isn't important (though the DOM order does determine focus order). I wonder if for this reason a query selector isn't the best interface, since query selectors usually return elements in DOM order.

So if this were my DOM:

<dialog>
  <button>Close</button>
  <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit</p>
  <input type="hidden" value="Secret">
  <button tabindex="1">Send</button>
  <button style="float: left">Cancel</button>
</dialog>

<main>
  <button tabindex="10"></button>
</main>

Then performing the querySelector on dialog would result in:

[<button tabindex="1">Send</button>, <button>Close</button>, <button style="float: left">Cancel</button>]

Essentially it would require walking the DOM, checking whether each element is tabable (each UA could use its own heuristic) and focusable (again, each UA's own heuristic), then inserting the element into the returned list in focus order (taking into account tabindex > 0).
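
The ordering step on its own could look something like the sketch below. It assumes `candidates` is already the list of focusable and tabable elements under the root in DOM order (the part only the UA can really decide). `sortInFocusOrder` is a hypothetical name.

```javascript
// Sketch of the ordering step only. `candidates` is assumed to be the
// focusable, tabable elements under the root, in DOM order.
function sortInFocusOrder(candidates) {
  const positive = candidates.filter((el) => el.tabIndex > 0);
  const natural = candidates.filter((el) => el.tabIndex === 0);
  // Array.prototype.sort is stable, so elements that share a tabindex
  // value keep their DOM order.
  positive.sort((a, b) => a.tabIndex - b.tabIndex);
  // Positive tabindex values come first, then everything else in DOM order.
  return positive.concat(natural);
}
```

For the dialog above, the candidates in DOM order are Close (tabIndex 0), Send (tabIndex 1), and Cancel (tabIndex 0), and the sketch yields Send, Close, Cancel.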

domenic commented 7 years ago

I wonder if for this reason a query selector isn't the best interface since query selectors usually return elements in DOM order.

Yes, we definitely would not use querySelector if you had specific ordering requirements; we'd need a new API. But as @TakayoshiKochi says, determining that order is very expensive. Probably better would be to emulate similar primitives to what browsers have, i.e. getNextFocusableArea/getPreviousFocusableArea.

Of course there's another problem where the focusable areas are not elements or even nodes. I'm not sure how you'd want to handle that.

straker commented 7 years ago

Anything that moves the needle closer to getting this implemented would be better than trying to get it perfect and ending up with something too difficult to implement. If the easiest thing would be to just return a list of all focusable and tabable elements not in tab order, I'd be ok with that.

I'm not sure what you mean by that last part.

TakayoshiKochi commented 7 years ago

Blink's current implementation does not do much to optimize detecting which elements are focusable, so the cost of returning a list of focusable elements under a root is basically O(N), where N is the number of elements under the root, and each check is relatively costly. If you require fancier ordering (tabindex, or visual), it adds some (much?) extra cost. Theoretically we could cache the result for repeated API calls, at the cost of maintaining the cache through DOM mutations, implementing lazy evaluation, etc. So returning a list of focusable nodes (which is nearly equivalent to implementing a pseudo-class that matches focusable elements) is not as easy as it sounds.

However, all browsers at least implement (internal) functions to determine which element to focus next/previous when Tab/Shift+Tab is pressed, so exposing them as getNextFocusableArea/getPreviousFocusableArea could be one way to get the whole list of focusable areas (you can repeatedly call the API from the root node until getting back to the root), without much implementation cost.
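
As a rough sketch of that loop: no browser exposes getNextFocusableArea, so the stepping function is passed in as a parameter here, and `collectFocusableAreas` and `isInside` are hypothetical names. It also assumes every focusable area comes back as a distinct object.

```javascript
// `getNext` stands in for the proposed getNextFocusableArea (not a real
// browser API). `isInside` is assumed to report whether an area still
// belongs to the root, e.g. via root.contains(area).
function collectFocusableAreas(root, getNext, isInside) {
  const areas = [];
  let current = getNext(root);
  // Stop when there is no next area, when focus would leave the root,
  // or when the walk wraps around to the first area again.
  while (current && isInside(current) && current !== areas[0]) {
    areas.push(current);
    current = getNext(current);
  }
  return areas;
}
```

The result is in whatever order the stepping function walks, so if it follows the UA's Tab order, the first and last entries are the first and last tabable areas under the root.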

The last part of domenic's comment (https://github.com/whatwg/html/issues/2071#issuecomment-262424911) meant that "a focusable area" may not be a single element; e.g., <video> has several focusable areas within one element if controls are enabled. In this case, <video>.getNextFocusableArea() might not return what users would expect.

straker commented 7 years ago

Actually, if getNextFocusableArea/getPreviousFocusableArea were exposed to the client, then just those alone would fulfill the requirements for accessibility. For focusing the first element in the modal, all we would need to do is set focus to the modal using <dialog>.focus() and then call the getNextFocusableArea method, which would tell us the first focusable element in the modal. We could also call the same function in a loop to determine when the focus leaves the modal, or even call it every time tab is pressed to know when to put focus back to the first element.

For focus areas that are not nodes/elements (such as the video element with controls), would it be easy to always return the node/element? document.activeElement returns the video element even when the focus area is inside the controls. Would it be possible to do the same for the getNextFocusableArea/getPreviousFocusableArea API? It would just mean that the same node would be returned multiple times as focus moves from area to area.

domenic commented 7 years ago

If the use case here is for modal elements and tab-wrapping, I think #897 is a proposal more fit to the use case.

straker commented 7 years ago

The primary use case was for auto-focusing the first focusable element in the dialog. Tab-wrapping was another use case, but #897 does fit better with that one. Maybe the two are related, though, as declaring a blocking element could use getNextFocusableArea/getPreviousFocusableArea as the primitive API, which would now be exposed to the client.

Another use case would be auto-closing a navigation menu when the user tabs off the final focusable element, such as in Heydon Pickering's ARIA submenu example. If you were trying to build a library for that behavior, having an API to know which elements are focusable would be very helpful.

robdodson commented 7 years ago

The primary use case was for auto-focusing the first focusable element in the dialog.

Just curious if the autofocus attribute could be used for this? Maybe if something gets added to the top layer (#897) it looks for and attempts to focus any child with autofocus?

alice commented 7 years ago

Tabbable I believe is possible today with TreeWalker (modulo Shadow DOM):

var treeWalker = document.createTreeWalker(
  document.body,
  NodeFilter.SHOW_ELEMENT,
  {
    acceptNode: function(node) {
      // A non-negative tabIndex marks the element as tabbable.
      return node.tabIndex >= 0 ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP;
    }
  },
  false
);
// First tabbable element under document.body, in DOM order.
var nextFocusableNode = treeWalker.nextNode();

Note that the tabIndex property does not directly access the tabindex attribute; instead it gives you the computed "tab index" which takes implicit focusability into account. (Also: this doesn't seem to take inert-ness [currently only an issue with modal <dialog> and Chrome-only] into account, but that may be a bug in Chrome.)

Programmatic focusability is a whole other issue, though. tabIndex being less than zero means either the element is unfocusable, or that it is focusable but not tabbable, with no way to distinguish between those cases.
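
One way to tell those two cases apart today is to probe the element, with all the drawbacks of actually moving focus (focus and blur events fire). A minimal sketch, where `isProgrammaticallyFocusable` is a hypothetical helper and `doc` is assumed to be the element's document:

```javascript
// Hypothetical probe: tries to focus `el` and reports whether it became
// the active element. `doc` is assumed to be the owning document.
function isProgrammaticallyFocusable(el, doc) {
  const previous = doc.activeElement;
  el.focus();
  const focused = doc.activeElement === el;
  // Put focus back where it was so the probe is less disruptive.
  if (focused && previous && previous !== el && typeof previous.focus === 'function') {
    previous.focus();
  }
  return focused;
}
```

An element with tabIndex below zero that passes this probe is focusable but not tabbable, and one that fails it is not focusable at all.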

straker commented 7 years ago

With #1929 changing the modal requirement to focus the dialog instead of the first focusable element (or using autofocus to manually focus an element), and #897 taking care of trapping tab focus, I believe this issue's primary use cases have been resolved. I'm going to close this issue. Thanks for the great discussion.

pheki commented 3 months ago

In my case, I want to detect whether a <button> element is focusable so I can apply a workaround, as Safari buttons are not focusable, not even with tabindex. Unfortunately, it seems the only way to do that is user agent detection.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/button#clicking_and_focus
https://bugs.webkit.org/show_bug.cgi?id=22261
https://stackoverflow.com/questions/42758815/safari-focus-event-doesnt-work-on-button-element

whatwg / html

Add support for determining which elements are focusable and tabable #2071
