Exploration into the adoption of the EditContext API

The EditContext API is a new web API that decouples the edit state of an HTML DOM element (its text, selection, and composition state) from the rendering of that element. The EditContext has the following interface:

EditContext API Interface

```
dictionary EditContextInit {
    DOMString text;
    unsigned long selectionStart;
    unsigned long selectionEnd;
};

interface EditContext : EventTarget {
    constructor(optional EditContextInit options = {});

    undefined updateText(unsigned long rangeStart, unsigned long rangeEnd, DOMString text);
    undefined updateSelection(unsigned long start, unsigned long end);
    undefined updateControlBound(DOMRect controlBound);
    undefined updateSelectionBound(DOMRect selectionBound);
    undefined updateCharacterBounds(unsigned long rangeStart, sequence<DOMRect> characterBounds);

    sequence<Element> attachedElements();

    readonly attribute DOMString text;
    readonly attribute unsigned long selectionStart;
    readonly attribute unsigned long selectionEnd;
    readonly attribute unsigned long compositionRangeStart;
    readonly attribute unsigned long compositionRangeEnd;
    readonly attribute boolean isInComposition;
    readonly attribute DOMRect controlBound;
    readonly attribute DOMRect selectionBound;
    readonly attribute unsigned long characterBoundsRangeStart;
    sequence<DOMRect> characterBounds();

    attribute EventHandler ontextupdate;
    attribute EventHandler ontextformatupdate;
    attribute EventHandler oncharacterboundsupdate;
    attribute EventHandler oncompositionstart;
    attribute EventHandler oncompositionend;
};
```

The adoption of the EditContext API has the following benefits:

- It is likely to solve some of the IME issues we are currently seeing.
- It would allow more fine-grained control over the rendering of the editor text and therefore over the screen reader experience.

Some exploration has already been done in https://github.com/microsoft/vscode/pull/207699. This iteration we have continued that exploration, focusing on the following three sub-problems that arose in the first exploration:

Enabling Screen Reader Users with the EditContext

Currently, to expose text to screen readers, a hidden textarea containing the active line's text is positioned behind the visible text. This textarea is focused and read by the screen reader when the user navigates in the editor. An attempt was made to use the EditContext API alongside this textarea, but it soon turned out that the EditContext does not support textarea elements. We are therefore exploring other HTML elements for this hidden element that keep screen reader support. The following ideas have been considered:

1. Use an HTML element like <div> containing the text to be read

To enable IME support, the text in the hidden area contains the active line's text as well as the text of a couple of lines before and after it. When a user navigates up and down the editor, the text is updated to match the current edit state. If a <div> element directly contains this text, the screen reader reads the entire updated text on every ArrowUp and ArrowDown. We want to avoid that, so a few ideas have been considered:

- Instead of adding the text for several consecutive lines, add the text of the current active line only. In this case we would not have any problem with the update because the screen reader would read the full text, which is just the current line text. This solution is not ideal, because it would make IME completions less accurate.
- Tweak the aria attributes of the outer div so that on focus it always reads out the predetermined text of the current active line. An issue that is likely to arise, however, is that the outer div would be focused rather than the specific line that is read out, creating an inconsistency between the screen reader focus box and the text that is read aloud.
- Place each line into a separate HTML element, in the hope that with some additional work the screen reader would read only the current line's text on update. This approach is explored in the next section.
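As a rough illustration of the multi-line text discussed above, the sketch below builds the hidden-area text from the active line plus a few surrounding lines and records where the active line starts inside it. The names `buildHiddenAreaText` and `contextLines`, and the default window of two lines, are assumptions made for this example and are not taken from the exploration branch.

```typescript
// Hypothetical helper: names and the default window size are assumptions.
interface HiddenAreaText {
    text: string;              // active line plus surrounding context, joined by '\n'
    activeLineOffset: number;  // offset of the active line's first character in `text`
}

function buildHiddenAreaText(lines: string[], activeLineIndex: number, contextLines = 2): HiddenAreaText {
    // Clamp the window to the bounds of the document.
    const first = Math.max(0, activeLineIndex - contextLines);
    const last = Math.min(lines.length - 1, activeLineIndex + contextLines);
    const before = lines.slice(first, activeLineIndex);
    // Each preceding line contributes its length plus one newline character.
    const activeLineOffset = before.reduce((sum, line) => sum + line.length + 1, 0);
    return { text: lines.slice(first, last + 1).join('\n'), activeLineOffset };
}
```

With `contextLines` set to 0 this degenerates to the first idea in the list (active line only), which is the variant that would make IME completions less accurate.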

The aria-role chosen for the div is textbox, which mirrors the aria-role of the current textarea. Its attribute aria-multiline has been set to true. Setting the aria-role to textbox and aria-multiline = true does not seem to change how text in the div is read on update: the full text is still read. Other aria-roles could still be explored; perhaps there is a more suitable one.

There were some focus problems in the initial implementation. When navigating with the Up and Down arrows, the caret would move, the inner content of the hidden element would be updated, and focus would be lost and set to the whole VS Code window. It is not yet clear why this happens, but setting contenteditable = true on the div element fixes the issue.

2. Use a <div> element containing only the active text.

Initially I tried setting the textContent of the div directly to the current active line's text. When navigating from line to line, however, the screen reader would read out only the diff from one line to the next. So if two consecutive lines share the same initial text, the screen reader would not read the full second line when entering it. To mitigate this, the active line text was placed in a div nested inside the parent div. This way the screen reader reads the whole line when navigating onto it.

After this change, several other issues remained. The first is that when using the Left and Right arrow keys, the screen reader would read out the full line anew. To mitigate this, the code was changed so that the inner div is updated only when the line of the cursor selection changes. The second problem is that the Left and Right arrows would not move the screen reader's black focus border: at some point focus is lost from the hidden area and set to the full window. Initially I thought I would need to set contentEditable to true on both the parent div and the child div, but it was sufficient to set role: textbox on the parent div and the inner div.

3. Use a <div> element with <div> or <p> children containing the individual lines to be read

To address the issue of the full text being read on div update, some exploration has been done into using separate HTML elements for every single line. More specifically the following ideas have been considered:

- Place each line into a separate HTML element, make these elements focusable, and navigate between the tabbable lines. This solution is not ideal, as we would ideally want only the outer div to be focusable.
- Place each line into a separate HTML element and use the aria-activedescendant attribute on the parent element to specify which of its children is currently active.

The second idea has been explored. The MDN website says this about the aria-activedescendant attribute:

The aria-activedescendant property provides a method of managing focus for assistive technologies on interactive elements when they contain multiple focusable descendants, such as menus, grids, and toolbars. Instead of the screen reader moving focus between owned elements, aria-activedescendant can be used on container elements to refer to the currently active element, informing assistive technology users of the currently active element when focused.

The website mentions that the aria-activedescendant attribute should be set alongside the aria-controls attribute. Initial work successfully allows the user to navigate in the hidden text area as in the current implementation. In particular, the black focus box of VoiceOver surrounds the current active line on ArrowUp and ArrowDown and surrounds the individual characters on ArrowLeft and ArrowRight.

The focus problem remains, however. When the cursor moves Left and Right, it gets stuck because focus moves out of the hidden area. When the cursor moves onto an empty span, the full div is focused, not the empty child. Two solutions have been found:

- Give the parent and the children nodes the role textbox
- Give the parent and the children the attribute contenteditable=true
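The per-line structure with aria-activedescendant, combined with the first of the two fixes, could look roughly like the sketch below. It is written against a minimal `AttributeHost` type so that it runs outside a browser; a DOM element also has a compatible `setAttribute`. The helper name `markActiveLine` and the `hidden-line-<index>` id scheme are assumptions made for this example only.

```typescript
// Minimal structural type so the sketch runs outside a browser.
interface AttributeHost {
    setAttribute(name: string, value: string): void;
}

// Hypothetical helper: the function name and id scheme are assumptions.
function markActiveLine(parent: AttributeHost, lineNodes: AttributeHost[], activeIndex: number): string {
    parent.setAttribute('role', 'textbox');
    parent.setAttribute('aria-multiline', 'true');
    lineNodes.forEach((node, i) => {
        node.setAttribute('id', `hidden-line-${i}`);
        // First fix from the list above: children get the textbox role too.
        node.setAttribute('role', 'textbox');
    });
    // Point assistive technology at the child holding the active line.
    const activeId = `hidden-line-${activeIndex}`;
    parent.setAttribute('aria-activedescendant', activeId);
    return activeId;
}
```

The second fix would replace the `role` calls with `setAttribute('contenteditable', 'true')` on the same nodes.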

Paragraph (<p>) child elements have also been tested. Using them does not seem to affect what the screen reader reads.

Setting aria-multiline=true seems to make the screen reader read the insertion position.

TODO:

The screen reader reads the selection position, which it does not do in the current implementation. It also consistently reads "edit text" when moving to a new line.
- Maybe this would no longer be read if the element were not made contenteditable.

Enabling usage of `Enter` to add new line

As mentioned in https://github.com/w3c/edit-context/issues/94, the textupdate event is not fired when the Enter key is pressed. The specification (https://w3c.github.io/edit-context/#handle-input-for-editcontext) states:

The inputTypes handled by EditContext are those which operate only on raw text. Other inputTypes that depend on formats, clipboard/dragdrop, undo, or browser UI like spellcheck cannot be handled by EditContext since EditContext's state does not include these concepts. If an author wants their application to handle those inputTypes, they need to process them manually in a beforeinput event handler.

As such, to implement the usage of Enter, we need to listen to the beforeinput event as follows (taken from exploration branch):

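// Enter does not fire textupdate, so it is handled in beforeinput instead.
this._domElement.domNode.addEventListener('beforeinput', e => {
    if (e.inputType === 'insertParagraph' || e.inputType === 'insertLineBreak') {
        // Insert a newline at the current position (empty range: start === end).
        this._handleTextUpdate(this._editContextState.positionOffset, this._editContextState.positionOffset, '\n');
    }
});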
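The handler above delegates to `_handleTextUpdate`, whose body is not shown here. As a minimal sketch of what inserting '\n' at a collapsed position amounts to, assuming the text is held in a plain string, the two functions below are hypothetical stand-ins and not the branch's actual implementation.

```typescript
// Hypothetical stand-ins; not the branch's actual _handleTextUpdate.

// Maps the two line-insertion input types to the text they should insert.
function textForInputType(inputType: string): string | undefined {
    return inputType === 'insertParagraph' || inputType === 'insertLineBreak' ? '\n' : undefined;
}

// Replaces [rangeStart, rangeEnd) with `inserted` and returns the new text and caret offset.
function applyTextUpdate(text: string, rangeStart: number, rangeEnd: number, inserted: string): { text: string; caret: number } {
    const next = text.slice(0, rangeStart) + inserted + text.slice(rangeEnd);
    return { text: next, caret: rangeStart + inserted.length };
}
```

Calling `applyTextUpdate` with `rangeStart === rangeEnd` mirrors the call in the handler, where both offsets are `positionOffset`.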

Enabling copy/paste to add new line

As with Enter, copy/paste functionality needs to be implemented outside of the edit context. This was done by listening to the copy event of the hidden element and to the keydown event to detect a paste.
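The source does not say which keys the keydown listener checks. One plausible reading, sketched below under that assumption, is to treat Cmd+V on macOS and Ctrl+V elsewhere as a paste request. `KeyLike` and `isPasteKeystroke` are hypothetical names; a DOM KeyboardEvent exposes the same three properties.

```typescript
// Minimal structural type; a DOM KeyboardEvent has these properties.
interface KeyLike {
    key: string;
    ctrlKey: boolean;
    metaKey: boolean;
}

// Hypothetical helper: treats Cmd+V (macOS) or Ctrl+V (elsewhere) as a paste request.
function isPasteKeystroke(e: KeyLike, isMac: boolean): boolean {
    const modifier = isMac ? e.metaKey : e.ctrlKey;
    return modifier && e.key.toLowerCase() === 'v';
}
```

This only covers the keyboard shortcut; pastes triggered from a context menu or other shortcuts would need separate handling.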

cc @hediet

This comment covers the more in-depth exploration that followed the initial exploration.

The initial implementation used a div whose textContent was the content of the line under the cursor (the edit context contained additional lines above and below that line, used as context for the completions). The issue with this implementation is that as you navigate from line n to line n+1, the screen reader reads only the difference between the contents of those lines. Consider the following code:

this._register(this._editor.onDidChangeSelection());
this._register(this._editor.onDidChangeModel());

As the cursor moves down from the top line to the bottom line, the screen reader reads only Model()) because that is the difference between the two lines. To avoid this issue, it was decided to place the content inside of a div nested inside of the outer root div. In this manner, if we update the whole child div with the new text content, the screen reader reads the full line, not just the difference.
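The nested-div approach described above can be sketched as follows: instead of mutating the text of an existing node, a fresh child is created and swapped in on every line change. The types `LineNode` and `LineContainer` are minimal stand-ins so the example runs outside a browser, and `renderActiveLine` is a hypothetical name; in a browser the root would be the outer div and `createChild` would be something like `() => document.createElement('div')`.

```typescript
// Minimal structural types so the sketch runs outside a browser.
interface LineNode {
    textContent: string | null;
}
interface LineContainer<T extends LineNode> {
    replaceChildren(...nodes: T[]): void;
}

// Hypothetical helper: swaps in a freshly created child instead of editing text in place.
function renderActiveLine<T extends LineNode>(root: LineContainer<T>, createChild: () => T, lineText: string): T {
    const child = createChild();
    child.textContent = lineText;
    // Replacing the whole child is what the exploration found makes the full line be read.
    root.replaceChildren(child);
    return child;
}
```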

Initially there were some cursor navigation issues. If the cursor was moved down to another line with ArrowDown, focus would be lost, and the next time ArrowDown was used the cursor could no longer be moved down. This happens only when the screen reader is used; without the screen reader, navigation works as expected. Setting the role to textbox keeps the focus on the hidden area. Similarly, when using ArrowLeft and ArrowRight, focus would be lost after moving the cursor the first time, and it would no longer be possible to navigate left and right. Setting the inner div role to textbox fixed this issue. Another issue was that as you navigated from line to line, the black box (the screen reader cursor) was placed around the VS Code window, not around the actual hidden area. I still need to investigate why this happens. In any case, setting the attribute aria-activedescendant on the parent DOM node so that it points to the child div fixed this issue.

The implementation at that point had issues with multi-line selections. Since the inner DOM node was updated whenever the selection changed, the screen reader would reread all the text on the lines corresponding to the selection. In other words, the screen reader would read all of those lines from the very beginning, even content not covered by the selection. In the current implementation of the hidden area, when you first create a selection, the screen reader reads the content in the selection, and when you drag it further, the screen reader reads only the content that has been newly selected or unselected. To replicate this behavior, the rendering function does the following:

- Take the content in the selection that needs to be rendered. Split the content at the newlines to get an array of lines.
- If the current selection is a subselection of the previous selection (meaning the selection has shrunk by one line), remove the last or first child node of the root node (depending on which line has been removed from the selection).
- If the current selection is a superselection of the previous selection (meaning the selection has grown by one line), create a div with the new selection line content and add it at the beginning or the end of the root node (depending on whether the selection grew upwards or downwards).
- Otherwise, rerender the child nodes: clear the root node's children and, for each line of the content in the selection, create a div element with that line as textContent and append it to the root node.

This code essentially keeps the children nodes that already exist, removes those that are no longer needed if the selection decreases and renders new child nodes if the selection increases. With this implementation, the screen reader reads only the selected content, not the full content. This implementation has an issue in that in certain cases the unselected text is no longer read because the nodes that are not covered by the selection are removed before the screen reader can read the change. To mitigate this issue, the code has been changed so that these nodes are kept in the root dom node, and on the next selection change, they are removed.
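The decision logic of the rendering steps above can be sketched as a pure function over the line ranges of the previous and the new selection. `LineRange`, `RenderStep` and `planSelectionRender` are hypothetical names; the sketch only decides which step applies and does not model the deferred removal of unselected nodes mentioned in the previous paragraph.

```typescript
// Inclusive range of line numbers covered by a selection.
interface LineRange {
    startLine: number;
    endLine: number;
}

type RenderStep = 'none' | 'removeFirst' | 'removeLast' | 'prepend' | 'append' | 'rerender';

// Hypothetical helper mirroring the steps above; it only plans, it does not touch the DOM.
function planSelectionRender(prev: LineRange, next: LineRange): RenderStep {
    if (prev.startLine === next.startLine && prev.endLine === next.endLine) {
        return 'none'; // same lines covered, nothing to update
    }
    if (prev.startLine === next.startLine) {
        if (next.endLine === prev.endLine - 1) { return 'removeLast'; } // shrunk at the bottom
        if (next.endLine === prev.endLine + 1) { return 'append'; }     // grew downwards
    }
    if (prev.endLine === next.endLine) {
        if (next.startLine === prev.startLine + 1) { return 'removeFirst'; } // shrunk at the top
        if (next.startLine === prev.startLine - 1) { return 'prepend'; }     // grew upwards
    }
    return 'rerender'; // anything else: rebuild all child nodes
}
```

Keeping the plan separate from the DOM mutation makes it straightforward to check each case in isolation before wiring it to the hidden area.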

Some other points include:

- The code above is executed only when the selection start line number or the selection end line number changes. If the selection start or end column number changes but the start and end line numbers do not, we do not need to update the hidden area.
- Recall that the attribute aria-activedescendant was used to limit the screen reader cursor to the hidden element. When text is selected, we would like the screen reader cursor to surround the root node, not one of its child nodes. Hence, when the selection changes, the attributes are updated as follows:

if (selection.isEmpty()) {
    domNode.setAttribute('aria-activedescendant', `edit-context-content`);
    domNode.setAttribute('aria-controls', 'native-edit-context');
} else {
    domNode.removeAttribute('aria-activedescendant');
    domNode.removeAttribute('aria-controls');
}

- For the screen reader to read the right content, we need to set the DOM document selection on the hidden area so that it coincides with the editor selection.
- When the selection changes, the top and left position of the hidden area need to be updated so that the location of the hidden area content coincides with the location of the content in the editor.
- When the editor font size changes, the font size in the hidden area also needs to be changed.
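The last two points can be sketched as a small helper that copies position and font size onto the hidden area. `StyleHost` is a minimal stand-in for an element with a `style` object, and `positionHiddenArea` and its pixel parameters are assumptions for this example; how the values are obtained from the editor is not covered here.

```typescript
// Minimal structural type; a DOM element's `style` exposes these properties.
interface StyleHost {
    style: { top: string; left: string; fontSize: string };
}

// Hypothetical helper: keeps the hidden area aligned with the editor content.
function positionHiddenArea(node: StyleHost, topPx: number, leftPx: number, fontSizePx: number): void {
    node.style.top = `${topPx}px`;
    node.style.left = `${leftPx}px`;
    node.style.fontSize = `${fontSizePx}px`;
}
```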

issues

- The screen reader always reads insertion at <some position> ... edit text when you change the selection. This happens because the inner div has the role textbox and, since it is continuously rerendered, the screen reader always announces that we are in a textbox.
  - We could remove the textbox role on the inner div. In this case there are issues with the focusing behavior, as described earlier in this comment. Without the textbox role, the screen reader does not read the additional words insertion at ..., but there are focusing issues.
  - We could remove the attribute aria-activedescendant so that the inner div is not focused and hence the screen reader does not reread the words edit text. As written before, when doing this the black box of the screen reader has a strange focusing behavior, and furthermore the screen reader does not correctly read the content in the selection.
- In the current implementation, after selecting multiple lines, if you press the ArrowDown key, the cursor of the screen reader remains at the previous position, even though the editor cursor is on a new line and the hidden area has been updated. You need to press some more keys for the black box to be repositioned correctly on the hidden area.
- When the hidden area content contains only a newline character, the screen reader cursor for some reason surrounds the whole editor window, not just the hidden area. In the current implementation, the cursor disappears when on an empty line.
- In the current implementation, when the cursor surrounds a full line and the editor is scrolled to the right, the cursor changes so as to surround only the specific letter at its position. In my implementation, the cursor stays around the whole line.

TODO

Test accessibility with NVDA on Windows and see what the differences are between the current implementation and my implementation.

findings

With the new implementation, the role textbox may not be needed on the outer parent.

microsoft / vscode

Exploration into adoption of EditContext API in VS Code #222010
