microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License

Exploration into adoption of EditContext API in VS Code #222010

Closed aiday-mar closed 2 days ago

aiday-mar commented 1 month ago

Exploration into the adoption of the EditContext API

The EditContext API is a new API that decouples the edit state of an HTML DOM element, and the rendering of that state, from the DOM element itself. The EditContext has the following interface:

EditContext API Interface

```
dictionary EditContextInit {
    DOMString text;
    unsigned long selectionStart;
    unsigned long selectionEnd;
};

interface EditContext : EventTarget {
    constructor(optional EditContextInit options = {});

    undefined updateText(unsigned long rangeStart, unsigned long rangeEnd, DOMString text);
    undefined updateSelection(unsigned long start, unsigned long end);
    undefined updateControlBound(DOMRect controlBound);
    undefined updateSelectionBound(DOMRect selectionBound);
    undefined updateCharacterBounds(unsigned long rangeStart, sequence<DOMRect> characterBounds);

    sequence<Element> attachedElements();

    readonly attribute DOMString text;
    readonly attribute unsigned long selectionStart;
    readonly attribute unsigned long selectionEnd;
    readonly attribute unsigned long compositionRangeStart;
    readonly attribute unsigned long compositionRangeEnd;
    readonly attribute boolean isInComposition;
    readonly attribute DOMRect controlBound;
    readonly attribute DOMRect selectionBound;
    readonly attribute unsigned long characterBoundsRangeStart;
    sequence<DOMRect> characterBounds();

    attribute EventHandler ontextupdate;
    attribute EventHandler ontextformatupdate;
    attribute EventHandler oncharacterboundsupdate;
    attribute EventHandler oncompositionstart;
    attribute EventHandler oncompositionend;
};
```
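To make the `updateText` and `updateSelection` entries of the interface more concrete, here is a minimal in-memory sketch of the state they act on. `EditContextModel` is a hypothetical class written for illustration only; it is not the browser's `EditContext`, and it assumes `rangeStart <= rangeEnd` and ignores composition and bounds entirely.

```typescript
// Hypothetical stand-in for the text/selection state in the interface above.
// This is NOT the browser implementation, only an illustration of the state.
class EditContextModel {
    text: string;
    selectionStart: number;
    selectionEnd: number;

    constructor(init: { text?: string; selectionStart?: number; selectionEnd?: number } = {}) {
        this.text = init.text ?? '';
        this.selectionStart = init.selectionStart ?? 0;
        this.selectionEnd = init.selectionEnd ?? 0;
    }

    // Replace the characters in [rangeStart, rangeEnd) with newText.
    // Assumption of this sketch: rangeStart <= rangeEnd.
    updateText(rangeStart: number, rangeEnd: number, newText: string): void {
        this.text = this.text.slice(0, rangeStart) + newText + this.text.slice(rangeEnd);
    }

    // Record the new selection; a collapsed selection has start === end.
    updateSelection(start: number, end: number): void {
        this.selectionStart = start;
        this.selectionEnd = end;
    }
}

// Usage: replace "hello" with "howdy", then place the caret after it.
const sketchState = new EditContextModel({ text: 'hello world' });
sketchState.updateText(0, 5, 'howdy');
sketchState.updateSelection(5, 5);
```

In the real API the editor keeps its own model and calls these methods to keep the EditContext in sync; the sketch only shows what "in sync" means for text and selection.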

The adoption of the EditContext API has the following benefits:

Some exploration was previously done in https://github.com/microsoft/vscode/pull/207699. This iteration we continued that exploration, focusing on the following three sub-problems that arose in the first exploration:

Enabling Screen Reader Users with the EditContext

Currently, to expose text to screen readers, a hidden textarea containing the active line's text is positioned behind the visible text. This textarea is focused, and its text is read by the screen reader as the user navigates in the editor. We tried using the EditContext API alongside this textarea, but soon discovered that the EditContext does not support textarea elements. We are therefore exploring other HTML elements for this hidden element that would keep screen reader support. Here are some ideas that have been considered:

1. Use an HTML element like <div> containing the text to be read

To enable IME support, the text in the hidden area contains the active line's text as well as the text of a couple of lines before and after it. When a user navigates up and down the editor, the text is updated to match the current edit state. If a <div> element directly contains this text, the screen reader reads the whole updated text on every ArrowUp and ArrowDown command. We want to avoid that, so a few ideas have been considered:

The ARIA role chosen for the div is textbox, which mirrors the role of the current textarea. Its aria-multiline attribute has been set to true. Setting the role to textbox and aria-multiline to true does not seem to change how the text in the div is read on update: the full text is still read. Other ARIA roles could still be explored; perhaps there is a more suitable one.

There were some focus problems in the initial implementation. When navigating with the Up and Down arrows, the caret would move, the inner content of the hidden element would be updated, and focus would be lost and set to the whole VS Code window. It is not yet clear why this happens, but setting contenteditable = true on the div element has been observed to fix the issue.
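The windowing described at the start of this option (the active line plus a couple of lines around it) can be sketched as a small pure function. `getHiddenAreaLines` and its `radius` parameter are hypothetical names introduced here; the text above only says "a couple of lines", so the exact count is an assumption.

```typescript
// Hypothetical helper: returns the active line plus `radius` lines before and
// after it, clamped to the bounds of the document.
function getHiddenAreaLines(allLines: string[], activeLineIndex: number, radius: number): string[] {
    const start = Math.max(0, activeLineIndex - radius);
    // slice() excludes its end index, hence the + 1.
    const end = Math.min(allLines.length, activeLineIndex + radius + 1);
    return allLines.slice(start, end);
}
```

The result would be joined with newlines and written into the hidden element whenever the cursor moves to a different line, which is exactly the update that triggers the re-reading problem discussed above.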

2. Use a <div> element containing only the active text.

Initially I tried setting the textContent of the div directly to the active line's text. When navigating from line to line, however, the screen reader would read out only the diff from one line to the next. So if two consecutive lines share the same initial text, the screen reader would not read the full second line when moving onto it. To mitigate this, the active line's text was placed in a div nested inside the parent div. This way, the screen reader reads the whole line when navigating to it.

After this change, several other issues remained. The first is that when using the Left and Right arrow keys, the screen reader would read out the full line again. To mitigate this, the code was changed so that the inner div is updated only when the line of the cursor selection changes. The second problem is that the Left and Right arrows would not automatically shift the screen reader's black focus border: at some point focus is lost from the hidden area and set to the full window. Initially I thought I would need to set contentEditable to true on both the parent div and the child div, but it was sufficient to set role: textbox on the parent div and the inner div.
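The "replace the nested div, but only when the line changes" behavior can be sketched as follows. `ActiveLineRenderer`, `InnerNodeLike` and the two callbacks are hypothetical names, and the DOM is abstracted behind callbacks so that the sketch stays self-contained; it is not the code from the exploration branch.

```typescript
// Minimal structural type so the sketch does not depend on a real DOM.
interface InnerNodeLike {
    textContent: string | null;
}

// Hypothetical renderer for the nested-div approach.
class ActiveLineRenderer {
    private lastLineNumber: number | undefined = undefined;
    private readonly createInner: () => InnerNodeLike;
    private readonly replaceInner: (node: InnerNodeLike) => void;

    constructor(createInner: () => InnerNodeLike, replaceInner: (node: InnerNodeLike) => void) {
        this.createInner = createInner;
        this.replaceInner = replaceInner;
    }

    // Returns true when the inner node was replaced (and would be read again).
    render(lineNumber: number, lineText: string): boolean {
        if (lineNumber === this.lastLineNumber) {
            // ArrowLeft/ArrowRight within the same line: leave the DOM alone.
            return false;
        }
        // Swap in a brand-new child instead of editing the old one in place,
        // so the whole line is announced rather than a diff.
        const inner = this.createInner();
        inner.textContent = lineText;
        this.replaceInner(inner);
        this.lastLineNumber = lineNumber;
        return true;
    }
}
```

In a browser, `createInner` could be something like `() => document.createElement('div')` and `replaceInner` could swap the parent's only child. Note that this sketch skips the update even if the text of the current line changes, which a real implementation would also need to handle.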

3. Use a <div> element with <div> or <p> children containing the individual lines to be read

To address the issue of the full text being read on div update, some exploration has been done into using separate HTML elements for every single line. More specifically the following ideas have been considered:

The second idea has been explored. The MDN website says this about the aria-activedescendant attribute:

The aria-activedescendant property provides a method of managing focus for assistive technologies on interactive elements when they contain multiple focusable descendants, such as menus, grids, and toolbars. Instead of the screen reader moving focus between owned elements, aria-activedescendant can be used on container elements to refer to the currently active element, informing assistive technology users of the currently active element when focused.

The website mentions that the aria-activedescendant attribute should be set alongside the aria-controls attribute. Initial work successfully allows the user to navigate in the hidden text area as in the current implementation. In particular, the black focus box of VoiceOver surrounds the active line on ArrowUp and ArrowDown, and surrounds the individual characters on ArrowLeft and ArrowRight.

The focus problem, however, remains. When the mouse moves Left and Right, it gets stuck as focus is moved out. When the mouse moves over an empty span, the full div is focused, not the empty child. Two solutions have been found:

  1. Give the parent and the children nodes the role textbox
  2. Give the parent and the children the attribute contenteditable=true

Paragraph child elements have been tested. This does not seem to affect the screen reader text.

Setting aria-multiline=true seems to make the screen reader read the insertion position.
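A rough sketch of the per-line structure from this option, combining the role textbox fix with aria-activedescendant, might look like the following. `AttrNodeLike`, `renderLineChildren` and the `hidden-line-` id prefix are hypothetical; the node type is a structural stand-in so that the sketch does not need a browser.

```typescript
// Structural stand-in for the small part of the DOM element API used here.
interface AttrNodeLike {
    id: string;
    textContent: string | null;
    setAttribute(name: string, value: string): void;
    appendChild(child: AttrNodeLike): unknown;
}

// Hypothetical: renders one child per line and points the parent's
// aria-activedescendant at the child holding the active line.
function renderLineChildren(
    parent: AttrNodeLike,
    createChild: () => AttrNodeLike,
    lines: string[],
    activeIndex: number,
    idPrefix: string = 'hidden-line-'
): void {
    // Role textbox on both parent and children, as in solution 1 above.
    parent.setAttribute('role', 'textbox');
    parent.setAttribute('aria-multiline', 'true');
    lines.forEach((lineText, index) => {
        const child = createChild();
        child.id = idPrefix + index;
        child.textContent = lineText;
        child.setAttribute('role', 'textbox');
        parent.appendChild(child);
    });
    if (activeIndex >= 0 && activeIndex < lines.length) {
        parent.setAttribute('aria-activedescendant', idPrefix + activeIndex);
    }
}
```

This sketch always appends, so a caller would clear the parent before re-rendering; how screen readers react to each attribute is exactly what the exploration above is still testing.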

TODO:

Enabling usage of Enter to add new line

As mentioned in https://github.com/w3c/edit-context/issues/94, the textupdate event is not fired when the Enter key is pressed. More specifically, https://w3c.github.io/edit-context/#handle-input-for-editcontext states the following:

The inputTypes handled by EditContext are those which operate only on raw text. Other inputTypes that depend on formats, clipboard/dragdrop, undo, or browser UI like spellcheck cannot be handled by EditContext since EditContext's state does not include these concepts. If an author wants their application to handle those inputTypes, they need to process them manually in a beforeinput event handler.

As such, to implement the usage of Enter, we need to listen to the beforeinput event as follows (taken from exploration branch):

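this._domElement.domNode.addEventListener('beforeinput', e => {
    // Enter is reported as insertParagraph or insertLineBreak; neither fires textupdate.
    if (e.inputType === 'insertParagraph' || e.inputType === 'insertLineBreak') {
        // Start and end offsets are equal: insert '\n' at the current position without replacing text.
        this._handleTextUpdate(this._editContextState.positionOffset, this._editContextState.positionOffset, '\n');
    }
});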

Enabling copy/paste to add new line

As with the Enter case, copy/paste functionality needs to be implemented outside of the edit context. This was done by listening to the copy event on the hidden element, and to the keydown event to detect a paste.
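Detecting a paste from a keydown event could look roughly like the sketch below. `isPasteShortcut` and `KeyEventLike` are hypothetical names, and the choice of Cmd+V on macOS and Ctrl+V elsewhere is an assumption; the exploration branch may detect paste differently.

```typescript
// The subset of KeyboardEvent fields this sketch relies on.
interface KeyEventLike {
    key: string;
    ctrlKey: boolean;
    metaKey: boolean;
}

// Hypothetical check: treats Cmd+V on macOS and Ctrl+V elsewhere as a paste.
function isPasteShortcut(e: KeyEventLike, isMac: boolean): boolean {
    const modifierPressed = isMac ? e.metaKey : e.ctrlKey;
    return modifierPressed && e.key.toLowerCase() === 'v';
}
```

A keydown listener on the hidden element would call this check and, on a match, read the clipboard and insert the text through the same path used for typed input. Pastes triggered from a context menu do not produce a keydown, so a complete solution would need another hook as well.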

cc @hediet

hediet commented 1 month ago

This solution is not ideal, because it would make IME completions less accurate.

That is not a problem! IME completions only work on the edit context, not the DOM. Just multi-line selection could be problematic.

When you would navigate with the Up and Down arrows, the caret would move down, the hidden element inner content would be updated and focus would be lost

I think this might be because you have to restore selection after updating the text maybe?

aiday-mar commented 1 month ago

I'll respond in Slack

aiday-mar commented 1 month ago

This comment covers the more in-depth exploration that followed the initial exploration.

The initial implementation was a div whose textContent was the content of the line under the cursor (the edit context contained more lines above and below that line, which were used as context for the completions). The issue with this implementation is that when navigating from line n to line n+1, the screen reader would read only the difference between the contents of those lines. Take the following code:

this._register(this._editor.onDidChangeSelection());
this._register(this._editor.onDidChangeModel());

As the cursor moves down from the top line to the bottom line, the screen reader reads only Model()) because that is the difference between the two lines. To avoid this issue, it was decided to place the content inside of a div nested inside of the outer root div. In this manner, if we update the whole child div with the new text content, the screen reader reads the full line, not just the difference.
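The effect can be reproduced with a simple common-prefix comparison. How a screen reader actually computes the spoken difference is not specified in this thread, so the prefix-only diff below is an assumption, and `textAfterCommonPrefix` is a hypothetical helper.

```typescript
// Hypothetical: returns the part of `next` that follows the longest prefix
// it shares with `previous`.
function textAfterCommonPrefix(previous: string, next: string): string {
    let i = 0;
    while (i < previous.length && i < next.length && previous[i] === next[i]) {
        i++;
    }
    return next.slice(i);
}

// The two lines from the example above.
const topLine = 'this._register(this._editor.onDidChangeSelection());';
const bottomLine = 'this._register(this._editor.onDidChangeModel());';
const spokenPart = textAfterCommonPrefix(topLine, bottomLine);
```

For the two example lines, only the tail starting at `Model` survives the comparison, which matches the observation that the shared beginning of the line is never announced.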

Initially there were some cursor navigation issues. If the cursor was moved down to another line with ArrowDown, focus would be lost, and the next time ArrowDown was used the cursor could no longer be moved down. This happens only when the screen reader is used; without the screen reader, navigation works as expected. It has been noticed that setting the role to textbox keeps the focus on the hidden area. Similarly, when using ArrowLeft and ArrowRight, focus would be lost after moving the cursor the first time, and it would no longer be possible to navigate left and right. Setting the inner div role to textbox fixed this issue. Another issue was that when navigating from line to line, the black box (the screen reader cursor rendered by the screen reader) was placed around the VS Code window, not around the actual hidden area. I still need to investigate why this happens. In any case, setting the aria-activedescendant attribute on the parent DOM node so that it points to the child div fixed this issue.

The implementation at that point had some issues with multi-line selections. Since the inner DOM node was updated whenever the selection changed, the screen reader would reread all the text on the lines corresponding to the selection. In other words, the screen reader would read all of those lines from the very beginning, even the content not covered by the selection. In the current implementation of the hidden area, when you first create a selection, the screen reader reads the content in the selection, and when you drag it further, the screen reader reads the content that has been newly selected or unselected. In order to replicate this behavior the following is done in the rendering function:

This code essentially keeps the child nodes that already exist, removes those that are no longer needed if the selection shrinks, and renders new child nodes if the selection grows. With this implementation, the screen reader reads only the selected content, not the full content. This implementation has an issue: in certain cases the unselected text is no longer read, because the nodes that are no longer covered by the selection are removed before the screen reader can read the change. To mitigate this issue, the code has been changed so that these nodes are kept in the root DOM node and removed on the next selection change.
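The bookkeeping described above (keep existing nodes, add new ones, and delay removal of unselected nodes by one selection change) can be sketched on line numbers alone. `SelectionLineReconciler` is a hypothetical class written for illustration under those stated rules; it is not the rendering function from the exploration branch.

```typescript
// Hypothetical bookkeeping for the child nodes of the hidden root element.
class SelectionLineReconciler {
    private rendered: number[] = []; // lines covered by the previous selection
    private stale: number[] = [];    // unselected lines kept for one more cycle

    // Returns which line nodes to create, keep, and remove for a selection
    // spanning startLine..endLine (inclusive).
    update(startLine: number, endLine: number): { added: number[]; kept: number[]; removed: number[] } {
        const wanted: number[] = [];
        for (let line = startLine; line <= endLine; line++) {
            wanted.push(line);
        }
        const has = (list: number[], n: number): boolean => list.indexOf(n) !== -1;

        // Nodes currently in the DOM: rendered lines plus not-yet-removed stale ones.
        const inDom = this.rendered.concat(this.stale);
        const kept = inDom.filter((l) => has(wanted, l));
        const added = wanted.filter((l) => !has(inDom, l));
        // Stale nodes from the previous change are removed only now.
        const removed = this.stale.filter((l) => !has(wanted, l));

        // Lines that just left the selection stay in the DOM until the next update.
        this.stale = this.rendered.filter((l) => !has(wanted, l));
        this.rendered = wanted;
        return { added, kept, removed };
    }
}
```

For example, selecting lines 1 to 3 and then shrinking to lines 1 to 2 removes nothing right away; the node for line 3 is only reported in `removed` on the following selection change, which gives the screen reader time to announce the unselected text.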

Some other points include:

if (selection.isEmpty()) {
    domNode.setAttribute('aria-activedescendant', `edit-context-content`);
    domNode.setAttribute('aria-controls', 'native-edit-context');
} else {
    domNode.removeAttribute('aria-activedescendant');
    domNode.removeAttribute('aria-controls');
}

issues

TODO

findings

hediet commented 2 days ago

Duplicate of #207700