w3c / at-driver

AT Driver defines a protocol for introspection and remote control of assistive technology software, using a bidirectional communication channel.
https://w3c.github.io/at-driver

Status Report for ARIA-AT Automation #9

Closed · zcorpan closed this issue 8 months ago

zcorpan commented 2 years ago

Overview

The following diagram shows how the aria-at tests are run in automation in our current implementation:

aria-at test for an APG example → aria-at-harness runs an aria-at test with at-driver → at-driver simulates keypresses from the test instructions and records output → collected output is shown.

Introduction

Bocoup has been working with the ARIA-AT Community Group on interoperability testing for screen readers and browser combinations. Currently, the tests require that a human operator uses a computer to observe and interpret the behavior of the system under test. We have developed ARIA-AT App for running these tests manually.

Since it is impractical to run all tests manually on a regular basis, we are exploring ways to automate running these tests.

Vision

We envision a reliable infrastructure for automatically running all aria-at tests in all relevant screen reader / browser / OS combinations, and providing a public “dashboard”-style website similar to wpt.fyi or test262.report for viewing test results. Further, we envision AT vendors and browser vendors running the aria-at tests as part of their regression test suites, and contributing changes to the tests with minimal overhead, much as web-platform-tests works for the Chromium and Gecko browser engines today. The test results could also be embedded in various locations for a web developer audience, e.g. the APG design patterns and the support tables on relevant MDN pages.

Over time, we hope that screen readers and the OS-level and browser-level layers of the accessibility stack will align in their support for ARIA and the APG design patterns, so that web developers can build an accessible experience, test it in one screen reader, and encounter no surprises when testing in another.

Progress in 2021

To enable automated testing, we have implemented a screen-reader-agnostic prototype, aria-at-automation-driver, for Windows. The implementation simulates keypresses at the OS level, and uses a custom voice that captures the utterances from the screen reader as text data. We are researching the technical feasibility of a similar approach on macOS.
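Concretely, a test runner can drive the prototype along these lines. This is a minimal sketch only: the port, endpoint, and message shapes below are illustrative placeholders, not the prototype's actual protocol.

```ts
// Sketch of the capture loop, assuming (hypothetically) that the driver
// exposes a local WebSocket which accepts key-press commands and broadcasts
// the screen reader's utterances as text. All names here are illustrative.
import WebSocket from "ws";

const driver = new WebSocket("ws://localhost:4382"); // hypothetical endpoint

const utterances: string[] = [];

driver.on("message", (data) => {
  const message = JSON.parse(data.toString());
  if (message.type === "speech") {
    // The custom voice turned the screen reader's output into text.
    utterances.push(message.text);
  }
});

driver.on("open", () => {
  // Simulate the key press named by the test instructions.
  driver.send(JSON.stringify({ type: "pressKey", key: "ArrowDown" }));
});

// After the test settles, `utterances` holds what the screen reader spoke.
```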

We have also developed a new harness to enable running the aria-at tests using at-driver.

The automated test results will be available at https://bocoup.github.io/aria-at-automation-results-viewer/ (the URL may change). We expect to iterate on how to visualize the results and include more detailed information.

The approach we have implemented lets us start running tests without any changes to screen readers, but it has some limitations. In particular, it is not well suited to changing screen reader settings, and it has no insight into the screen reader's state, e.g. the virtual focus position or the current mode (interaction vs. reading). We propose addressing this by standardizing a WebDriver-like API; see the ARIA-AT Automation API Explainer.
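To make the shape of that proposal concrete, a session might look roughly like the sketch below. The method and event names are illustrative guesses, not the explainer's actual API, which is still being worked out.

```ts
// One possible shape for a WebDriver-like API: JSON commands with ids sent
// over a bidirectional channel, plus unsolicited events for state that the
// key-simulation approach cannot observe. All names below are illustrative.
import WebSocket from "ws";

const session = new WebSocket("ws://localhost:4382/session"); // hypothetical

session.on("open", () => {
  // Commands carry an id so responses can be matched, as in WebDriver BiDi.
  session.send(JSON.stringify({
    id: 1,
    method: "settings.setSettings", // e.g. force interaction mode
    params: { settings: [{ name: "mode", value: "interaction" }] },
  }));
});

session.on("message", (data) => {
  const message = JSON.parse(data.toString());
  // Messages without a matching command id are events reporting screen
  // reader state changes and captured output.
  if (message.method === "interaction.capturedOutput") {
    console.log("captured:", message.params.data);
  }
});
```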

Open issues/questions

Resources

eeejay commented 2 years ago

Hi, thanks for this proposal. It would be a great benefit to have a good picture of our interoperability across browsers, platforms, and ATs.

In 2020 we added VoiceOver support to Firefox. We wrote a couple of tools that helped us benchmark performance, automate precise keyboard interactions, and collect VoiceOver output from those interactions. This gave us a better understanding of what work we needed to get done.

We only worked on them as much as we needed in order to get the job done, so they aren't complete by a long shot. You can see them in this repository. The BenchmarkSynthesizer is a null-output voice that sends an HTTP request to localhost on each speak command, so you can build event-based automation. The VO Tester directory has the scripts for synthesizing input and listening for synthesis requests from the null-output voice.
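For illustration, the listening side of such a setup can be as small as a localhost HTTP server that records each utterance. The port and request shape here are made up for the sketch, not what our tools actually use.

```ts
// Tiny localhost HTTP server that records each utterance the null-output
// voice reports on a speak command. Port and payload shape are assumptions.
import http from "node:http";

const utterances: { text: string; at: number }[] = [];

http.createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    // Record what VoiceOver would have spoken, with a timestamp so the
    // latency between input and speech can be benchmarked.
    utterances.push({ text: body, at: Date.now() });
    res.end();
  });
}).listen(8765); // hypothetical port
```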