w3c / at-driver

AT Driver defines a protocol for introspection and remote control of assistive technology software, using a bidirectional communication channel.
https://w3c.github.io/at-driver

Status Report for ARIA-AT Automation #9

Closed · zcorpan closed this issue 8 months ago

zcorpan commented 2 years ago

Overview

The following diagram shows how the aria-at tests are run in automation in our current implementation:

aria-at test for an APG example → aria-at-harness runs an aria-at test with at-driver → at-driver simulates keypresses from the test instructions and records output → collected output is shown.

Introduction

Bocoup has been working with the ARIA-AT Community Group on interoperability testing for screen readers and browser combinations. Currently, the tests require that a human operator uses a computer to observe and interpret the behavior of the system under test. We have developed ARIA-AT App for running these tests manually.

Since it is impractical to run all tests manually on a regular basis, we are exploring ways to automate running these tests.

Vision

We envision a reliable infrastructure for automatically running all aria-at tests in all relevant screen reader / browser / OS combinations, and providing a public “dashboard”-style website similar to wpt.fyi or test262.report for viewing test results. Further, we envision AT vendors and browser vendors running the aria-at tests as part of their regression test suites, and contributing changes to the tests with minimal overhead, much as web-platform-tests works for the Chromium and Gecko browser engines today. The test results could also be embedded in various locations for a web developer audience, e.g. the APG design patterns and the support tables on relevant MDN pages.

Over time, we hope that screen readers and the OS-level and browser-level layers of the accessibility stack will align in their support for ARIA and the APG design patterns, so that web developers can build an accessible experience, test it in one screen reader, and encounter no surprises when testing in another.

Progress in 2021

To enable automated testing, we have implemented a screen-reader-agnostic prototype, aria-at-automation-driver, for Windows. The implementation simulates keypresses at the OS level, and uses a custom voice that captures the utterances from the screen reader as text data. We are researching the technical feasibility of a similar approach on macOS.
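Concretely, a test runner can drive the prototype along these lines. This is a minimal sketch only: the port, endpoint, and message shapes below are illustrative placeholders, not the prototype's actual protocol.

```ts
// Sketch of the capture loop, assuming (hypothetically) that the driver
// exposes a local WebSocket which accepts key-press commands and broadcasts
// the screen reader's utterances as text. All names here are illustrative.
import WebSocket from "ws";

const driver = new WebSocket("ws://localhost:4382"); // hypothetical endpoint

const utterances: string[] = [];

driver.on("message", (data) => {
  const message = JSON.parse(data.toString());
  if (message.type === "speech") {
    // The custom voice turned the screen reader's output into text.
    utterances.push(message.text);
  }
});

driver.on("open", () => {
  // Simulate the key press named by the test instructions.
  driver.send(JSON.stringify({ type: "pressKey", key: "ArrowDown" }));
});

// After the test settles, `utterances` holds what the screen reader spoke.
```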

We have also developed a new harness to enable running the aria-at tests using at-driver.

The automated test results will be available at https://bocoup.github.io/aria-at-automation-results-viewer/ (the URL may change). We expect to iterate on how to visualize the results and include more detailed information.

The approach we have implemented lets us start running tests without any changes to screen readers, but it has some limitations. In particular, it is not well suited to changing screen reader settings, and it has no insight into the screen reader's state, e.g. the virtual focus position or the current mode (interaction vs. reading). We propose addressing this by standardizing a WebDriver-like API; see the ARIA-AT Automation API Explainer.
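To make the shape of that proposal concrete, a session might look roughly like the sketch below. The method and event names are illustrative guesses, not the explainer's actual API, which is still being worked out.

```ts
// One possible shape for a WebDriver-like API: JSON commands with ids sent
// over a bidirectional channel, plus unsolicited events for state that the
// key-simulation approach cannot observe. All names below are illustrative.
import WebSocket from "ws";

const session = new WebSocket("ws://localhost:4382/session"); // hypothetical

session.on("open", () => {
  // Commands carry an id so responses can be matched, as in WebDriver BiDi.
  session.send(JSON.stringify({
    id: 1,
    method: "settings.setSettings", // e.g. force interaction mode
    params: { settings: [{ name: "mode", value: "interaction" }] },
  }));
});

session.on("message", (data) => {
  const message = JSON.parse(data.toString());
  // Messages without a matching command id are events reporting screen
  // reader state changes and captured output.
  if (message.method === "interaction.capturedOutput") {
    console.log("captured:", message.params.data);
  }
});
```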

Open issues/questions

Resources

eeejay commented 2 years ago

Hi, thanks for this proposal. It would be a great benefit to have a good picture of our interoperability across browsers, platforms, and ATs.

In 2020 we added VoiceOver support to Firefox. We wrote a couple of tools that helped us benchmark performance, automate precise keyboard interactions, and collect VoiceOver output from those interactions. This gave us a better understanding of what work we needed to get done.

We only worked on them as much as we needed in order to get the job done, so they aren't complete by a long shot. You can see them in this repository. The BenchmarkSynthesizer is a null-output voice that sends an HTTP request to localhost on each speak command, so you can build event-based automation. The VO Tester directory has the scripts for synthesizing input and listening for synthesis requests from the null-output voice.
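For illustration, the listening side of such a setup can be as small as a localhost HTTP server that records each utterance. The port and request shape here are made up for the sketch, not what our tools actually use.

```ts
// Tiny localhost HTTP server that records each utterance the null-output
// voice reports on a speak command. Port and payload shape are assumptions.
import http from "node:http";

const utterances: { text: string; at: number }[] = [];

http.createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    // Record what VoiceOver would have spoken, with a timestamp so the
    // latency between input and speech can be benchmarked.
    utterances.push({ text: body, at: Date.now() });
    res.end();
  });
}).listen(8765); // hypothetical port
```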