w3c / aria

Accessible Rich Internet Applications (WAI-ARIA)
https://w3c.github.io/aria/

New role for audio transcripts #2164

Open LaZay opened 2 months ago

LaZay commented 2 months ago

Description of bug or feature request

For accessible video, we can qualify tracks as subtitles, captions, or descriptions thanks to the kind attribute. For audio-only content (i.e. outside of video), there is no such equivalent. So far, an audio transcript is simply associated with its audio through a div element that groups the two. Transcripts can be modeled in different ways with HTML elements (a, details, div, p) that can also carry information other than transcripts (e.g. download, sourcing). As a consequence, there is no way to distinguish an audio transcript from other information.

Would it be possible to add the following to the HTML elements a, details, div, and p:

This would provide the means to qualify as such every HTML content provided by the authors to make audio/video accessible (even in an unsynchronized mode):

Will this require a change to CORE-AAM?

If unknown, leave blank. If relevant, link bug.

Will this require a change to the ARIA authoring guide?

If unknown, leave blank. If relevant, link bug.

spectranaut commented 2 months ago

Discussed briefly during new issue triage: https://www.w3.org/2024/04/25-aria-minutes.html#t01

aardrian commented 2 months ago

Both <video> and <audio> (media elements in HTML5) support the <track> element. As you know, the kind attribute allows for captions, subtitles, and audio description but not transcripts.
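To make the point about kind concrete, here is a minimal sketch that checks a value against the keywords the HTML specification defines for the attribute (subtitles, captions, descriptions, chapters, metadata). The names `TRACK_KINDS` and `isTrackKind` are hypothetical helpers for illustration, not part of any browser API.

```typescript
// The five keywords the HTML specification defines for <track kind="...">.
const TRACK_KINDS = ["subtitles", "captions", "descriptions", "chapters", "metadata"] as const;
type TrackKind = (typeof TRACK_KINDS)[number];

// Hypothetical helper: true when `value` is one of the defined kind keywords.
function isTrackKind(value: string): value is TrackKind {
  return TRACK_KINDS.some((kind) => kind === value);
}

console.log(isTrackKind("captions"));   // true
console.log(isTrackKind("transcript")); // false
```

There is no keyword for a transcript, so a transcript cannot be expressed through kind.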

<track> points to a file in a specified format (such as WebVTT), but that content is not part of the page itself. It is there for a media player to present to the user. As such, the element is not generally exposed directly to the user.

A transcript can live within the same page as the video or elsewhere. For the scope of the web, a transcript has no technical spec behind it (except that of the host language).

None of <video>, <audio>, or <track> has a corresponding ARIA role, though the first two have a computed role from the browser.

I am covering all this to set the baseline representation in standards today. Which then brings me to some questions to try to understand the request...

LaZay commented 2 months ago

CONTEXT

1) We need to produce born-accessible publications for learners (school, college, university) without usage conflicts between user profiles (non-disabled, visually impaired, and hearing-impaired persons). This requires "hiding" from users the content dedicated to a disability they do not have (e.g. noise or, even worse, the answer to the question). Otherwise, for transcripts typically, there is a usage conflict between hearing-impaired learners and other learners.

2) E-learning content is nowadays always based on web technologies. It can be packaged and is distributed in a true LMS, a web application, or EPUB (more rarely today, though). Reading systems and ATs must be able to expose (or not) transcripts depending on the user profile/preferences, without using proprietary extensions. For the sake of ROI and interoperability, we need to stick to standards (HTML, ARIA, ...) as much as possible.

3) As transcripts can be implemented in different ways, we need a transverse semantic attribute to identify them. Such an attribute is much more powerful than an element title or an ARIA name: software can act on it to skip transcripts automatically according to user profiles/preferences (blind users do not want to be disturbed and delayed by content dedicated to deaf users).
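The kind of automatic skipping described in point 3 could be sketched roughly as follows. Everything here is an assumption made for illustration: the `purpose` field stands in for whatever semantic marker would identify a transcript, and `UserProfile`, `ContentBlock`, and `visibleBlocks` are hypothetical names, not an existing API.

```typescript
// Hypothetical user preferences; the field name is an assumption.
interface UserProfile {
  wantsTranscripts: boolean;
}

// Hypothetical model of one piece of content grouped with an audio clip.
// `purpose` stands in for a semantic marker that identifies a transcript.
interface ContentBlock {
  element: "a" | "details" | "div" | "p";
  purpose?: "transcript";
  text: string;
}

// Keep every block except transcripts the user has opted out of.
function visibleBlocks(blocks: ContentBlock[], profile: UserProfile): ContentBlock[] {
  return blocks.filter(
    (block) => block.purpose !== "transcript" || profile.wantsTranscripts
  );
}

const blocks: ContentBlock[] = [
  { element: "details", purpose: "transcript", text: "Transcript of the audio clip" },
  { element: "details", text: "Source and credits" },
  { element: "a", text: "Download the audio file" },
];

console.log(visibleBlocks(blocks, { wantsTranscripts: false }).length); // 2
console.log(visibleBlocks(blocks, { wantsTranscripts: true }).length);  // 3
```

The two details blocks use the same element, so only the marker lets software tell the transcript apart from the sourcing information.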

CLARIFICATION Please find my answers to your questions:

Hope it is clearer.