LaZay commented 2 months ago

Description of bug or feature request

For accessible video, we can qualify some tracks as subtitles/captions/descriptions, thanks to the kind attribute. For accessible audio only (i.e. outside of videos), there is no such equivalent. So far, audio transcripts are simply associated with their audio through a div element that groups the audio and its transcript. Transcripts can be modeled in different ways through HTML elements (a, details, div, p) that can also be used for information other than transcripts (e.g. download, sourcing, etc.). As a consequence, there is no way to distinguish an audio transcript from other information.

Would it be possible to add one of the following to the HTML elements a, details, div, p:

either new ARIA role values = subtitles / captions / descriptions;
or a new attribute kind.

This would provide the means to qualify as such every HTML content provided by the authors to make audio/video accessible (even in an unsynchronized mode):

Audio transcripts for deaf persons could be typed as captions (i.e. role/kind=captions).
Audio transcripts for everybody could be typed as subtitles (i.e. role/kind=subtitles).
Unsynchronized video subtitles for everybody could be typed as such (i.e. role/kind=subtitles).
Unsynchronized video captions for deaf persons could be typed as such (i.e. role/kind=captions).
Unsynchronized video descriptions could be typed as such (i.e. role/kind=descriptions).
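
To make the request more concrete, here is a minimal sketch in Python of how a tool might detect content typed this way. It assumes the attribute variant of the proposal: a `kind` attribute on `a`, `details`, `div`, or `p` is hypothetical and is not valid HTML today. The sample markup, the file names, and the `TypedContentFinder` class are illustrative only.

```python
from html.parser import HTMLParser

# Hypothetical markup: `kind` on non-<track> elements does NOT exist in HTML
# today; it is the attribute proposed in this issue.
SAMPLE_PAGE = """
<div>
  <audio controls src="lesson1.mp3"></audio>
  <details kind="captions"><summary>Transcript</summary><p>Hello class.</p></details>
  <p><a href="lesson1.mp3">Download the audio</a></p>
</div>
"""

PROPOSED_KINDS = {"subtitles", "captions", "descriptions"}
CANDIDATE_TAGS = {"a", "details", "div", "p"}


class TypedContentFinder(HTMLParser):
    """Collects (tag, kind) pairs for elements carrying the proposed attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        kind = dict(attrs).get("kind")
        if tag in CANDIDATE_TAGS and kind in PROPOSED_KINDS:
            self.found.append((tag, kind))


finder = TypedContentFinder()
finder.feed(SAMPLE_PAGE)
print(finder.found)  # [('details', 'captions')]
```

The download link and the grouping div are left untyped, so only the transcript is reported. That is the distinction the request is after.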

Will this require a change to CORE-AAM?

If unknown, leave blank. If relevant, link bug.

Will this require a change to the ARIA authoring guide?

If unknown, leave blank. If relevant, link bug.

spectranaut commented 2 months ago

Discussed briefly during new issue triage: https://www.w3.org/2024/04/25-aria-minutes.html#t01

aardrian commented 2 months ago

Both <video> and <audio> (media elements in HTML5) support the <track> element. As you know, the kind attribute allows for captions, subtitles, and audio description but not transcripts.
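
As a reference point for what exists today, the sketch below lists the `kind` of each `<track>` in a snippet of markup, using only the Python standard library. The file names and the `TrackLister` class are made up for illustration. The fallback to `subtitles` reflects the HTML standard's missing-value default for `kind`.

```python
from html.parser import HTMLParser

# Illustrative markup; the media and WebVTT file names are invented.
SAMPLE_VIDEO = """
<video controls src="lecture.mp4">
  <track kind="captions" src="lecture.en.vtt" srclang="en" label="English captions">
  <track kind="descriptions" src="lecture.desc.vtt" srclang="en">
  <track src="lecture.fr.vtt" srclang="fr">
</video>
"""


class TrackLister(HTMLParser):
    """Collects (kind, src) for every <track> start tag."""

    def __init__(self):
        super().__init__()
        self.tracks = []

    def handle_starttag(self, tag, attrs):
        if tag == "track":
            attributes = dict(attrs)
            # A missing kind attribute defaults to "subtitles" in HTML.
            self.tracks.append((attributes.get("kind", "subtitles"), attributes.get("src")))


lister = TrackLister()
lister.feed(SAMPLE_VIDEO)
for kind, src in lister.tracks:
    print(kind, src)
```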

<track> points to a file in a specified format (such as WebVTT), but that content is not part of the page itself. It is there for a media player to present to the user. As such, the element is not generally exposed directly to the user.

A transcript can live within the same page as the video or elsewhere. For the scope of the web, a transcript has no technical spec behind it (except that of the host language).

None of <video>, <audio>, or <track> has a corresponding ARIA role, though the first two have a computed role from the browser.

I am covering all this to set the baseline representation in standards today. Which then brings me to some questions to try to understand the request...

Besides providing a description via its role, how do you envision it being exposed to users?
What benefits would it offer over using only a heading or being the destination of a direct link from a video caption?
What would it offer that differs from an <article> with an accessible name (thereby functioning as a named region / landmark)?
There are cases where media players allow individual lines in transcript-like presentations to jump the media to that timestamp (Apple, YouTube, assorted embeddable players). Are you thinking of something to enable or contain that natively?
Is there a reason your proposal could not be satisfied with an HTML element?
Your issue references EPUB. Is your request specific to EPUB or is it a broader request for the web platform?

LaZay commented 2 months ago

CONTEXT
1) We need to produce born-accessible publications for learners (school, college, university) without use conflicts between user profiles (non-disabled, visually impaired, hearing-impaired persons). This requires "hiding" from users the content dedicated to a disability they do not have (for them it is noise or, even worse, the answer to the question). Otherwise, for transcripts typically, we have a use conflict between hearing-impaired learners and other learners.
2) E-learning content is nowadays always based on web technologies. It can be packaged, and is distributed in a true LMS, a web application, or EPUB (more rarely today, though). The reading systems and the ATs must be able to expose (or not) transcripts depending on the user profile/preferences, without using proprietary extensions. For the sake of ROI and interoperability, we need to stick to standards (HTML/ARIA/...) as much as possible.
3) As transcripts can be implemented in different ways, we need a transverse semantic attribute to identify them. Such an attribute is much more powerful than an element title or an ARIA name: software can use it to skip content automatically according to user profiles/preferences (blind users do not want to be disturbed and delayed by content dedicated to deaf users).

CLARIFICATION Please find my answers to your questions:

In the long term, LMS / web apps / EPUB reading systems / ATs could decide to automatically expose (or not) transcripts according to user profiles/preferences (when this software supports such parameters).
The benefit is not in the implementation of the transcript, which can remain free.
It is a good idea to group the media and its transcript (better than a div indeed), but the article element is not dedicated to this use and has poor semantics.
The request does not concern synchronized transcripts/subtitles in a media player (audio player + video player). It only concerns unsynchronized transcripts (mostly audio players, rarely video players because of the WCAG constraint that accessible video captions must be synchronized).
Block titles are optional. ARIA names are optional. Their values are free, which means that they cannot be easily processed by machines.
The request is not restricted to the EPUB format. It also addresses every content structured in HTML5.
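
To illustrate the kind of automatic skipping described in the answers above, here is a rough sketch in Python. Everything specific in it is an assumption: the `kind` attribute on `details` is the hypothetical attribute from this request, the profile names and the `SKIP_BY_PROFILE` mapping are invented, and the input is assumed to be a well-formed XHTML fragment (as in EPUB content documents).

```python
import xml.etree.ElementTree as ET

# Illustrative assumption: which proposed kinds each user profile wants hidden.
SKIP_BY_PROFILE = {
    "visually-impaired": {"captions"},
    "hearing-impaired": {"descriptions"},
    "non-disabled": {"captions", "descriptions"},
}

# Hypothetical markup: `kind` on <details> is not valid HTML today.
PAGE = (
    "<div>"
    '<audio controls="controls" src="lesson1.mp3"/>'
    '<details kind="captions"><summary>Transcript</summary>'
    "<p>Hello class.</p></details>"
    '<p><a href="lesson1.mp3">Download the audio</a></p>'
    "</div>"
)


def filter_for_profile(xhtml: str, profile: str) -> str:
    """Return the fragment with elements of the skipped kinds removed."""
    root = ET.fromstring(xhtml)
    skip = SKIP_BY_PROFILE.get(profile, set())
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("kind") in skip:
                # Note: any text directly following the removed element
                # (its tail) is dropped as well in this simple sketch.
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


print(filter_for_profile(PAGE, "visually-impaired"))
```

A title or an accessible name could not drive this filter reliably, since both are optional and free text. A fixed set of values can.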

Hope it is clearer.

w3c / aria

New role for audio transcripts #2164
