Audio Description (current)
audio description
narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone
Note 1: Audio description of video provides information about actions, characters, scene changes, on-screen text, and other visual content.
Note 2: In standard audio description, narration is added during existing pauses in dialogue. (See also extended audio description.)
Note 3: Where all of the video information is already provided in existing audio, no additional audio description is necessary.
Note 4: Also called "video description" and "descriptive narration."
Why this definition should change
This definition is too limited. It focuses on “visual details” and glosses over the importance of text and graphics shared via video. It does not recognize the ever-growing prevalence of text-based content in both live and prerecorded materials.
Live presentations often include “screen shares” of text documents, spreadsheets, slideshows, graphs, charts, etc. When this type of content is shared, it is the heart of that portion of the presentation and, as such, is necessarily important enough to be audio described. However, this content is often NOT audio described, either for aesthetic reasons or because of an ableist assumption that “everyone” can “just read” the content on the screen. This is the environment where WCAG is needed both to provide access AND to educate society about the necessity for accessible materials.
Undeniably, some text captured in a video is not “important” enough to be described. For example, in a camera shot of a dozen fast-food restaurants (including their signage), the name of each restaurant might not be important to the video. Instead, a brief description that the shot shows “fast-food restaurants littered down both sides of the road into town” might be sufficient.
However, the existing WCAG definition of “audio description” is insufficient in many cases of text on the screen. (Note: I do not use the term “caption” in this recommendation. WCAG has defined “caption” in another context, and it would add ambiguity that we don’t need.) In many videos, text is used to convey meaning, but the current definition is ambiguous.
In common parlance, many would not include text in the definition of “visual details,” and what is “important” is exceptionally subjective.
• For example, some might not consider text on the screen that identifies a speaker (by name and/or position) to be “important.” Yet this information provides context and assists the listener in identifying voices. Additionally, the text is obviously somewhat important because the video creator has taken extra steps to add the text to the video (text overlay).
• In other cases, text is the entire subject of the video, such as in the increasingly popular trend of holding up a series of handwritten signs instead of speaking in the video. In these cases, the text on the handwritten sign is the core of the video. While this type of text would likely be audio described under WCAG (the video is often either silent or playing music in the background), a better, more inclusive definition of audio description would make the need to voice these messages clear and unambiguous.
• Many videos also use graphs or graphics to convey meaning, but the detail needed to adequately convey that meaning depends on how the video uses them for sighted viewers. Sometimes the graph conveys a broad theme, and a general description will suffice: “Line graph shows the price of eggs getting higher over time.” In other cases, the graphic conveys specific information, and it requires detailed audio description to make that information accessible (for example, each area of a Venn diagram should be described).
• In some cases, the documents (or graphics with text) used in the video are substantially or wholly text-based. This is a frequent occurrence with informational presentations (including educational content). In these cases, the document should be read aloud in its entirety. If portions of the document are “not important,” then they should not be displayed at all. “Accessibility” demands parity of access to information, so all consumers of the video must have that access.
Audio Description (recommended amendment)
A better definition of audio description would differentiate incidental text (such as the multiple restaurant signs) from intentional text (text that is added to the video as an overlay or that constitutes the subject of the shot). Here is my suggestion (additions in ALL CAPS):
audio description
narration added to the soundtrack to describe important visual details OR INCIDENTAL TEXT that cannot be understood from the main soundtrack alone AND
VERBALIZATION OF TEXT OVERLAYS ADDED TO THE VIDEO THAT ARE NOT PROVIDED BY THE MAIN SOUNDTRACK AND
VERBALIZATION OF THE CONTENT OF DOCUMENTS AND GRAPHICS CONTAINING TEXT THAT ARE NOT PROVIDED BY THE MAIN SOUNDTRACK OR PROVISION OF AN ACCESSIBLE TEXT ALTERNATIVE TO THE VIEWER IN REAL-TIME
Note 1: Audio description of video provides information about actions, characters, scene changes, on-screen text, and other visual content AS WELL AS VERBALIZATION OF INCIDENTAL AND INTENTIONAL TEXT CONTAINED IN THE VIDEO CONTENT.
Note 2: In standard audio description, narration is added during existing pauses in dialogue. (See also extended audio description.) HOWEVER, STANDARD AUDIO DESCRIPTION INCLUDES ADDING PAUSES IN THE VIDEO AND/OR THE PROVISION OF ACCESSIBLE TEXT DOCUMENTS FOR DOWNLOAD WHENEVER PAUSES IN A VIDEO’S SOUNDTRACK DO NOT PROVIDE SUFFICIENT TIME TO VERBALIZE AUDIO DESCRIPTION.
Note 3: Where all of the video information is already provided in existing audio, no additional audio description is necessary.
Note 4: Also called "video description" and "descriptive narration."
NOTE 5: WHENEVER POSSIBLE, ACCESSIBLE TEXT ALTERNATIVES OF THE CONTENT OF DOCUMENTS DISPLAYED ON A VIDEO SHOULD BE AVAILABLE FOR DOWNLOAD AND SHOULD COMPLY WITH WCAG SUCCESS CRITERION 1.3.1 (INFO AND RELATIONSHIPS).