w3c / wai-media-guide

Terminology - Audio Description / Video Description / Described Video #75

Closed shawna-slh closed 5 years ago

shawna-slh commented 5 years ago

I reconsidered the overall terminology and consulted with others.

For some people (maybe most people) who don't know differently, "audio description" would be a description of the audio. (For example, "the metal music had highly amplified distortion, an extended guitar solo, and emphatic beats". ;-) However, WCAG uses "audio description" to mean description of the visual information, which is completely confusing to people not in-the-know (and even some of us in-the-know).

FCC, CVAA, and AFB use "video description".

Canada uses "described video".

What users need is "description of the key visual information in the video" – including:

  1. synchronized with the video. The most common way to do this is through audio.
  2. separate, in text (descriptive transcript).

The issue is how do we say that in a way that is more succinct and is understandable to people who are familiar with different and opposite terminology, and to newbies?

This draft now uses "Audio Description of Visual Information" for the first reference on a page. After that it mostly uses just "description". Some places that refer to a specific video use "described video" and "described version of the video".

This is an effort to be most understandable and least confusing to:

Audio Description of Visual Information

eoncins commented 5 years ago

+1 to the proposed terminology. Thanks, Shawn, for the hard work. From a European perspective, we are used to the term AD, "audio-description". To me, "the metal music had highly amplified distortion, an extended guitar solo, and emphatic beats" would be non-verbal information, which is audio with no text and non-visual content. I am OK with the outcome, but if you consider it relevant to bring this to the EOWG, I also agree.

yatil commented 5 years ago

I think this is a good solution.

a11ycob commented 5 years ago

I don't recommend this approach at all, to be honest.

Although descriptions are referred to in various ways depending on region, the most straightforward way to introduce this section would be as Video Descriptions, due to the specificity of the terms together. From that point in the document you can then explain the various terminology and its applicability by region, and then generally refer to them throughout that page as "descriptions". Otherwise we are essentially adding yet another way of referring to descriptions, further adding to the confusion. Not to mention it is very wordy.

This is also consistent with the approach taken in the MAUR - which I think is important.

eoncins commented 5 years ago

The fact is that we have a terminology issue which has to be recognized. Different terms are being used for the same accessibility service: "audio description", "video description", "described video".

ISO, the International Organization for Standardization, uses the term "audio description": https://www.iso.org/obp/ui/#iso:std:iso-iec:ts:20071:-21:ed-1:v1:en

This is similar to what happens with subtitles/captions. The problem is that if you are consistent with the approach taken in the MAUR, you will be ignoring other terms that are being used in other parts of the world outside the US.

a11ycob commented 5 years ago

I understand your point, and am very familiar with the ISO standard. However, we have a scoping issue at hand. Audio description as a term also refers to the description of live theatre and other such activities. The W3C, and our work with the EO by extension, is only concerned with activities related to the web and therefore digital content such as video. Introducing the page with the terms Video Descriptions provides the specific scoping which addresses that issue; at which point we can break down the terminology and explain the various ways in which descriptions are referred to.

I feel strongly about not creating another designation. "Audio Description of Visual Information" does not address the scoping issue that I mentioned. Also, when I read that I find myself asking how it differs from Audio Description in the first place which (to me) adds further confusion.

eoncins commented 5 years ago

I also understand your point and I agree about the importance of staying coherent with other available resources. In fact, there is already an open issue about the term within the available W3C resources: in WCAG 2.0, 1.2.3 Audio Description or Media Alternative (Prerecorded).

If the aim of the resource is to cover a worldwide notion of this accessibility service, then in Europe the term used is audio description (https://fra.europa.eu/en/publication/2014/indicators-right-political-participation-people-disabilities/audiovisual-standards). Furthermore, among English-speaking countries there is also Australia, which interestingly uses the terms "audio description" and "captions".

In addition, it does not matter whether audiovisual content is on the web, TV, or cinema. Everything is moving to the web, and if you have a live online stream of a theater play or a conference via YouTube or Facebook Live, then it also applies to the web.

a11ycob commented 5 years ago

> I also understand your point and I agree about the importance of staying coherent with other available resources. In fact, there is already an open issue about the term within the available W3C resources: in WCAG 2.0, 1.2.3 Audio Description or Media Alternative (Prerecorded).
>
> If the aim of the resource is to cover a worldwide notion of this accessibility service, then in Europe the term used is audio description (https://fra.europa.eu/en/publication/2014/indicators-right-political-participation-people-disabilities/audiovisual-standards). Furthermore, among English-speaking countries there is also Australia, which interestingly uses the terms "audio description" and "captions".
>
> In addition, it does not matter whether audiovisual content is on the web, TV, or cinema. Everything is moving to the web, and if you have a live online stream of a theater play or a conference via YouTube or Facebook Live, then it also applies to the web.

If AD is an all encompassing term why add "of Visual Information"? Isn't that redundant?

I also feel that I haven't adequately explained my original point. When I use the two individual terms Video + Descriptions, not to be mistaken for the singular term Video Description used in the US, it is simply to explicitly state the scope of the reference at hand, not to give priority to North American terminology. This is what I was referring to when I referenced the MAUR. They use the terms video descriptions as a general, catch all phrase.

Lastly, in Canada, Audio Description refers to:

> ...a program host or announcer to provide a voice-over by reading aloud or describing key elements of programming, such as text and graphics that appear on the screen. It is often used for information based programming, including newscasts, weather reports, sports scores, and financial data. Most broadcasters are required to provide audio description.

I don't agree with our regulator's definitions, but if we're talking about equalizing the terminology there are always going to be gaps.

eoncins commented 5 years ago

In-line:

<If AD is an all encompassing term why add "of Visual Information"? Isn't that redundant?>

I think that adding "of Visual Information" might/will avoid confusions or misinterpretations.

<I also feel that I haven't adequately explained my original point. When I use the two individual terms Video + Descriptions, not to be mistaken for the singular term Video Description used in the US, it is simply to explicitly state the scope of the reference at hand, not to give priority to North American terminology. This is what I was referring to when I referenced the MAUR. They use the terms video descriptions as a general, catch all phrase.>

Ok sorry for the misinterpretation. I understood that you were favoring the US term.

<Lastly, in Canada, Audio Description refers to: ...a program host or announcer to provide a voice-over by reading aloud or describing key elements of programming, such as text and graphics that appear on the screen. It is often used for information based programming, including newscasts, weather reports, sports scores, and financial data. Most broadcasters are required to provide audio description.>

Textual references and images that appear in the screen are also part of the audio description script.

a11ycob commented 5 years ago

> I think that adding "of Visual Information" might/will avoid confusions or misinterpretations.

My contention is the opposite. If you have to resort to adding superfluous words to an existing term then you lack specificity. Hence why I believe Video Descriptions is the most concise means of labeling this page resource. It also does not preclude us from addressing the nuance of the topic in the resource itself.

> Textual references and images that appear in the screen are also part of the audio description script.

Right, but the CRTC (regulator) does not encapsulate descriptions singularly through the term audio description. In their eyes they're two separate things and labeled as such.

The argument, as I understand it, is that audio description should be used as the catch-all term in this resource. However, because it is such a broad term, as you have illustrated through your retorts, it lacks the adequate specificity and scoping for this resource. This has resulted in having to tack on additional ambiguous terms to clarify, which to me proves out my point and muddies the waters even further.

If we were talking more broadly I would completely agree with you. In fact, it is my intention to work with our regulator in Canada to adopt the more widely used term audio description. Today, we are discussing the W3C resource. With that in mind, I don't support the current approach.

Anyway, I've left my two cents on the counter. The rest is up to the group.

shawna-slh commented 5 years ago

> Hence why I believe Video Descriptions is the most concise means of labeling this page resource.

Personally I think the most accurate terminology is "Visual Description". I'm just not confident that we should use that phrasing since so many people are used to different terminology. Although I could easily be talked into it if others felt it was OK! :-)

Barring that, I think "Audio Description of Visual Information" is the most clear to the wide range of readers of this resource. Please re-read the first comment of this issue. :-)

a11ycob commented 5 years ago

> Hence why I believe Video Descriptions is the most concise means of labeling this page resource.

> Personally I think the most accurate terminology is "Visual Description". I'm just not confident that we should use that phrasing since so many people are used to different terminology. Although I could easily be talked into it if others felt it was OK! :-)
>
> Barring that, I think "Audio Description of Visual Information" is the most clear to the wide range of readers of this resource. Please re-read the first comment of this issue. :-)

Shawn, my concern is that we are addressing an issue of excessive terminology by adding additional terminology. Taking the term Audio Description and tacking redundant words to the end does not address the issue at hand. In fact, I believe we're adding to the problem. There is no time where descriptions are not addressing the visual narrative so I don't understand how the additional terms are bringing any further clarity?

If there is a compromise to be had here then just label the page Audio Description. I can at least live with that.

shawna-slh commented 5 years ago

> There is no time where descriptions are not addressing the visual narrative ...

Right. You know that because you are an expert in the field.

My brain processes "audio description" as description of the audio information (even though I know differently) - and some other people's brains process that, too. But that is not at all what it is.

We want to especially clarify for non-experts.

shawna-slh commented 5 years ago

I did some "quick-n-dirty" user research. I endeavored to be unbiased. I found it very informative.

Findings mostly did not support my hypothesis on one point. [1 in Summary]

Findings fully supported my hypothesis on another point. [2 in Summary]

To re-iterate: this was very limited – and useful.

Summary:

  1. 4 out of 5 said "audio description" is "a person describing", "sound", "narration".
    1 of 5 said "audio description" is a description of audio information -- like birds chirping -- included in captions.

  2. 5 out of 5 gave a much more confident, accurate (and usually more succinct) explanation of "audio description of visual information" than they gave of "audio description".

Note: Primary goal was to get feedback on "audio description" versus "audio description of visual information". While at it, gathered some input on "descriptive transcripts".

Tangential finding: Maybe need to emphasize in our resource more that description is of the important information for understanding the content, not overly-detailed "story"…

5 participants (who all know I work in the accessibility field):

Note: Order of questions was different among participants.


  1. [MK is fairly familiar with W3C work including accessibility, at a non-technical level.]

Recording: https://mit.webex.com/mit/lsr.php?RCID=1f2a39875bc24746aa504be392590f8a

Full transcript (rough):

SLH: Ok, hopefully it's recording. Is that okay?

MK: Yes.

SLH: I want to ask you a couple of questions about videos on the web and making them accessible to people with disabilities. (And I know you don't have a background in that, and that’s OK, don't worry about it.)

SLH: The first question I have for you is: What are captions?

MK: Captions are the words that appear on the screen while you're watching the video that represent the dialogue of the characters.

SLH: OK. What is audio description?

MK: Audio description is the enhancement that they say like if somebody's whistling or birds are chirping and they tell you that on the screen.

(meaning: Audio description is the enhancement that they [put] like if somebody's whistling or birds are chirping and they [put] that on the screen [with captions].)

SLH: OK. What are transcripts?

MK: Transcripts are the actual language, the full dialogue of conversation.

SLH: OK. What is sign language?

MK: Sign language is the physical representation through gestures and hand signals and shapes of communicating language for people who are Deaf.

SLH: OK. What is audio description of visual information?

MK: Audio description of video information – OK, I've seen or experienced that as someone, like a voice-over, describing what's going on in a scene or what’s going on on a stage, performance.

SLH: OK. I'm going to stop the recording.


  2. [LD is "common person on the street"]

Recording: https://mit.webex.com/mit/lsr.php?RCID=759c244764b14eeeb1072cc33b5534ba

Transcript excerpt:

SLH: What is sign language?

SLH: What is audio description?

LD: Sound describing something.

SLH: What?

LD: … ah… audio description – a person telling or describing a thing, like an event.

SLH: What are captions?

SLH: What are transcripts?

SLH: What is audio description of visual information?

LD: It is somebody telling you or someone about what they're seeing on a screen.


  3. [ET is "common person on the street". Uses Voice-Over for final proof of some writing.]

Recording: https://mit.webex.com/mit/lsr.php?RCID=bbace8bf49984ef594a4bb89a4f78276

Transcript excerpt:

SLH: What is sign language interpretation?

SLH: What is audio description?

ET: Um, a person, it would be, ah, audio, sound, ah, for a person who is sight-impaired or completely blind.

SLH: What are captions?

SLH: What is audio description of visual information?

ET: Um, it would be the audio description as I said before but it would talk about – it would give almost a story format to describe something that's going on on the screen.

SLH: What are transcripts?

SLH: What are descriptive transcripts?

ET: Again, printed text that ah uses words to describe a scene or actions or combination of that.


  4. [LS is a software engineer who is accessibility-aware and an avid audio book consumer.]

Recording: https://mit.webex.com/mit/lsr.php?RCID=2d13cc54d2274ee88eb7aef257dd761ev

Transcript excerpt:

SLH: What are captions?

SLH: What is audio description?

LS: Ah, describes things in the video that may not be seen by, I guess, for people with sight issues.

SLH: What are transcripts?

SLH: What is sign language?

SLH: What is audio description of visual information?

LS: Same answer as before: attempt to describe what sighted people see visually, describe it for people with sight difficulties.

SLH: What is a descriptive transcript?

LS: It would include both the actual text that was in the video, transcript of people talking, plus the descriptions of what is seen.

...

[recording of follow-up chat: https://mit.webex.com/mit/lsr.php?RCID=546ed24c7cb043e8b3e18d121f2cdde5]

SLH:… thinking of saying "audio description of visual information" the first time-

LS: Yes, certainly that's what it should be for -- for the survey, maybe not -- but for ultimately when you have like a bullet list or whatever of things that people should do or support, then that "of visual information" is really key.


  5. [JR is Director of Marketing Communications in Division of Information Technology at a large university. Is accessibility aware at upper-manager-level.]

Recording: https://mit.webex.com/mit/lsr.php?RCID=debbf2122bc8449cb948c71ef509628b

Transcript excerpt:

SLH: What are transcripts?

SLH: What is audio description?

JR: My understanding of audio description is um, uh, sort of additional commentary or narrative about what is occurring in the video, for example, scenery or time of day or things like that.

SLH: What are captions?

SLH: What is a descriptive transcript?

JR: Descriptive transcript – uoo – this one is a little more of a guess on my part because I'm not familiar with the term, but my guess is that it is um additional commentary that's part of a transcript that goes beyond just the words that are um being transcribed, I mean what's being said in a video, for example, again it would be commentary about what's actually occurring in the background and the foreground of the video.

SLH: What is sign language?

SLH: What is audio description of visual information?

JR: Audio description of visual information, ah, to me that would be an oral narrative of what um is being presented in a video.

yatil commented 5 years ago

I can see @a11ycob’s point and thanks for the informal user research, @slhenry. My biggest concern about this is that we introduce a second term describing WCAG’s “audio description” in W3C. Its term definition is actually quite succinct and relatively easy to understand.

I see that some readers might interpret “audio description” as “description of audio” and that “of visual information” would clarify it for them. Maybe it helps if we put “of visual information” in parentheses in the header to make clear that this is an addition to explain the phrase?

I personally leave this to Editor’s discretion, as I don’t have a good idea how to address it. Thanks for all the hard work on it.

shawna-slh commented 5 years ago

> Its term definition is actually quite succinct

humm - 82 words seems kinda long to me

yatil commented 5 years ago

It’s “narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone” which is 19 words. Maybe we talk about something different. As I said, happy to leave that to you.

shawna-slh commented 5 years ago

Along with 4 NOTEs, including "Also called "video description" and "descriptive narration.""

shawna-slh commented 5 years ago

> Maybe it helps if we put “of visual information” in parentheses in the header to make clear that this is an addition to explain the phrase?

Thanks much for the idea!

To see it:

Audio Description (of Visual Information)

I've been mulling it over. [... time passes ...]

I'm leaning towards the parentheses making it a bit more complex, and towards this being smoother and clearer for people who don't already know the terminology:

Audio Description of Visual Information

I'll continue to let it percolate, and welcome additional perspectives.

yatil commented 5 years ago

[FWIW, I am grading my students at the moment, and despite me describing what audio description is, a number of them say it is a text description of the audio. While it is possible that I did a REALLY bad job describing it, it could also point to audio description being easy to misunderstand.]

shawna-slh commented 5 years ago

@yatil Thank you for the additional feedback. This is quite helpful.

(I'll go with you describing it fine, and it just being counter-intuitive for some people. :-)

shawna-slh commented 5 years ago

Hi @a11ycob

Thanks again for the continued consideration and clarifications.

We've had this on the EOWG agenda for a few weeks in case you wanted to discuss it. If so, please let me know for this week (16 August).

Otherwise, we'll go with the additional data supporting the non-expert need for clarification.

shawna-slh commented 5 years ago

Leaving this issue closed, just adding another data point, since it's different from any above.

With another person, I asked similar questions to the user research above. Answer to what is "audio description":

things like file size, format (MP3 or whatever), length/duration