w3c / wcag

Web Content Accessibility Guidelines
https://w3c.github.io/wcag/guidelines/22/

How to add audio descriptions to videos? #1768

Open guyhickling opened 3 years ago

guyhickling commented 3 years ago

There has been a long discussion on the WebAIM forum today on this subject. Most of us came to the conclusion that compliance with SC1.2.5 requires that audio description be provided for all video content that is not adequately explained in the audio track, even if there are not sufficient spaces in the dialog to use for that purpose.

The SC quite clearly states "Audio description is provided for all pre-recorded video content ...". It specifically says "all", without any caveats. That is the normative requirement.

Unfortunately the Understanding document then tries to alter that by claiming that this audio provision is to be done "During existing pauses in dialogue....". The informative document effectively tries to restrict the application of the normative SC, in all cases where videos have large amounts of unexplained visual content with no space in the audio for it.

Not only does that contradict or soften what the SC itself says, but it also blandly ignores the fact that about half of all videos ever made have hardly any spaces in the audio for adding extra. It also ignores that many videos - particularly educational and technical ones - often have detailed diagrams or illustrations that require very long spaces to insert a description. It has also led to widespread confusion in the web dev world. Just last week I was auditing an art museum website, with many videos that displayed a painting or sculpture and then discussed it for about 20 minutes, without one of them ever telling blind people what the painting or sculpture showed! I struggled with explaining this SC to them.

Whether website and video owners choose to comply with the SC for already existing videos, or instead claim undue burden, is up to them. But the SC itself should be clear, without having an Understanding doc appearing to say something different. And it should be a basic requirement of the WCAG at Level AA that all video content be fully accessible to blind people in the audio form without having huge chunks of many videos missing or unexplained or not described to them. That should never have been relegated to AAA in the first place; it's such a basic right!

So to solve this contradiction (not to mention to help ensure that blind people have full access to all video content), could I suggest that a green Note be added to the Understanding doc for 1.2.5. It need not contradict what is already said in the Understanding document. I recommend a wording something like:

Where there is not sufficient space in the existing audio track to insert a particular needed item of audio description, space should be created to allow the insertion. This can be done by freezing the currently displayed video frame for the duration of the audio insertion, or by inserting additional video frames if preferred.

NB: This will, of course, make the AAA SC1.2.7 almost redundant since it says a similar thing. But we should just accept that. Better a redundancy at AAA level (which hardly any website follows anyway), rather than retain this confusing contradiction of 1.2.5.

patrickhlauke commented 3 years ago

The quick fix to understanding here is, I think, to remove "During existing pauses in dialogue", so the sentence there starts with "Audio description provides information about actions..."

An extra sentence can then be added to explain that AD can be added in existing pauses, where available, or that appropriate pauses need to be created (freeze-framing the original video to then provide sufficient time for AD) (edit: but yes I see what you mean about making 1.2.7 redundant then)

patrickhlauke commented 3 years ago

Alternatively, if it's felt that the "during existing pauses" part is implicit already in the use of the term "audio description" - which is defined in the normative glossary https://www.w3.org/TR/WCAG/#dfn-audio-descriptions with a note that does talk about the use of existing pauses, to differentiate it from the normative definition of "extended audio description" https://www.w3.org/TR/WCAG/#dfn-extended-audio-description - then the understanding needs to make it clear WHEN content can claim not to have sufficient existing pauses

guyhickling commented 3 years ago

Yes, if we can actually amend the Understanding document to remove those words as you suggest, that would be much the best way to go, rather than just adding a Note.

awkawk commented 3 years ago

This is completely contrary to the intent of 1.2.5 as written. The SC does indicate that "audio description is provided for all prerecorded video" - you can read that as "all video files have audio description" or "all content in video has audio description" but the definition of audio description indicates "narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone" - there is clear subjectivity in the definition. Could some people say that there is additional content in a video that hasn't been described in just about every case? Absolutely.

This is the way that audio description worked in 2008. The majority of extended audio description examples available then were proofs of concept from NCAM, and while it is a great idea in concept, extended audio description can turn a 10 minute video into a 30 minute video if the content is complex. There was never any thought that extended audio description would be covered under 1.2.5. If viewing that SC on its own (along with the clarifying notes in the definition for audio description) isn't enough, the fact that 1.2.7 clearly requires extended audio description when the pauses don't provide enough time for more complete descriptions certainly makes this clear.

In my opinion, this proposal is a non-starter as a proposal to change WCAG 2.x and should be reviewed for 3.0.

patrickhlauke commented 3 years ago

so andrew, are we saying that if a video with audio track doesn't have any usable gaps in the existing audio, it's kosher to say "can't add AD here, since there's no gaps"? and if that is a valid way to bypass 1.2.5, can that be explained in understanding?

bruce-usab commented 3 years ago

if a video with audio track doesn't have any usable gaps in the existing audio

That follows from our definition of AD. But I disagree with the characterization of bypass 1.2.5 since the difficulty follows from conventions of AD. I agree with @awkawk that the quality/sufficiency of the non-stop audio track (from the perspective of someone not watching the video track) could be something to be addressed by 3.0.

awkawk commented 3 years ago

@patrickhlauke I do think that it is ok to say something like that, but there are two ways to add audio description to a video without extending the overall timeline - one is to fit AD into the gaps in the audible content and the other is to duck the program audio to allow AD to be heard. In the latter someone needs to make an informed choice about what best serves the needs of users - hearing the primary audio without description, or missing some of the primary audio in order to hear more description. Clearly this will be made on a case-by-case basis. Quality of description is like the old debate on what is the best alternative text for an image, except more difficult.

I haven't looked at the understanding document for this SC in a long time, but it sounds like additional information might be helpful to clarify.
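
As an illustration of the two approaches described in this comment (fitting description into gaps, or ducking the program audio), here is a minimal Python sketch. It is not taken from WCAG or from any tool mentioned in this thread. The function names, the 160 words-per-minute speaking rate, and the idea of marking some audio intervals as "duckable" are all assumptions made for the example. Deciding which audio is actually duckable remains the case-by-case editorial judgment described above.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def find_gaps(dialogue: List[Interval], duration: float) -> List[Interval]:
    """Return the intervals of the timeline not covered by any dialogue."""
    gaps: List[Interval] = []
    cursor = 0.0
    for start, end in sorted(dialogue):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        gaps.append((cursor, duration))
    return gaps


def speech_seconds(text: str, words_per_minute: float = 160.0) -> float:
    """Rough estimate of how long it takes to speak `text` (assumed rate)."""
    return len(text.split()) / words_per_minute * 60.0


def choose_placement(needed: float,
                     gaps: List[Interval],
                     duckable: List[Interval]) -> str:
    """Pick a way to place a description without extending the timeline.

    Returns "gap" if some existing pause is long enough, "duck" if some
    interval the author has judged duckable is long enough, else "none".
    """
    if any(end - start >= needed for start, end in gaps):
        return "gap"
    if any(end - start >= needed for start, end in duckable):
        return "duck"
    return "none"
```

A result of "none" corresponds to the case debated in this thread: no existing pause fits the description and no audio can be ducked, so the description could only be added by extending the video.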

guyhickling commented 3 years ago

non-starter as a proposal to change WCAG 2.x

No one is suggesting changing the WCAG. SC1.2.5 will remain as it always has. We are merely highlighting a contradiction between the clear statement in the normative SC and a comment in the non-normative Understanding document. It leads to questions being raised like the one on WebAIM on Friday (which I should have given the link to, sorry - it's https://webaim.org/discussion/mail_message?id=46062). (It has also implied a cop out for video makers who don't want to make the effort.) It's long overdue for clarification.

It's true that the Understanding document doesn't actually say what to do when existing pauses aren't sufficient, hence the confusion. But just because something was interpreted badly in 2008 on account of that comment in the Understanding is no reason for perpetuating that understanding forever. We should be far more concerned about blind people today - that's what the WCAG is for.

audio descriptions.....has the ability to extend the duration of a 10 minute video to be a 30 minute video if the content is complex

If that's what it takes, then do it! This is true of probably the majority of educational and technical and documentary videos. I'm sure blind people still want to be educated even if their version of a video is much longer than the standard version. (The extended version is only an alternative, for blind and visually impaired people, the original version for sighted users remains the same length as before.)

Interpreting the Understanding document in a way that denies the right of blind people to enjoy ALL videos in the best way possible is, as Steve Green put it very well in his original WebAIM post, simply perverse.

awkawk commented 3 years ago

@guyhickling

This seems to stem from the claim that the Understanding document 'tries to alter that by claiming that this audio provision is to be done "During existing pauses in dialogue....".' This is not, in my view, an accurate interpretation and nor does it square with the fact that the WG published the original understanding document with the same text (https://www.w3.org/TR/2008/NOTE-UNDERSTANDING-WCAG20-20081211/media-equiv-audio-desc-only.html). If this was not the intent, the WG would not have said so, and would not have created a more demanding SC at AAA.

I understand that you are saying that no one is suggesting a change to WCAG, but what you are suggesting is changing the understanding document in a way that will merge the AA and AAA SC, and this is not appropriate to do.

If that's what it takes, then do it!

In some cases this is the right decision and absolutely should be done to support users as well as possible. However, that doesn't make it required by WCAG 1.2.5.

Interpreting the Understanding document in a way that denies the right of blind people to enjoy ALL videos in the best way possible is, as Steve Green put it very well in his original WebAIM post, simply perverse.

It is important to separate what is required by a standard and what is helpful for people. Any content provider can go beyond AA and provide extended audio description. If someone wants to say that it is perverse to not provide XAD since end users are impacted, fine, but I disagree that it is perverse to interpret WCAG SC in the way that they were written.

patrickhlauke commented 3 years ago

@bruce-usab

That follows from our definition of AD. But I disagree with the characterization of bypass 1.2.5 since the difficulty follows from conventions of AD

@awkawk

there are two ways to add audio description to a video without extending the overall timeline - one is to fit AD into the gaps in the audible content and the other is to duck the program audio to allow AD to be heard

If I created a video with audio and made sure during production never to leave even the slightest gap, and if all of the audio was important so couldn't be ducked...would I be able to claim this as a pass/not applicable, thus bypassing the SC altogether on a technicality? i.e. if the above is true (no gaps, AND all audio is important enough that ducking part of it would lose information), can a video be failed or does it necessarily have to pass/ n/a ? and if yes, can that be explicitly stated in the understanding?

in practice, is there effectively an exception here in the SC (due to the way AD is defined) that says videos with audio that has no gaps and where all the audio is deemed essential and can't be ducked are exempt from meeting this criterion simply because they can't be AD'd per the definition of AD?

awkawk commented 3 years ago

@patrickhlauke I would say that is an extreme example, but that it wouldn't fail.

sajkaj commented 3 years ago

The W3C Note on Media Accessibility User Requirements (MAUR) contemplates there are situations where the video may be paused for the end user to finish reading the description. If memory serves this use case was proposed by WGBH for educational applications. Clearly, such an approach wouldn’t work for entertainment—though the same video could be treated either way, e.g.

One might not want interruptions while watching Macbeth with the family (entertainment);

But one might very much appreciate the greater depth of description while studying Macbeth at the University.

Best,

Janina

PS: The MAUR is here:

http://www.w3.org/TR/media-accessibility-reqs/


guyhickling commented 3 years ago

@awkawk You claim my interpretation of the SC is not correct. So can you please explain how the words of the SC "Audio description is provided for all pre-recorded video content....." - which clearly states a simple, blanket requirement that AD is provided for all video content without any exceptions or caveats - how that can be reconciled with an interpretation that this ONLY has to be done where a sufficiency of pauses in the audio allows it? That simply is not a logical equivalence. The latter statement, if interpreted your way, places a restriction on the scope of the normative SC.

(And no, your earlier comment above that it could mean "all video files have audio description" doesn't hold water either, because it is self evident, and evident also from the definition of audio descriptions, that if there are no "important visual details that cannot be understood from the main soundtrack alone" then there is no need for - indeed, there cannot be - audio descriptions.)

In other words, I would like to know how you explain that a blanket normative requirement can suddenly become restricted by another statement, in a document that is informative only?

bruce-usab commented 3 years ago

@guyhickling one of the questions that came up pretty early was: Does 1.2.5 require a dedicated AD track? People familiar with AD in theater and movies did not seem to be comfortable with WCAG appropriating and redefining their vernacular. It worked out. You are trying to make the SC mean something that it does not.

awkawk commented 3 years ago

@guyhickling How do you reconcile the difference between 1.2.5 and 1.2.7?

guyhickling commented 3 years ago

@awkawk, you answer my question and I'll answer yours :-)

guyhickling commented 3 years ago

@bruce-usab, you are trying to apply the WCAG to something it doesn't apply to, but supposing for a moment theatres were in scope. Where outlets are already providing their own AD (and assuming it does the job properly), there would be little reason to bring the WCAG into it, would there? The WCAG (and this particular 1.2.5 part of it) is there to guide and instruct all the billions of websites and online videos that don't provide adequate AD for blind people, not to change those (relatively few) that do! The theater vernacular is quite safe!

And no, whether it is done by a dedicated track (if the video players involved can handle that) or an alternative copy of the video, or by some new technology that may be invented in the future, is not mandated. What 1.2.5 does require (it says so!) is audio descriptions to explain important visual information to blind people.

That's all 1.2.5 asks for, that blind people be allowed the same right to enjoy all video content as sighted people without, all too often, huge chunks of the most important bits being hidden from them because no one wants to make the effort. I do not see why everyone wants to use every argument they can possibly think of, including spurious arguments like theatres for goodness sake, to stop this basic human right being granted to blind and visually impaired people!

awkawk commented 3 years ago

@guyhickling My answer is that audio description requirements in 1.2.5 and 1.2.7 make it clear.

bruce-usab commented 3 years ago

@guyhickling I am not trying to apply WCAG outside of its domain, I am trying to provide you some context.

That blind people be allowed the same right to enjoy all video content as sighted people is a reasonable aspirational goal, but not a testable statement structured like a 2.x SC.

Over the years, I have frequently found it helpful that AAA requirements are in the same document as the AA and A requirements. For example, when someone asserts that WCAG 2.0 Level AA requires that web pages use H1 -- H6 in some particular way, I point them to SC 2.4.10. Your assertion that 1.2.5 requires extending the audio track is refuted by the existence of AAA requirement 1.2.7.

mraccess77 commented 3 years ago

One way to think of this is to replace the word "audio description" in the SC with the definition and note. Then effectively it reads something like the below which combines the definition and note:

1.2.5 Audio Description (Prerecorded): narration is added during existing pauses in dialogue. (See also extended audio description.) and is provided for all prerecorded video content in synchronized media added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone (Level AA)

The idea is that audio description is strictly defined as narration added during the existing pauses - anything outside the pauses is not "standard audio description" and thus not covered.

patrickhlauke commented 3 years ago

in which case, are we agreed that normatively, from that definition of AD, it follows that where there are no existing pauses in dialogue, the SC does not apply? (noting that if a company/individual decides to mark a video as passing or not applicable for 1.2.5 under the rationale that "it doesn't have existing pauses", then they need to be prepared to demonstrate that if it comes before a judge... so they'd better be able to demonstrate that adding AD was indeed physically impossible)

awkawk commented 3 years ago

There is no "not applicable" in terms of SC being evaluated for a page, but I would agree that an evaluator might (in rare cases) indicate that the page satisfies SC 1.2.5 without adding audio description to a specific video because there is no time in between necessary dialog to insert description content and all dialog is critical to the understanding of the content, so none of the dialog can be ducked in volume to create space for a description.

It is far more likely that the description is less complete than ideal, because the spaces between dialog are too short to allow a thorough description.

alastc commented 3 years ago

For the original question:

SC1.2.5 requires that audio description be provided for all video content that is not adequately explained in the audio track, even if there are not sufficient spaces in the dialog to use for that purpose.

To summarise into a DRAFT proposed response:


SC 1.2.5 requires audio description for videos with visual information. That could be achieved with the default voice over, by adding description in gaps in the audio, or by ducking the default audio if the description is more important.

There are valid reasons a video with visual information may not have an audio description: Where it is a media alternative for text (part of the synchronised media definition), or where there are no gaps in the default audio and it cannot be ducked because it is providing more important information. That is inherent in audio descriptions as they require gaps in some form.

SC 1.2.7 Extended Audio Description requires that the track is extended to fit in description, 1.2.5 does not.

patrickhlauke commented 3 years ago

This - or words to that effect - could really do with being added to the 1.2.5 understanding document in a fairly explicit way. Happy to add a PR to the pile.

alastc commented 3 years ago

@patrickhlauke it's hard to keep up with which PRs apply to which issue, but if you can add a "Survey - Ready for" label to the issue, that's my cue for adding to meeting agendas.

mraccess77 commented 3 years ago

Regarding the proposal with audio ducking, which only seems to have come up recently - this is potentially an area where there is subjectivity in what counts as important details - but what I think most would agree is that it's ok to describe over music or sound effects when essential and if ducked. Description Key has some conventions that it might be helpful for people to review: https://dcmp.org/learn/captioningkey/617

patrickhlauke commented 3 years ago

@patrickhlauke it's hard to keep up with which PRs apply to which issue, but if you can add a "Survey - Ready for" label to the issue, that's my queue for adding to meeting agendas.

if the PR includes the "Closes #XXXX" bit, there'd be a clear connection between issue and PR...but sure

patrickhlauke commented 3 years ago

also, just going by https://github.com/w3c/wcag/labels/Survey%20-%20Ready%20for you'd see everything that has that label (regardless of whether it's an issue or a PR)...

bruce-usab commented 3 years ago

Thanks @mraccess77 for that Description Key resource. You do bring up a gray area in the requirements, but one that I have not encountered as a problem in actual practice. Right now we have:

Note 2: In standard audio description, narration is added during existing pauses in dialogue

This could probably be something a little stronger, perhaps:

Note 2: In standard audio description, narration is added during existing pauses in dialogue, and often includes lowering the volume of background music or other non-informational sounds.

patrickhlauke commented 3 years ago

Made a first stab at clarifying things https://github.com/w3c/wcag/pull/1790 - i think i've actually come around to the idea that if there's no gaps, and it's not possible to create these gaps by ducking/omitting existing audio, then the content straight up fails. yes, it's technically not possible to do a "straight" AD in those cases...but the requirement of the SC stands regardless, and if content can't be made to meet it...then it needs to be replaced with content that can.

alastc commented 3 years ago

Um, the last bit (failing even when there are no gaps) is the opposite of the conclusion I was proposing above.

That's not necessarily a problem, the rest of the PR works either way, and I can put that as the question to the group. We're having to draw a binary line on a continuum...

patrickhlauke commented 3 years ago

saying that a video with audio but no AD can pass this SC if it has no gaps that would make retrospectively adding AD impossible seems ... illogical though. the content fails. if it's too hard to now retrofit in AD and videos will have to be redone, that's not a concern of whether it passes or fails, no?

otherwise, the same argument could be made for other things...the video was done, contains lots of text, but has super-low contrast. can't be fixed unless the video is redone. does it get a pass under contrast requirements?

absolving failing content because it might be costly to fix after the fact (redoing the video, and leaving appropriate gaps for some AD if needed, or re-doing the dialogue to narrate what is visually happening) would seem to set an uncomfortable precedent.

patrickhlauke commented 3 years ago

However, if the group decides that no, a video without gaps simply can't have AD because by definition AD fits in gaps, and the video has no gaps, then the note in the proposed PR https://github.com/w3c/wcag/pull/1790 should be changed to say that. Explicitly saying that the content would then pass, and why. (and i'd suggest making comments/changes on the PR, rather than here, for that) - see https://github.com/w3c/wcag/pull/1790/files#r630657270

alastc commented 3 years ago

absolving failing content because it might be costly to fix

It isn't so much a cost thing, it is more than an adjustment of the content as originally envisaged. For A/AA the SC can require things to be adjusted, e.g. contrast increased, structure improved, etc.

At AAA the SC can ask for more fundamental changes to the content, e.g. the AAA audio desc requirement, or changes to line length (impacting design) etc.

Asking authors to change the intended way the content should work is more of a AAA thing.

I realise that's a blurry line, you could envisage the content differently to start with, but I've come across quite a few videos that have a constant voiceover that doesn't explain everything visible.

patrickhlauke commented 3 years ago

but I've come across quite a few videos that have a constant voiceover that doesn't explain everything visible.

still not convinced by the reasoning (if the ask is "it needs AD", i can't see how it's legitimate to then say "it passes, because there's no gaps for AD") ... other than the only reason being that otherwise it cuts across too much of the AAA SC's turf. but as said, if this is the consensus, i'd still want to see that in a note, made explicit in the understanding document (as per my suggested comment on my own PR).

edit: and again, this doesn't really answer the other thought experiment...what if a video has text with low contrast...we'd fail the video for THAT, even though the ask on the author (to redo the video) is the same level of effort involved in asking them to change it to allow for AD gaps...

alastc commented 3 years ago

if this is the consensus, i'd still want to see that in a note,

Marked for survey.

this doesn't really answer the other thought experiment...what if a video has text with low contrast...

That is what I tried to answer above (for me at least). I do see that as different. It wasn't the level of effort, it is the changes from the content as originally envisaged. Changing the contrast, whilst a lot of effort after a video has been exported, doesn't change what you are trying to say or how you are saying it.

Removing chunks of voiceover or extending the video is a pretty fundamental change to the content that we generally don't require at AA.

bruce-usab commented 3 years ago

@alastc do you happen to have any example URLs of videos that have a constant voiceover that doesn't explain everything visible?

I am happy to evolve my thinking on this issue, but my impression is that these situations mostly come down to quality. And if there is a constant voiceover, that is usually good enough.

OwenEdwards commented 3 years ago

Small point: @alastc wrote:

SC 1.2.7 Extended Audio Description requires that the track is extended to fit in description, 1.2.5 does not.

The word "track" means something else in HTML; I recommend "video" or "synchronized media". Or even, "duration of the video" or "duration of the synchronized media", to clarify that this means extending the duration and not extending the height or width (it's obvious to me, but 🤷‍♂️).

OwenEdwards commented 3 years ago

As a separate and broader point from my small one above, I'd like to point out my demo of "hybrid" text-to-speech description:

http://www.ca11y.com/videojs-speak-descriptions-track/

I wrote more about it here: https://github.com/w3c/wcag/issues/1042#issuecomment-841916714

A few important points:

  1. This "Hybrid description" attempts to announce the description cue's text in the time available to it as defined by the start and end time of the cue in the text track (note that this timing is defined by the author of the track, so it defines the available time during the video, even if that time is much too short for the text to be spoken in normal circumstances). But,

  2. A user control could easily be provided (it's not in this demo, but could be added) so that the playback picks one of several strategies if the text-to-speech of a cue would not fit into the available time defined by the start and stop time in the text track; possible strategies could include:

a) Pausing the video, finishing the speech of the text of the cue, and then re-starting video playback. Note how this is a hybrid of "inline" and "extended" description, since as much of the description is announced inline as possible, and the video is only paused ("extended") for as short a time as necessary to finish the description cue. The video is not paused for the whole of the description cue.

or

b) Increasing the speed of the speech produced by the text-to-speech so that it fits into the gap available as defined by the cue in the text track. This needs actual user research as to how effective it is, and how much it is preferred, but given the speech rate that many screen reader users choose, it may turn out that those same users would much rather have the text spoken at a faster rate than the author of the description text track realized. This allows users to choose faster speech instead of extending the overall timeline of the video.

Also note that, in this demo, ducking of the main audio is always used when the text-to-speech description speaks over the main audio track, but the system could be modified so that something in the text track could indicate whether the main audio needs to be ducked or not. There could then also be a user option as to whether to duck only when the text track recommends it, never, or always.

I bring all this up because I think the distinction between "SC 1.2.7 Extended Audio Description requires that the [duration of the video] is extended to fit in description, 1.2.5 does not" is an artificial distinction. Those two used to be the only options, but this "hybrid" description adds another option that can meet the need to add audio description to videos which have very little/few "existing pauses in dialogue" while allowing the user, not the author, to decide whether to use ducking, faster speech, and/or pausing the video to hear the additional audio description that they need to understand all the content of the video. It can even work for videos that have no "existing pauses in dialogue", but it would always extend the video rather than doing a mix of inline and extended description. But none of this requires re-editing the video; it's all a feature of the player, once the audio description text track has been authored appropriately.
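
To make the two strategies above concrete, here is a minimal Python sketch of the per-cue decision. It is not the code of the linked demo, which runs in a browser video player. The `Cue` and `Plan` types, the 160 words-per-minute baseline, and the `max_rate` cap (falling back to a pause for whatever speech still does not fit) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds, from the description text track
    end: float    # seconds
    text: str


@dataclass
class Plan:
    rate: float           # speech rate multiplier (1.0 = normal)
    pause_seconds: float  # how long to hold the video after the cue ends


def plan_cue(cue: Cue, strategy: str,
             words_per_minute: float = 160.0,
             max_rate: float = 3.0) -> Plan:
    """Decide how to speak one description cue.

    strategy "pause": speak at normal rate, pause the video for the overflow.
    strategy "speed": speak faster so the text fits the cue, up to max_rate;
    any remainder (an assumption of this sketch) is handled by pausing.
    """
    available = cue.end - cue.start
    needed = len(cue.text.split()) / words_per_minute * 60.0
    if needed <= available:
        return Plan(rate=1.0, pause_seconds=0.0)  # fits inline as authored
    if strategy == "pause":
        return Plan(rate=1.0, pause_seconds=needed - available)
    if strategy == "speed":
        # A zero-length cue cannot be met by speeding up alone.
        rate = max_rate if available <= 0 else min(needed / available, max_rate)
        leftover = needed / rate - max(available, 0.0)
        return Plan(rate=rate, pause_seconds=max(0.0, leftover))
    raise ValueError(f"unknown strategy: {strategy}")
```

In this sketch the strategy is a parameter rather than something baked into the track, which mirrors the point made above: the author supplies the cue timing and text, and the user chooses how overflow is handled.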

mitchellevan commented 2 years ago

As much as I would love to swing the WCAG AA hammer for the benefit of users, this discussion convinces me we must interpret "added to the soundtrack" to mean "added to the existing soundtrack," constrained by its existing pauses and ducking opportunities.

...extending the video is a pretty fundamental change to the content that we generally don't require at AA

This is a good reminder that AAA doesn't mean "less important for users." Extended audio description is AAA for other reasons such as feasibility for all topics and types of content, unreasonable difficulty, or potential impact on design. Editing a video to insert pauses is not difficult. However, some video topics convey lots of visual information under a total runtime constraint, making a blanket requirement of extended audio description infeasible.

As users and advocates, we should not be shy about making the case for going beyond AA, especially for content where it's barely more difficult.

bruce-usab commented 11 months ago

As users and advocates, we should not be shy about making the case for going beyond AA, especially for content where it's barely more difficult.

+1 to this

As much as I would love to swing the WCAG AA hammer for the benefit of users, this discussion convinces me we must interpret "added to the soundtrack" to mean "added to the existing soundtrack," constrained by its existing pauses and ducking opportunities.

Yes. I agree that the word "added" is confusing, since it might merely refer to "non-stop existing dialog". It can be the case that no voice-over is added and yet the video passes SC 1.2.5.

bruce-usab commented 11 months ago

@guyhickling @mbgower how about a Failure technique describing a video with significant "decorative" non-informative background music, where the author has not provided a version utilizing audio ducking and narration during the music? Would that resolve this question and issue?

guyhickling commented 10 months ago

No, it wouldn't solve it at all. My purpose in raising this question was to point out that blind people are being let down by the WCAG here. (Unusually so, since the WCAG is usually much better than that - this is a single example of less than usual quality.) A majority of videos, especially educational ones dealing with complex subjects and containing lots of images, just do not have enough free space (whether silent moments or just music that could be ducked) to add audio descriptions of those images. (And where spaces exist, they are likely to be in the wrong place, i.e. far away from the image that needs the description.)

My aim was to try to persuade people to think of change, so that it could become part of the WCAG at AA level to provide description of all important stuff shown on screen. It is discrimination, pure and simple, to say to a blind person, "Sorry, pal, there are no spaces in the sound track, so we are not going to tell you what's shown in the video so you just clear off!" - which is effectively what we are saying at the moment.

There are real people, blind people, being left in the lurch here, and we in the accessibility community, who should be helping them, are refusing to do so simply because of the way an SC was written in the WCAG fifteen years ago. The museum videos I mentioned in the initial post were real videos that are totally useless to blind people - and that museum is just one example of the kind of video that is all too common.

This question is not solved by saying "that's the way the WCAG was written", or by saying the technology didn't exist when the SC was written. Nor is it solved by having debates over word interpretations, or by adding a failure technique that merely supports the existing poor outcome. And it isn't solved by ducking other content, which just shifts the problem elsewhere, meaning blind people then won't hear some of the actual audio track or the information contained in it! And it is certainly not solved by the existence of the AAA SC1.2.7 - as I have said many times before, almost no one applies AAA level in their work. The accepted standard recommended for use is AA, and most countries' legal systems reference AA, not AAA. Anything at AAA level is just in a sort of limbo that almost nobody looks at. In the dozens and dozens of audits I've done, I have only been asked to audit to AAA level 2 or 3 times.

I want us to change course, and start looking for a real solution, one that will let blind people hear, described, everything shown in a video.

mbgower commented 2 months ago

Draft Working Group Response

The issue opener makes a number of contestable statements, and concludes with a request that cannot be undertaken within the scope of the WCAG 2 Task Force. This response will attempt to address the key points somewhat succinctly.

Statement 1

The SC quite clearly states "Audio description is provided for all pre-recorded video content ...". It specifically says "all", without any caveats. That is the normative requirement.

What is meant by "all"?

It is important to emphasize that what is meant by "all" in this sentence is contentious in this discussion. Does it mean that all videos that are prerecorded need audio descriptions, or does it mean that all the visual content in the videos needs to be included in the audio descriptions? The majority of responders say it is the former.

There is a separate AAA 1.2.7 Extended Audio Descriptions, which addresses situations "Where pauses in foreground audio are insufficient to allow audio descriptions to convey the sense of the video." Logically, if a AAA requirement is specifically created to fill a gap where "standard" audio descriptions are insufficient, not every video can be fully described under the AA requirement alone. It is clear that the "all" must be discussing all videos, not all video content; otherwise, there would be no case for a AAA requirement.

The working group confirms that the AA SC requires that all videos have audio descriptions, not that every piece of visual information needs an audio description.

(Note: There are two exceptions to "all" videos: 1) those where there is no additional info conveyed visually, as per note 3, quoted in Statement 2 below; 2) and those where "the media is a media alternative for text and is clearly labeled as such," in the normative text of the SC.)

Statement 2

Unfortunately the Understanding document then tries to alter that by claiming that this audio provision is to be done "During existing pauses in dialogue....". The informative document effectively tries to restrict the application of the normative SC

This is not true. The restriction of audio descriptions to pauses in dialogue does not arise solely in the Understanding document. The normative text for 1.2.7 (quoted above in response to statement 1) reinforces that 1.2.5 is concerned with "pauses in foreground audio". In addition, note 2 of the definition of audio description makes the objective of "pauses in the dialogue" explicit (I have left out the irrelevant 4th note):

audio description

narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone

Note 1: Audio description of video provides information about actions, characters, scene changes, on-screen text, and other visual content.

Note 2: In standard audio description, narration is added during existing pauses in dialogue. (See also extended audio description.)

Note 3: Where all of the video information is already provided in existing audio, no additional audio description is necessary.

The working group confirms that audio descriptions in 1.2.3 and 1.2.5 refer explicitly to narration added to existing pauses in the dialogue.

Issue Request

that a green Note be added to the Understanding doc for 1.2.5. It need not contradict what is already said in the Understanding document. I recommend a wording something like: Where there is not sufficient space in the existing audio track to insert a particular needed item of audio description, space should be created to allow the insertion. This can be done by freezing the currently displayed video frame for the duration of the audio insertion, or by inserting additional video frames if preferred. This will, of course, make the AAA SC1.2.7 almost redundant since it says a similar thing.

This would expand the scope of 1.2.5, as the issue opener notes, making it the same as 1.2.7.

Conclusion

Adding a note that creates a new requirement at the AA level involves a class 4 substantive change, as defined by the W3C Process. Such changes are outside the mandate of the Task Force. As the issue opener observes, this would extend the scope of the AA requirement, essentially making it the same as the AAA requirement.

"Audio ducking," which is brought up several times in the thread (not by the issue opener), is not a technique that is captured in the SC, nor is it mentioned in any of the existing documentation. A new technique providing it as a way to meet some of the audio description SCs would be welcome, and a new issue will be opened to create one. However, such a technique could potentially be contested as failing 1.2.5, which through the definition quoted above specifically addresses providing descriptions in the pauses in dialogue. Audio ducking would likely need to be scoped to lowering non-dialogue audio, in order to be a sufficient technique for 1.2.5.

mraccess77 commented 2 months ago

I worry about audio ducking reducing the volume of audio and playing two audio tracks at once for those who are blind/low vision who are also hard of hearing. I would think that any non-overwritable ducking should be limited to non-spoken audio or important sounds.

mbgower commented 2 weeks ago

Since we are returning to this issue in the Task Force, I thought I would summarize this very long thread to identify the issues raised and the resulting actions.

  1. Original issue critiquing the interpretation of 1.2.5 and suggestion to modify its wording (and by extension, its scope). ACTION: A draft response has been written that addresses the original issue point by point, which will be reviewed by the Task Force. If the TF agrees with the sentiment, it will be presented to the Working Group for approval.
  2. Comments about the lack of qualitative measures and subjectivity have been captured in WCAG 3 parking lot considerations from 2.x. I have also specifically called out some of @OwenEdwards' comments on technical solutions.
  3. The side conversation about lowering the sound track to add descriptions has been captured in a new audio ducking issue.
  4. The side conversation on whether a video with no breaks in the dialogue can meet Audio Descriptions is currently being worked on in https://github.com/w3c/wcag/pull/1790