w3c / wcag

Web Content Accessibility Guidelines
https://w3c.github.io/wcag/guidelines/22/

Definition of video-only and synchronized media #1908

Open ebalink opened 3 years ago

ebalink commented 3 years ago

Video criteria rely on interpretation of whether a media content in question is video-only or synchronized.

If a (prerecorded) video is regarded as video-only content, a manuscript (an alternative for time-based media) would be a sufficient solution (criterion 1.2.1) for AA level. But if the video is seen as including audio content, it is regarded as synchronized media and the content needs to include both captions and audio description.

The definition of video-only is quite brief and, read literally, means that having any kind of audio track makes the content synchronized media, so audio descriptions need to be added for all visual content in the video. However, in practice many (e.g. promotional) videos present the actual information in a video form only and their audio track contains merely some background music. Of course, background music can provide relevant information (an example being a movie scene where music defines the atmosphere - e.g. funny, intense, scary...) but there are also cases where background music doesn't provide any actual information. In these cases adding an audio description means that the audio description merely partly overpowers the original audio content which was the only factor defining the need for audio description. Even though the video and audio contents are synchronized in time in a technical sense (by file format specs), the actual information they provide is not. The information does not change, and the user might not even notice any loss of synchronization if the timing of one of the tracks is shifted a few seconds in either direction.

In those cases it's quite uncertain whether adding an audio description, and so providing the information in the video in a time-based format only, produces a more accessible solution than an alternative for time-based media.

The definition of video-only has been specified in the understanding of SC 1.2.1 with a phrase "An example of pre-recorded video with no audio information or user interaction is a silent movie. ". This description should be clarified and brought into the video-only definition where it can be easily found. (If I've understood correctly,) a common assumption is that a mere background music doesn't make video as synchronized media. Therefore I'm assuming that the existing reference to silent movies refers to an idea of videos having background music but no dialogue. (although, as mentioned above, a background music may provide significant information in silent movies)

I suggest that the definition of video-only be updated to specify the term "no audio" as "no audio that provides relevant information to understand the presentation".

patrickhlauke commented 3 years ago

interesting/related discussion on WebAIM list a while ago on this topic https://webaim.org/discussion/mail_thread?thread=9802

ebalink commented 3 years ago

My opinion is close to Steve Green's, although from a point-of-view where the video is the one that provides the relevant information.

WCAG doesn't suggest we should stick to fully accessible text-based content. Video, audio, colors, visualizations etc. do make content more accessible to many people. The same goes for adding background music to video content. It can make the content more approachable and pleasant for some people, compared with 5 or 10 minute videos with no sound at all. But it doesn't add any more information to the video. Therefore I'd think a comparison to video-only content would have stronger grounds.

It should be noted that an audio description should also have a clear voice, so adding an audio description afterwards might result in a need to clearly lower the volume of the original background music. So in many cases the result would probably not be anywhere near an audio play or something where background music and speech exist in harmony. Should the background music volume therefore go up and down between the audio descriptions? If the video is e.g. promoting a city landscape or objects in a museum, with lots of content to describe, it might be more worthwhile to just drop the background music completely instead. And then it would actually be just a video-only file with an additional audio description track.

I'd also like to point out that for video-only content, a transcript or a separately made audio recording (describing the visual content) might also be a more suitable solution. Then users with no or limited vision would get the information at a speed optimized for that information, and not one based on how long the visual content is shown.

Of course it would be optimal if service providers thought about captions and audio descriptions while planning their video content, but as WCAG has been incorporated into legislation around the globe we also need to think about cases where the requirements have to be satisfied afterwards.

mraccess77 commented 3 years ago

I'd add that just because there is audio doesn't mean it is actually synchronized with the video. Many times the audio may be deliberately set to play alongside the video but is not specifically synched. So having an updated note would help in these situations where a transcript might be more usable than adding audio description to meet Level AA.

patrickhlauke commented 3 years ago

i may be wrong, but i suspect the original intent when the "synchronised" word was used was more a generic, but very loose, way of saying "happens at the same time/simultaneously", rather than a very specific "it synchs up with specific timecodes/actions"

ebalink commented 3 years ago

Based on the quite technical language used in WCAG I would assume the same as @patrickhlauke. My point, however, is that the criterion was suitable when WCAG was used only on a voluntary basis, as guidelines. It wasn't very critical if a service provider didn't follow the criterion, but at least service providers and authors could use it as a good suggestion for what to aim for. Today, as the legal grounds are quite different and the usage of time-based media has increased and diversified, we need to be more precise about what we actually require from service providers and authors in different situations. In my opinion we'd have good grounds for updating the interpretation to mean situations where the information of the video and audio tracks is synchronized, and not just the technical "containers" of their binary data.

detlevhfischer commented 3 years ago

Just adding that there can be situations where a synched audio description would be helpful with video-only content. Think of a video showing how a cardboard box is folded / put together. If you see well, the visuals may be sufficient; if you cannot see details clearly (like numbers on flaps or the like), the visuals will give a broad idea and the audio will help make sense of it all.

ebalink commented 3 years ago

I agree, @detlevhfischer. A good correction to my previous comment about how a transcript might be more helpful than an audio description. Still (regarding the original topic): background music isn't a relevant factor for the issue.

detlevhfischer commented 3 years ago

@ebalink I fully agree that it would be good to clarify that the common generic muzak in videos that carry information only in the visuals need not be considered synchronized audio even where it technically is.

bruce-usab commented 3 years ago

@ebalink , I will take the liberty to provide you the same sort of frank and candid guidance I would provide if this question was posed to me by someone from a U.S. Federal agency. (Answering these sort of technical assistance questions is part of my day job.)

> Video criteria rely on interpretation of whether a media content in question is video-only or synchronized.

You are trying to parse the requirement in a way that is not really credible. You are asking, essentially, is a video-with-an-audio track video-only? The answer, of course, is no.

So then the question becomes: If the audio does not include any spoken dialog and is entirely non-informational, can I (at AA) just post a transcript?

The answer is clearly no, because there is nothing in the WCAG 2.x SC text which provides for that sort of exception.

At WCAG 2.x Level AA, the site owner has two options:

  1. Include a version with Audio Description.
  2. Remove the audio track from the video.

> However, in practice many (e.g. promotional) videos present the actual information in a video form only and their audio track contains merely some background music. Of course, background music can provide relevant information (an example being a movie scene where music defines the atmosphere - e.g. funny, intense, scary...) but there are also cases where background music doesn't provide any actual information. In these cases adding an audio description means that the audio description merely partly overpowers the original audio content which was the only factor defining the need for audio description.

Audio description complements the default/original audio content. Your characterization of overpowers is not accurate. For example, please see the ACB The Audio Description Project page, including the linked example.

> The definition of video-only has been specified in the understanding of SC 1.2.1 with a phrase "An example of pre-recorded video with no audio information or user interaction is a silent movie.".

I would argue that an old-timey black-and-white (sepia) movie with a piano (only) audio track is not a silent movie!

> This description should be clarified and brought into the video-only definition where it can be easily found.

Please suggest edits that would have clarified for you that 1.2.5 is applicable to videos (with sound) even when the audio track does not include spoken dialog (or other informative audio).

> (If I've understood correctly,) a common assumption is that a mere background music doesn't make video as synchronized media. Therefore I'm assuming that the existing reference to silent movies refers to an idea of videos having background music but no dialogue. (although, as mentioned above, a background music may provide significant information in silent movies).

I understand people making that assumption. I do not agree that it is a correct inference.

The tension is that many people (end-users) would prefer a good transcript over an audio-described version. At Level AA, WCAG 2.x does not allow for providing a good transcript instead of audio description, not even in the case where the audio track does not contain dialog.

My recommendation is for you to ask that the owner remove the audio track from the video. Then the video really will be video-only, and a transcript would be sufficient for WCAG 2.x Level AA conformance.

My expectation is that such a recommendation will not be appealing. (E.g, But we put so much time into that background music!)

All the reasons why removing the audio track is not acceptable, are also reasons why you need a version with audio description!

ebalink commented 3 years ago

I have to disagree, @bruce-usab .

> You are trying to parse the requirement in a way that is not really credible. You are asking, essentially, is a video-with-an-audio track video-only? The answer, of course, is no.

It depends on what we mean by "video". I'd think there are also good grounds for the interpretation that in some cases in WCAG the term refers to the information, not to the binary data format. And based on discussions in different accessibility forums over the years, I'm not the only one who's seen it this way.

WCAG also instructs that the audio description should be added to "existing pauses in dialogue". This guidance hints that a need for audio description is expected to exist in situations where the audio track is assumed to contain dialogue or otherwise relevant information. At least it can be assumed that the criterion was made with a very different selection of video content (than today's) in mind, so it's legitimate to open it up for interpretation. The same applies to other legislation.

In the definition of captions, captions are needed for "audio information needed to understand the program content". The example mentions music, but e.g. in the non-text content criterion it's said that a sufficient text alternative for content providing a sensory experience would be a description that merely identifies the possible music in the media. It's also suggested that non-text content that has a purely decorative purpose should not be described in the text alternative. In this case the background music would "serve only an aesthetic purpose, providing no information, and having no functionality" - which matches the definition of pure decoration. So in many cases, evaluating the relevant information in the content is something that WCAG instructs us to do.

> Audio description complements the default/original audio content. Your characterization of overpowers is not accurate. For example, please see the ACB The Audio Description Project page, including the linked example.

I am familiar with movies with audio descriptions. I don't question the need nor the suitability of audio description in these. My question was about situations where the whole audio track is only a background music that doesn't provide any information by itself nor in combined to the information of the video track.
The argument that audio description wouldn't considerably disturb the information in the audio track in the case of movies and their foley sounds doesn't apply in situations where the audio track consists of music. If we regard that music as the only relevant information in the audio track then it's quite clear that adding audio descriptions of the visuals would disturb that auditory experience.

(At this point it should be noted that fulfilling this requirement will be a whole lot easier in the future, when we'll have a larger selection of web video players enabling an additional audio description track or other options to toggle an audio description on and off. But currently that is not possible, and the audio description requirement in these cases is met by creating an alternative version of the video with the audio description on top of the audio track - which results in the problems I mentioned.)

My suggestion is that the definition of synchronized media in WCAG is updated e.g. by adding a note: "Audio and video tracks and other content in media are regarded as being a part of synchronized media only if they contain information that is needed to understand the media content or have a temporal connection to information provided in other content or another content track." (or something similar)

I myself am a senior officer in the web accessibility supervision of Finland. Our aim is, of course, to improve the accessibility of web services in the public, private and third sectors, but at the same time it needs to be clear what it is that the law requires from service providers and what they need to do to their pre-existing content. (Our legislation refers to the EN 301 549 v. 2.1.2 standard, which refers to WCAG 2.1.)

It's clear that WCAG will not and cannot keep up with the fast evolution of web and other ICT services. So in many cases we need to do a case-by-case evaluation "in the spirit of WCAG" to end up with an at least somewhat accessible solution. Providing guidance that leaves it open to interpretation what is actually relevant information results in a better outcome than an attempt to create guidance with stricter boundaries (assuming that the latter would exclude more content from the scope of WCAG).

I do understand that some might see my suggestion as an attempt to create loopholes in WCAG, but that is not my intention. If the audio track does contain any relevant information at any point, the content would be regarded as synchronized media and it would require captions and audio description. But if it doesn't, adding audio description (instead of providing a transcript) isn't an improvement to the content's accessibility.

mraccess77 commented 3 years ago

The definitions of video only and audio only are:

video-only: a time-based presentation that contains only video (no audio and no interaction)

audio-only: a time-based presentation that contains only audio (no video and no interaction)

https://www.w3.org/WAI/WCAG21/Understanding/audio-only-and-video-only-prerecorded.html#dfn-audio

Which means anything with both is not in the "only" category. That in turn means that for SC 1.2.2-1.2.7+ we need to look at the definition of synchronized media:

synchronized media: audio or video synchronized with another format for presenting information and/or with time-based interactive components, unless the media is a media alternative for text that is clearly labeled as such

(https://www.w3.org/TR/UNDERSTANDING-WCAG20/media-equiv-captions.html#synchronizedmediadef)

It is then up to how you interpret "presenting information". If it's taken broadly then a music track with video is synchronized. If taken more narrowly, it could be different.
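
To make the two readings concrete, here is a minimal Python sketch of the classification. It is only an illustration of the argument in this thread, not anything WCAG defines: the `Presentation` type, the `audio_is_informative` flag, and the `broad_reading` switch are all hypothetical names, and the "media alternative for text" exception in the definition is left out.

```python
from dataclasses import dataclass
from enum import Enum


class MediaKind(Enum):
    VIDEO_ONLY = "video-only"
    AUDIO_ONLY = "audio-only"
    SYNCHRONIZED = "synchronized media"
    UNCLASSIFIED = "unclassified"


@dataclass(frozen=True)
class Presentation:
    has_video: bool
    has_audio: bool
    has_interaction: bool = False
    # Hypothetical flag: does the audio carry information needed to
    # understand the content? WCAG itself defines no such property.
    audio_is_informative: bool = True


def classify(p: Presentation, broad_reading: bool = True) -> MediaKind:
    """Classify a time-based presentation under one of the two readings.

    broad_reading=True: any audio track counts as audio.
    broad_reading=False: only informative audio counts (the narrow
    reading proposed in this thread, an assumption, not WCAG text).
    """
    audio_counts = p.has_audio and (broad_reading or p.audio_is_informative)
    if not p.has_video and not audio_counts:
        return MediaKind.UNCLASSIFIED
    if (p.has_video and audio_counts) or p.has_interaction:
        return MediaKind.SYNCHRONIZED
    return MediaKind.VIDEO_ONLY if p.has_video else MediaKind.AUDIO_ONLY


# A promotional video whose only audio is background music.
promo = Presentation(has_video=True, has_audio=True, audio_is_informative=False)
print(classify(promo, broad_reading=True).value)   # synchronized media
print(classify(promo, broad_reading=False).value)  # video-only
```

The same input lands in different categories depending only on the reading chosen, which is exactly the disagreement here.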

Unfortunately a transcript could actually be more accessible than audio description in some situations, yet this is not required at all by AA. The best way to solve the need for transcripts is to add a new criterion requiring transcripts for synchronized media at A/AA.

bruce-usab commented 3 years ago

> I have to disagree, @bruce-usab .

@ebalink you are putting a great deal of effort into rationalizing why a plain reading of the SC text does not say what it says.

WCAG 2.x SC 1.2.5 requires that a video with a sound track is available in a version that includes audio description.

Again, please suggest edits (probably to the Understanding document, since the normative text seems okay) that would have clarified for you that 1.2.5 is applicable to videos (with sound) even when the audio track does not include spoken dialog (or other informative audio).

I have no doubt that others have gone down this particular rabbit hole. Can you and I work together on improving the advisory materials?

> But if it doesn't, adding audio description (instead of providing a transcript) isn't an improvement to the content's accessibility.

Finland, and anyone else tuned into accessibility enough to realize that a good transcript (i.e., something close to a screenplay) is better than audio description can, of course, provide the transcript in addition to the audio described version. It is not like WCAG 2.1 Level AA precludes sites from including transcripts!

> My question was about situations where the whole audio track is only a background music that doesn't provide any information by itself nor in combined to the information of the video track.

WCAG 2.x does not try to parse content that finely. A video having a soundtrack is a yes/no objective question. A conclusion that a non-dialog soundtrack is of no informational value is much more subjective.

> My suggestion is that the definition of synchronized media in WCAG is updated e.g. by adding a note...

The ink is not quite yet dry on 2.2, so I will try and format this as a PR. FWIW, I do not expect such a suggestion to get traction. OTOH, I have been wrong before!

> If we regard that music as the only relevant information in the audio track then it's quite clear that adding audio descriptions of the visuals would disturb that auditory experience. [emphasis added]

This is not correct because properly done audio description integrates with the on-screen visual experience. Audio description has a good deal of art and theater to it. You cannot use bad audio description to demonstrate that audio description is not useful.

From audio description [emphasis added]:

> narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone
> In standard audio description, narration is added during existing pauses in dialogue
> Audio description of video provides information about actions, characters, scene changes, on-screen text, and other visual content.

Having the option for a version of a video which includes narration on top of the background music is a feature which improves accessibility. (Again, the point that maybe a transcript would provide better accessibility is not relevant to the requirement from 1.2.5 that audio description be provided for videos with sound.)

ebalink commented 3 years ago

The way I see this, @bruce-usab, is that both your and my interpretations of the WCAG terms in this question can be accurate. It is up for debate what the actual scope of the term "audio" is in this context. You and I have both given good examples to back up both interpretations. As ICT services and the use of video content on the web have evolved so much, and the guidelines are a part of normative legislation around the globe, it is crucial that this part of the guidelines is clarified - but with a re-evaluation and not just through secretarial work that relies on a dictionary definition.

We are not talking only about decisions on how we instruct different governmental or municipal authorities to provide information on their web sites. If that were the case, it would not be a problem for governments to decide whatever policies they want for their communication units. But we are also talking about legal consequences for third and private sector organizations if they don't follow these rules. So with web accessibility legislation we interfere with their autonomy in deciding on their own internal and external communication and marketing policies and allocation of resources. In many cases visual and audio content is used for marketing purposes to create a desired atmosphere only. That part of the content might be very crucial for the author's objectives even though it is not necessarily convertible to an accessible format.

A good audio description has "a good deal of art and theater to it", as you said, and would not be a problem for professional movie and TV companies, but would be an unreasonable requirement for the majority of parties that produce video content for the web.

Therefore the legislators and authorities that do the interpretations need to be very careful that the requirements are reasonable in relation to their hoped-for effects on human rights, the common good etc. In my opinion, relying on your very technical point of view about the term does help to make the web accessibility legislation more precise and enables evaluating it with automatic tools - but it also extends the accessibility requirements beyond their proper scope. From service providers and authors it would require lots of work that doesn't actually result in noteworthy improvements in content accessibility. The web accessibility requirements should help to make the web more equal and accessible without obstructing innovation or progress, or causing service providers or authors to remove content from the web or reduce producing it. And I think the strict interpretation you suggest would result in exactly that, and it would create a barrier for other organizations to voluntarily apply WCAG.

As WCAG is also permissive on CAPTCHA, it should take great care to remain a suitable reference for different legislation, so that there is less need to add exceptions to the legal documents.

Therefore I don't suggest updating the guidelines in a way that would define synchronized media as a mere combination of video and audio tracks. And therefore the more permissive update should be included in the normative parts (and not in the Understanding documents).

One solution would be to update the A- and AA-level requirements (1.2.2, 1.2.3, 1.2.4 and 1.2.5) to this more permissive format (relying e.g. on a term such as "relevant audio information") and to add a new AAA-level criterion (1.2.10) that would require captions and audio descriptions whenever content includes any visual and audio content, e.g. tracks.
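
As a rough illustration of what that would change for a prerecorded video, here is a small Python sketch. It is a simplification under stated assumptions: it only covers the prerecorded criteria named in this thread (1.2.1, 1.2.2, 1.2.3, 1.2.5), it omits the existing AAA criteria, the function name and the `audio_is_relevant` parameter are hypothetical, and "1.2.10" does not exist in WCAG; it is only the number suggested above.

```python
from typing import Dict, List


def criteria_for_prerecorded_video(
    has_audio_track: bool,
    audio_is_relevant: bool,
    proposed: bool = False,
) -> Dict[str, List[str]]:
    """Map a prerecorded video to the criteria discussed in this thread.

    proposed=False follows the plain reading of WCAG 2.x: any audio
    track makes the content synchronized media.
    proposed=True follows the suggestion above: A/AA criteria look only
    at relevant audio, and a hypothetical AAA criterion covers the rest.
    """
    if not has_audio_track:
        # Genuinely video-only under either reading.
        return {"A": ["1.2.1"], "AA": []}
    if proposed and not audio_is_relevant:
        # Assumption: decorative audio is ignored at A/AA, so the
        # content is treated like video-only there.
        return {"A": ["1.2.1"], "AA": [], "AAA": ["1.2.10 (proposed)"]}
    result = {"A": ["1.2.2", "1.2.3"], "AA": ["1.2.5"]}
    if proposed:
        result["AAA"] = ["1.2.10 (proposed)"]
    return result


# Video with background music only, current reading vs. proposal.
print(criteria_for_prerecorded_video(True, False))
print(criteria_for_prerecorded_video(True, False, proposed=True))
```

Under the current reading the background-music video needs audio description at AA (1.2.5); under the proposal that obligation would move to the new AAA criterion.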

(sorry for a long comment)

ShadowBB commented 2 years ago

I agree with @ebalink (And Steve Green) and initially interpreted the definition of "synchronized media" ("audio or video synchronized with another format for presenting information...") in such a way that merely "decorative audio" provided no synchronized information and thus was "video-only". The "video-only" definition of "a time-based presentation that contains only video (no audio and no interaction)" I always interpreted that the decorative audio wasn't "contained in the presentation" because it was not presenting information. This could be caused by English not being my first language.

I clearly saw the use for audio description being synchronized with the rest of the informative audio but saw a text transcript as a far superior solution if no informative audio was available to synchronize the audio description to.

In the same vein I always saw closed captions being synchronized to informative video but saw a text transcript as a far superior solution if no informative video was available to synchronize the closed captions to.

I admit that I thought this was caused by an oversight in the definitions of video-only and audio-only when compared to the definition of "synchronized media" because otherwise there would have been a gap where something wasn't any of the 3 and no SC would have been relevant!

I am not an expert on what would help end users the most, but I think that if we use the interpretation of @bruce-usab there are scenarios where WCAG is advocating a suboptimal solution for some users instead of merely removing barriers to information.

My suggestion would be to add a note to the definition of synchronised media that clarifies that adding "decorative audio" or "decorative video" doesn't make it synchronized media because that is not presenting information. I would also suggest adding a note to "audio-only" and "video-only" to make sure those definitions cover anything that isn't synchronized media.