Image descriptions for composite or montage images in fixed layout EPUB.

Hi DAISY

I work with lots of illustrated publishers who are creating highly designed books for print. Putting aside the debate as to whether they should make fixed layout EPUB… I have a question about how best to advise them on adding alt text and extended image descriptions when they use composites and montages of images.

Very often a designer will design one full bleed image for the entire background of the spread (so one image across two XHTML pages in FXL EPUB). There are good reasons for this: they want to use professional Photoshop features to add effects and filters that are not available in the page layout apps. They want to merge images together using opacity and blend modes. They want to tie images together, so that each spread can be a work of art. It simplifies their workflow and separates editorial work from design.

What is the best advice to give them to help with accessibility when faced with a single image that is placed on the spread? See some examples attached below.

As I see it, the options are:

Use no image descriptions, set as background images
Add one description that is used to describe all the imagery on the spread (where would that be placed in the reading order?)
Break up the large images into smaller tiled images, just so we can describe them in sections and drop into the reading order at the appropriate position
Add extra invisible objects into the reading order that are only there to hold the descriptions?

Does anyone else have any other options that they can see?

Happy to hear of any experience, suggestions or advice!

Supersports

Hi @CircularKen,

There's rarely a one-size-fits-all answer to these kinds of problems. So long as the same information is available and included at a logical point in the reading order, the specific method you use is generally not an issue. Ideally, the method will accommodate the greatest number of users (i.e., not be targeted only to users of assistive technologies), but the practical limitations of publishing often get in the way.

That said, setting the images as backgrounds using CSS is almost always the worst approach, unless there really is no information the reader needs. Even if you can find another way to describe the purpose of the images and get AT to read them, it's probably going to be confusing.

If I were in your position, I'd look at the tiling option. If you can do it, it provides the greatest ability to insert the images and descriptions in their most appropriate location in the DOM, as I think all these sub-images could benefit from descriptions. In the case of the first sample, you might also want to look at creating a single extra-wide spread to avoid the headache of a split page (e.g., using the rendition:page-spread-center FXL property on the spine item). You can always wire up hidden descriptions using the aria-describedby attribute if it's not possible to provide a visible option.
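
To make the tiling idea a bit more concrete, here is a minimal sketch of one extra-wide fixed-layout page. The artwork is cut into tiles that are absolutely positioned so they reassemble visually, while each tile sits in the DOM next to its description. The file names, pixel sizes, ids, and the two-tile split are all hypothetical. The alt and description wording is borrowed from examples elsewhere in this thread. The comment at the top assumes the spine item carries the page-spread-center property mentioned above.

```html
<!-- Assumed spine entry in the package document (idref is hypothetical):
     <itemref idref="supersports" properties="rendition:page-spread-center"/> -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
    <title>Supersports</title>
    <!-- Fixed-layout pages declare their size here; 2400x1600 is a made-up value -->
    <meta name="viewport" content="width=2400, height=1600" />
    <style>
        body  { position: relative; width: 2400px; height: 1600px; margin: 0; }
        .tile { position: absolute; top: 0; width: 1200px; height: 1600px; }
        /* Hypothetical coordinates: the two tiles butt up against each other */
        #tile-archer   { left: 0; }
        #tile-bullseye { left: 1200px; }
    </style>
</head>
<body>
    <!-- Each tile is a real img, so it can carry its own alt and description -->
    <img id="tile-archer" class="tile" src="tile-archer.jpg"
         alt="An archer stands on a floating platform with his bow drawn."
         aria-describedby="archer-desc" />
    <!-- Hidden from display, but still picked up through aria-describedby -->
    <p id="archer-desc" hidden="hidden">His bow arm is enhanced with an exoskeleton.</p>

    <img id="tile-bullseye" class="tile" src="tile-bullseye.jpg"
         alt="A bullseye is held aloft by a drone."
         aria-describedby="bullseye-tile-desc" />
    <p id="bullseye-tile-desc" hidden="hidden">A chat bubble that reads, 'BULLSEYE!' points to the center.</p>
</body>
</html>
```

The trade-off is that tile boundaries have to follow the points of interest rather than a regular grid, which may or may not be practical for a given design.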

Image maps are sometimes suggested for complex images, but I don't think they'd work well in these cases. Maybe only for the first example, but even then it wouldn't be ideal.

The W3C publishing community group is supposed to be forming a task force to look at these kinds of issues with fixed layouts. I'm curious myself to see what recommendations they come up with, as to be perfectly honest fixed layouts are not something I cross paths with very often.

The knock on them, of course, is that they can be hard to impossible to make broadly accessible per WCAG requirements. It's not that you shouldn't make them, and I'm sure there are cases where they can be done reasonably well, but when you have a legislative requirement to be accessible there are times we just can't offer solutions to the more intractable problems. Our goal is to try and make whatever publishers produce accessible, though.

Any other thoughts on this @clapierre @GeorgeKerscher @avneeshsingh ?

I am not sure how practically applicable the guidance at DIAGRAM Center is, but they have a couple of complex examples:

http://diagramcenter.org/specific-guidelines-c.html#32

http://diagramcenter.org/specific-guidelines-c.html#33

Thanks @danielweck. That information would be helpful for creating the descriptions for these complex images. Now we just need to figure out the best way to encode this in an EPUB. @sh0ji may have some ideas here as well, as W. W. Norton has been working through some of these issues for their own publications.

I agree with what @mattgarrish said above. I think the ideal solution would be one short alt-text description summarizing the entire image, plus an extended image description at the end of the chapter or book in which each part of the image gets its own section. Breaking it up into multiple separate images could work too, but it might make the overall reading experience a little fragmented, since the idea is to combine multiple images into one full image.
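
A rough sketch of that approach, assuming a separate descriptions document at the end of the book. The file names `spread.xhtml` and `descriptions.xhtml`, the ids, the headings, and the link text are hypothetical. The description wording is borrowed from examples elsewhere in this thread.

```html
<!-- On the spread (hypothetical spread.xhtml): one short alt plus a link
     to the extended description -->
<img src="scene.jpg"
     alt="A futuristic stadium shows archers, runners, and swimmers competing with hi-tech equipment" />
<p>
    <a id="supersports-ref" href="descriptions.xhtml#supersports-desc">Extended description of the Supersports spread</a>
</p>

<!-- In descriptions.xhtml (hypothetical file at the end of the book):
     one section per part of the image -->
<section id="supersports-desc" aria-labelledby="supersports-desc-h">
    <h2 id="supersports-desc-h">Supersports: extended description</h2>
    <section>
        <h3>Archer</h3>
        <p>An archer stands on a floating platform with his bow drawn. His bow arm is enhanced with an exoskeleton.</p>
    </section>
    <section>
        <h3>Bullseye</h3>
        <p>A bullseye is held aloft by a drone. A chat bubble that reads, 'BULLSEYE!' points to the center.</p>
    </section>
    <!-- A return link so readers can get back to where they left off -->
    <p><a href="spread.xhtml#supersports-ref">Return to the spread</a></p>
</section>
```

A visible link has the advantage of working for everyone, not only for users of assistive technologies, though it does need a place in the page design.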

Thanks for cluing me into this, @clapierre, it's an interesting question! We don't generally do fixed layout EPUB but we do have quite a few infographics like these, which have basically the same description problems.

I agree with @mattgarrish that the ideal solution is one where you don't have to duplicate anything—the same content in the same order for everyone. Tiling feels like the best way to do that to me as well, but that leaves a lot of room for interpretation/implementation.

I've not really found a single solution to this problem. It's more a set of techniques that you can use for different compositional methods. And the three examples you've given demonstrate a few different ways images and text are composed together.

"Supersports" uses text to annotate points of interest in one image.
- This requires a scene description + descriptions for each point of interest in the scene.
"Good luck" is just 8 images with captions. Their composition doesn't convey any meaning.
- This doesn't require a scene description.
"Bug parents" uses a combination of the two techniques:
- The first section is 6 images with captions
- "Cicada life cycle" is an annotated flow chart

The one commonality seems to be the existence of a heading and introductory text, which I would expect to come first in the reading order for all three. If a scene description is necessary, that would come next, and then a description of the points of interest, possibly in a named list that makes it more apparent that they're related to the scene.

Supersports

There's a lot of information conveyed through proximity, and I'd generally call this type of textual information annotations on points of interest in the scene. Here's what I would require, in order:

A heading and introductory text.
A short description of the entire scene, mentioning but not detailing any of the specific points of interest that are annotated.
A description for each of the points of interest, associated with the textual annotation through some mechanism.
- Some of the points of interest are related and should be adjacent to each other (archer + bullseye), but order isn't otherwise conveying meaning.

A fragment of it in HTML might look like this, which could all be laid out over the image:

<h1>
    <div>Supersports</div>
    <div>Record-smashing science</div>
</h1>
<p>With hi-tech equipment sports stars of the future...</p>
<img src="scene.jpg" alt="A futuristic stadium shows archers, runners, and swimmers competing with hi-tech equipment" />
<!-- example 1: use a figure + figcaption to associate the annotation & image -->
<figure>
    <img
        src="archer.jpg"
        alt="An archer stands on a floating platform with his bow drawn. His bow arm is enhanced with an exoskeleton."
    />
    <figcaption>An exoskeleton arm sleeve could help this archer fire with extra strength.</figcaption>
</figure>
<!-- example 2: use aria-describedby to associate the annotation with the image -->
<div>
    <img
        src="bullseye.jpg"
        alt="A bullseye is held aloft by a drone. A chat bubble that reads, 'BULLSEYE!' points to the center."
        aria-describedby="bullseye-desc"
    />
    <!-- aria-hidden to avoid double-speak (this is just my preference) -->
    <div id="bullseye-desc" aria-hidden="true">Drones could hold targets in the sky.</div>
</div>
<!-- there are probably other techniques for this... -->

Good luck

This is the simplest of the three. I might do something like this:

<h1>Good Luck</h1>
<p>Traditionally people have looked to nature for good omens...</p>
<!-- (named list not required but helpful) -->
<ul aria-label="Good omens">
    <!-- example 1: image + text grouped by the li -->
    <li>
        <img alt="An illustration of a white bird flying into an open window" />
        If a bird flies into your house, then it will bring good luck with it...
    </li>
    <!-- example 2: figure -->
    <li>
        <figure>
            <img alt="An illustration of a weasel sitting on a roof" />
            <figcaption>
                In Germany to see a weasel sitting on the roof of a house is good luck.
            </figcaption>
        </figure>
    </li>
    <!-- there are probably other techniques for this... -->
</ul>
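
Bug parents

There is no sketch above for this one, so here is one possible way the two techniques could combine: a named list for the captioned images and, as an assumption rather than anything stated above, an ordered list for the flow chart so that the sequence of stages survives in the markup. The file names, the list label, and the heading level are hypothetical, and the "…" placeholders stand in for text from the spread that is not quoted in this thread.

```html
<h1>Bug parents</h1>
<p>…</p>
<!-- first section: captioned images, same pattern as "Good luck" -->
<ul aria-label="Bug parents">
    <li>
        <figure>
            <img src="bug-1.jpg" alt="…" />
            <figcaption>…</figcaption>
        </figure>
    </li>
    <!-- five more items follow, one per image -->
</ul>
<!-- second section: the annotated flow chart as an ordered list -->
<h2 id="cicada-h">Cicada life cycle</h2>
<ol aria-labelledby="cicada-h">
    <li>
        <figure>
            <img src="cicada-stage-1.jpg" alt="…" />
            <figcaption>…</figcaption>
        </figure>
    </li>
    <!-- one item per stage, in the order the arrows connect them -->
</ol>
```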

Thanks for your comments everyone.

Will it be best practice to produce an alternative textual output alongside the fixed layout EPUB?

The general preference here seems to be for tiled images, but unless they are to be used as part of an alternative textual output, I can't see a need or a use for the tiled images. In making a single FXL EPUB file as accessible as we can, wouldn't it therefore be acceptable to add extra invisible objects to the reading order that are only there to hold the descriptions and their locations?

This is my preference, as it would be easier to create and would mean less file processing, less complexity, and smaller files. It would also make it easier to add accessibility to already existing FXL EPUBs.

Something like:

<h1>
    <div>Supersports</div>
    <div>Record-smashing science</div>
</h1>
<p>With hi-tech equipment sports stars of the future...</p>
<img src="scene.jpg" alt="A futuristic stadium shows archers, runners, and swimmers competing with hi-tech equipment" />

<figure>
    <img
        src="invisible.png"
        alt="An archer stands on a floating platform with his bow drawn. His bow arm is enhanced with an exoskeleton."
    />
    <figcaption>An exoskeleton arm sleeve could help this archer fire with extra strength.</figcaption>
</figure>
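
If the invisible object also needs to carry the location of the thing it describes, one possibility is to position it over the matching region of the spread. This is only a sketch: the coordinates, sizes, and the `poi` class name are hypothetical, and `invisible.png` is assumed to be a fully transparent image.

```html
<!-- Hypothetical placement: the figure sits over the archer in the artwork -->
<figure class="poi" style="position: absolute; left: 180px; top: 420px; width: 520px; margin: 0;">
    <!-- Transparent image stretched over the region it describes -->
    <img
        src="invisible.png"
        style="display: block; width: 520px; height: 640px;"
        alt="An archer stands on a floating platform with his bow drawn. His bow arm is enhanced with an exoskeleton."
    />
    <!-- The visible annotation text, laid out over the artwork -->
    <figcaption>An exoskeleton arm sleeve could help this archer fire with extra strength.</figcaption>
</figure>
```

Keeping the figures in the source in the intended reading order matters more than their visual position, since assistive technologies follow the DOM rather than the layout.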

daisy / kb

Image descriptions for composite or montage images in fixed layout EPUB. #35
