Open saizai opened 3 years ago
Content-Text should contain that text, following the specifications in WHATWG HTML 5 § 4.8.4.4.
Editing fail. "That" refers to the immediately following "if" block.
This should also be connected with https://github.com/whatwg/html/issues/5890, which proposes to enable access to XMP (the standard for embedded asset metadata) by browsers. It even mentions previous efforts by the W3C to address the problem.
@lrosenthol Thanks for the cross link. I wasn't aware of XMP. Seems like a very good overlap, and https://www.w3.org/html/wg/wiki/Metadata specifically mentions the accessibility use case.
@dwsinger wrote on the w3c email thread:
You should probably be aware that we recently amended the base HEIF format (which is not tied to HEVC, and indeed lays under AVIF) to allow for intrinsic alt text(s) (possibly plural in multiple languages), as this enables the image creator to make that text at the time of file creation, and for it to travel automatically with the image.
Just for clarity -- HTTP does not use MIME, so updating those specs won't have any effect, and isn't necessary. All you need is a small spec defining a brand new HTTP header field. And, of course, people to implement it.
If you want to do that in the HTTP Working Group it should be easy to get it going. If not, please loop folks there in on reviews. For example, it's probably a good idea to make it a structured field since it's new, and you should look through this text too.
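As a rough illustration of the structured-field suggestion, the sketch below serializes a text value as an HTTP Structured Fields String (RFC 8941). The function name `serialize_sf_string` is made up for this example, and using the String type for this header is only an assumption; nothing in this thread settles the field's type. Note that a String only carries printable ASCII, so non-ASCII alt text would need a different encoding.

```python
def serialize_sf_string(text: str) -> str:
    """Serialize text as an RFC 8941 String (hypothetical helper)."""
    # RFC 8941 Strings may only contain printable ASCII (0x20 to 0x7E).
    for ch in text:
        if not (0x20 <= ord(ch) <= 0x7E):
            raise ValueError(f"character {ch!r} is not allowed in a String")

    out = ['"']
    for ch in text:
        # Backslash and double quote must be backslash-escaped.
        if ch in ('\\', '"'):
            out.append('\\')
        out.append(ch)
    out.append('"')
    return "".join(out)


# The header name is the one proposed in this issue, not a registered field.
print("Content-Text: " + serialize_sf_string("Azure Pipelines succeeded"))
```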
Recipient list
WHATWG: html, html-aam, html-aria
W3C WGs: html, apa, aria, webapps
IETF WGs: httpbis, 822ext
CC authors of (current) prior RFCs: Nathaniel Borenstein, Steve Dorner, Ned Freed, Ed Levinson, Keith Moore, Julian Reschke, Rens Troost
Cross-posted by email to W3C & IETF groups, and by GitHub to WHATWG at:
The content is equivalent, modulo small formatting changes for Markdown vs email, and addition of section deeplinks in Markdown version.
Background
Objective
Humans with disabilities, and machines, should have fully equal access to the textual content of image and other files.
Problems with the current specs
Relevant prior RFCs
HTTP/1.1 Content-Disposition header & Content-Description field
RFC 2616 [obsolete] Hypertext Transfer Protocol — HTTP/1.1
RFC 7231 [current, no updates] Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
RFC 1806 [obsolete] Communicating Presentation Information in Internet Messages: The Content-Disposition Header
RFC 2183 [current, no relevant updates] Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field
RFC 6266 [current, no updates] Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)
HTML
RFC 1866 [obsolete] Hypertext Markup Language - 2.0
RFC 2854 [current, informational] The 'text/html' Media Type
HTML 4.01
HTML 5
HTML Accessibility API Mappings (AAM)
MIME Content-Description header
RFC 1341 [obsolete] MIME (Multipurpose Internet Mail Extensions)
RFC 1521 [obsolete] MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
RFC 2045 [current, no relevant updates] Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
RFC 1872 [obsolete] The MIME Multipart/Related Content-type
RFC 2112 [obsolete] The MIME Multipart/Related Content-type
RFC 2387 [current] The MIME Multipart/Related Content-type
EXIF
Discussion
The image server could put this metadata in its HTTP headers. However, this information is either not transmitted or not used.
Some image formats provide for the necessary metadata. However, these fields are rarely used (typically, the text is stored separately), and not all image formats have this support.
Consider the various dependency/test status images used on GitHub.
Only the remote image server knows, at display time, what the image represents. This is because it runs test suites on the most recent version of the codebase, checks the current status of servers, monitors third-party published vulnerabilities or library updates, etc.
Example: https://github.com/atom/atom
The first 3 images in the README section are:
a. Azure Pipelines build/test/integration status
b. David Dependency Manager dependencies update status
c. Heroku/Slack server status [this image currently doesn't load]
The correct alt text for these, at time of writing, should be:
a. Azure Pipelines succeeded
b. dependencies up to date
c. Heroku is offline for maintenance
It's impossible for the author of README.md, or GitHub itself, to know any of this before the user agent actually fetches the image.
As a result, people using a screen reader get zero information from these images, whereas sighted users know the live statuses.
(Pedantic caveat: actually, GitHub runs a caching proxy server on such images; they aren't fetched by the user agent directly from the authoritative server. However, this is functionally transparent.)
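The badge scenario above can be sketched from the server side. This is a minimal WSGI application under stated assumptions: `current_status`, `render_badge`, and `badge_app` are hypothetical names, the status check is a stub, and `Content-Text` is used as an HTTP response header purely for illustration (it is the header proposed in this issue, not a standardized field).

```python
from xml.sax.saxutils import escape


def current_status() -> str:
    # Stand-in for a live check (CI result, dependency scan, server ping).
    return "Azure Pipelines succeeded"


def render_badge(text: str) -> bytes:
    # Draw the status text into a very small SVG image.
    svg = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="220" height="20">'
        f'<text x="4" y="14">{escape(text)}</text></svg>'
    )
    return svg.encode("utf-8")


def badge_app(environ, start_response):
    # The text is only known here, at request time, on the image server.
    text = current_status()
    body = render_badge(text)
    start_response("200 OK", [
        ("Content-Type", "image/svg+xml"),
        ("Content-Length", str(len(body))),
        # Proposed header (assumption): same text that is drawn in the image.
        ("Content-Text", text),
    ])
    return [body]
```

Any WSGI server could host `badge_app`; for example, `wsgiref.simple_server.make_server("", 8000, badge_app)` from the standard library would do for local testing.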
Content-Description is defined as "some descriptive information" (RFC 2045 § 8). All examples in the RFCs are either narrative, e.g. "just a small picture of me" (RFC 2183 § 3), or useless, e.g. "jpeg-1" (id.).
By contrast, ALT text is meant to be the nearest equivalent — which, in the case of simple images of short text, is the verbatim text.
HTML 4.01 (§ 13.8) describes it as "alternate text to serve as content when the element cannot be rendered normally".
HTML 5 describes it as "equivalent content for those who cannot process images or who have image loading disabled (i.e. it is the img element's fallback content)" (§ 4.8.3). It "should never contain text that could be considered the image's caption, title, or legend. It is supposed to contain replacement text that could be used by users instead of the image; it is not meant to supplement the image" (§ 4.8.4.4.1).
AFAICT, there is no equivalent field in either HTTP or MIME. There could and should be.
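To make the distinction concrete, here is a minimal sketch using Python's standard `email` package. It builds one MIME image part carrying both the existing `Content-Description` header and a `Content-Text` header. The latter is the hypothetical header proposed in this issue and is not defined by any current RFC; the image bytes and both header values are placeholders.

```python
from email.message import EmailMessage

# Placeholder bytes standing in for a real PNG file.
png_bytes = b"\x89PNG\r\n\x1a\n"

part = EmailMessage()
part.set_content(png_bytes, maintype="image", subtype="png")

# Existing header (RFC 2045 § 8): free-form, narrative description.
part["Content-Description"] = "build status badge for the main branch"

# Hypothetical header from this proposal: the text equivalent of the image.
part["Content-Text"] = "Azure Pipelines succeeded"

print(part.as_string())
```

A recipient that cannot render the image could then present the `Content-Text` value in its place, in the same way alt text is used in HTML.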
Proposals
MIME — Content-Text
Update RFC 2045 to add the header Content-Text, defined as follows.
Content-Text should contain that text, following the specifications in WHATWG HTML 5 § 4.8.4.4.
All files should include this header if:
HTTP — Content-Disposition
Update RFC 2183 and RFC 6266 to change the Content-Disposition header as follows:
HTML-AAM — IMG and INPUT type=image
Insert the following before the "none of the above" option in the HTML-AAM accessible name computation instructions:
When an ALT or TITLE attribute is not available, use the first available of the following:
HTML — no change
There is deliberately no change proposed to the HTML spec itself.
The purpose of this proposal is to address situations where the HTML author does not, or cannot, add the relevant information. Therefore, the changes are to user agent behavior, and to the data accessible to user agents from sources other than the HTML, i.e. server and file headers.
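The user agent side of the HTML-AAM proposal ("use the first available") can be sketched as a simple fallback chain. This is only an illustration under stated assumptions: `accessible_name` is a hypothetical function, a missing attribute is modeled as `None`, the treatment of empty strings is outside the sketch, and the ordered `fallbacks` list is supplied by the caller because this sketch does not fix which server or file sources come first. Earlier steps of the real name computation (such as ARIA attributes) are not modeled.

```python
from typing import Optional, Sequence


def accessible_name(
    alt: Optional[str],
    title: Optional[str],
    fallbacks: Sequence[Optional[str]],
) -> Optional[str]:
    """Return the first available text, or None if nothing is available."""
    # ALT and TITLE from the HTML keep priority; the fallbacks (for example,
    # text taken from server or file headers) are only consulted after them.
    for candidate in (alt, title, *fallbacks):
        if candidate is not None:
            return candidate
    return None


# No ALT or TITLE in the HTML, but the server supplied text for the image.
print(accessible_name(None, None, [None, "Azure Pipelines succeeded"]))
```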
Intellectual property release
All original IP in this proposal is owned jointly by Sai and Fiat Fiendum.
We freely license it as follows:
Sincerely,
Sai
President, Fiat Fiendum, Inc., a 501(c)(3)
PS Non-gendered pronouns please. I'm a US citizen.