w3c / epubcheck

The conformance checker for EPUB publications
https://www.w3.org/publishing/epubcheck/
BSD 3-Clause "New" or "Revised" License

Check proper usage of `alt` attributes on `img` and `area` elements #446

Open rdeltour opened 10 years ago

rdeltour commented 10 years ago

Lack of an alt attribute is currently naïvely reported as a USAGE-level message (ACC_001). However, HTML5 rules are more complex than just "alt is required".

See 4.7.1.1.22 Guidance for conformance checkers for alt attributes on img elements. For area elements, alt is required only if the area represents a hyperlink.

tofi86 commented 7 years ago

> For area elements, alt is required only if the area represents a hyperlink.

This is already checked by the HTML5 schema when an href attribute is present:

ERROR(RSC-005): .test.epub/EPUB/lorem.xhtml(49,27): Error while parsing file 'element "area" missing required attribute "alt"'.

Therefore we don't need to check it with the HTMLTagsAnalyseHandler. But if we want to do so, it should be easy enough:

if (nonTextTagsAlt.contains(tagName))
{
  String altAttribute = attributes.getValue("alt");
  if ("area".equals(tagName))
  {
    String hrefAttribute = attributes.getValue("href");
    // #446: only report a missing alt attribute on an area element when an href attribute is present
    // https://www.w3.org/TR/html5/embedded-content-0.html#the-area-element
    if (null != this.getFileName() && null == altAttribute && null != hrefAttribute)
    {
      report.message(MessageId.ACC_001, EPUBLocation.create(this.getFileName(), locator.getLineNumber(), locator.getColumnNumber(), tagName));
    }
  }
  …
}

tofi86 commented 7 years ago

> See 4.7.1.1.22 Guidance for conformance checkers for alt attributes on img elements.

This doesn't seem to be doable in the HTMLTagsAnalyseHandler. We can already query inFigure, and we could add another flag, inFigcaption, check for figcaption content in the characters() method, and then report an empty or missing figcaption in the endElement() method. But we would then also need a further flag, imgHasNoAlt, to be able to report the missing img/@alt attribute when handling the closing figcaption tag. It might work, but I'd rather look for another solution (such as Schematron) than implement it in the HTMLTagsAnalyseHandler.
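
For illustration, the flag-based approach described above could look roughly like the following. This is a minimal standalone sketch and not EPUBCheck code: the class FigureAltSketch, its fields, and the countMissingAlt helper are hypothetical names. It assumes a simplified rule (an img without alt is tolerated only inside a figure whose figcaption contains non-whitespace text), no nested figures, and a non-namespace-aware SAX parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: counts img elements without alt that are not
// covered by a non-empty figcaption in the same figure.
class FigureAltSketch extends DefaultHandler {
  private boolean inFigure = false;
  private boolean inFigcaption = false;
  private boolean figcaptionHasText = false;
  private int pendingImgsWithoutAlt = 0; // seen inside the current figure
  private int missingAlt = 0;            // findings that would be reported

  @Override
  public void startElement(String uri, String localName, String qName, Attributes attributes) {
    if ("figure".equals(qName)) {
      inFigure = true;
      figcaptionHasText = false;
      pendingImgsWithoutAlt = 0;
    } else if ("figcaption".equals(qName) && inFigure) {
      inFigcaption = true;
    } else if ("img".equals(qName) && attributes.getValue("alt") == null) {
      if (inFigure) {
        pendingImgsWithoutAlt++; // decision deferred until </figure>
      } else {
        missingAlt++;
      }
    }
  }

  @Override
  public void characters(char[] ch, int start, int length) {
    if (inFigcaption && !new String(ch, start, length).trim().isEmpty()) {
      figcaptionHasText = true;
    }
  }

  @Override
  public void endElement(String uri, String localName, String qName) {
    if ("figcaption".equals(qName)) {
      inFigcaption = false;
    } else if ("figure".equals(qName)) {
      if (!figcaptionHasText) {
        missingAlt += pendingImgsWithoutAlt; // empty or missing figcaption
      }
      inFigure = false;
      pendingImgsWithoutAlt = 0;
    }
  }

  // Parses a well-formed XHTML fragment and returns the number of findings.
  static int countMissingAlt(String xhtml) {
    FigureAltSketch handler = new FigureAltSketch();
    try {
      SAXParserFactory.newInstance().newSAXParser()
          .parse(new InputSource(new StringReader(xhtml)), handler);
    } catch (Exception e) {
      throw new IllegalStateException("could not parse input", e);
    }
    return handler.missingAlt;
  }

  public static void main(String[] args) {
    String doc = "<body><img src=\"a.png\"/>"
        + "<figure><img src=\"b.png\"/><figcaption>Chart</figcaption></figure>"
        + "<figure><img src=\"c.png\"/><figcaption> </figcaption></figure></body>";
    System.out.println(countMissingAlt(doc));
  }
}
```

The sketch defers the decision to the closing figure tag rather than the closing figcaption tag, because the figcaption may appear before or after the img. Even in this reduced form it needs four pieces of state, which illustrates why a schema-based or Schematron rule may be easier to maintain.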

tofi86 commented 7 years ago

BTW: the HTMLTagsAnalyseHandler and all of its messages (like ACC_001) are not processed when doing single-file XHTML validation. Is this the desired behaviour?

tofi86 commented 7 years ago

Clearing assignee and removing from 4.1.0 milestone. Feel free to work on it and assign it back to 4.1.0 as long as we're still working on it.

apgrover commented 4 years ago

Hi there! As of v4.2.3 we started seeing the following error, which we did not get in 4.2.2 or other recent versions:

Error while parsing file: element "img" missing required attribute "alt"

We wanted to double-check that this is now the expected behaviour (especially since it looks to be part of https://github.com/w3c/epubcheck/commit/22fa3b1). Thanks!

rdeltour commented 3 years ago

I updated the schemas in #1211, to be released in v4.2.5.

Note that a missing alt attribute is usually an error on img elements, except in some very specific cases.

EPUBCheck still does not check the full logic defined in the spec, but the newer schemas should at least prevent any false negatives. We should probably wait for a more complete integration of the Nu HTML Checker before implementing anything else. I'll keep this issue open in the meantime.