veraPDF / veraPDF-wcag-algs

WCAG 4.1.2 test 15 false positive #131

Open jzuidweg opened 2 years ago

jzuidweg commented 2 years ago

This document still fails WCAG 4.1.2 test 15 ("Paragraph is incorrectly tagged as a numbered heading") on page 34 for the line "4.2.1 Binnensportaccomodatie". This is clearly a heading, so this is a false positive.

Possibly the algorithm notes that the format of this line is the same as that of the heading directly above "4.2. Analyse accommodatiebestand in relatie tot de ontwikkelingen", but it is still obviously a heading because of the formatting and because it continues the numbering.

Can you look into this case?

bdoubrov commented 2 years ago

We already take the numbering and the bold font into account. In this case that is not enough to classify the line as a heading, since the preceding and following blocks of text are also bold and have the same font size.

We can still use the fact that the block below is bold+italic, so it does in fact have different formatting.
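
As a rough illustration of the reasoning above (not the actual veraPDF-wcag-algs code), here is a minimal Python sketch of a heading score that counts numbering, bold weight, and style differences against the neighbouring blocks. The `Block` type, the `heading_score` function, the equal weights, the font size of 12.0, and the placeholder text of the following block are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Hypothetical text-block model; not the veraPDF data structures.
@dataclass(frozen=True)
class Block:
    text: str
    font_size: float
    bold: bool = False
    italic: bool = False


# Matches leading section numbers such as "4.", "4.2." or "4.2.1".
NUMBERING = re.compile(r"^\d+(\.\d+)*\.?\s+\S")


def style_differs(a: Block, b: Optional[Block]) -> bool:
    """True if b is missing or differs from a in size, weight, or slant."""
    if b is None:
        return True
    return (a.font_size, a.bold, a.italic) != (b.font_size, b.bold, b.italic)


def heading_score(prev: Optional[Block], block: Block, nxt: Optional[Block]) -> int:
    """Count simple heading cues; equal weights are an arbitrary choice."""
    score = 0
    if NUMBERING.match(block.text):
        score += 1  # starts with a section number
    if block.bold:
        score += 1  # bold text
    if style_differs(block, prev):
        score += 1  # stands out from the block above
    if style_differs(block, nxt):
        score += 1  # stands out from the block below
    return score


# The case from this issue, with an assumed font size of 12.0.
prev = Block("4.2. Analyse accommodatiebestand in relatie tot de ontwikkelingen", 12.0, bold=True)
line = Block("4.2.1 Binnensportaccomodatie", 12.0, bold=True)
nxt = Block("(placeholder for the bold+italic block below)", 12.0, bold=True, italic=True)

score = heading_score(prev, line, nxt)
```

In this sketch the italic flag is what separates the line from the block below. If `style_differs` compared only size and weight, the line would get no credit for standing out from either neighbour.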

bdoubrov commented 2 years ago

Another important marker is that "4.2.1 Binnensportaccomodatie" does appear in the TOC. With the new TOC support we can recognize this and add it as an extra criterion for heading detection.
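
A TOC check of this kind could look roughly like the following Python sketch. It only illustrates the idea and is not the veraPDF implementation: the function names, the normalization rules, and the sample TOC lines (including the dot leaders) are assumptions.

```python
import re

# Dot leaders followed by a page number at the end of a TOC line, e.g. " ..... 34".
LEADER_AND_PAGE = re.compile(r"\s*\.{2,}\s*\d+\s*$")


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so minor layout differences do not matter."""
    return " ".join(text.split()).casefold()


def toc_titles(toc_lines):
    """Strip dot leaders and page numbers from TOC lines and normalize the titles."""
    return {normalize(LEADER_AND_PAGE.sub("", line)) for line in toc_lines}


def appears_in_toc(block_text: str, toc_lines) -> bool:
    """True if the block's text matches a TOC title after normalization."""
    return normalize(block_text) in toc_titles(toc_lines)


# Illustrative TOC lines; the real layout of the document's TOC is not known here.
toc = [
    "4. Ontwikkelingsrichtingen periode 2021-2030 ........ 33",
    "4.2.1 Binnensportaccomodatie ........ 34",
]

found = appears_in_toc("4.2.1  Binnensportaccomodatie", toc)
```

Exact matching after normalization is deliberately strict. A real implementation would probably need some tolerance for truncated or line-wrapped TOC entries.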

jzuidweg commented 1 year ago

The algorithm still appears to miss headers which are clearly headers to the human eye. In particular:

I'm therefore not closing this issue yet.

MaximPlusov commented 1 year ago

@jzuidweg The elements "4. Ontwikkelingsrichtingen periode 2021-2030" (on page 33) and "4.2.3 Gebruikstarieven" (on page 37) are tagged as paragraphs in the document, and we correctly detect that they should be tagged as headings.