AndyClifton / accessibility

A CTAN-compliant version of the LaTeX `accessibility` package

Page without StructParents, syntax problem #34

Open viktoriasee opened 4 years ago

viktoriasee commented 4 years ago

Steps to reproduce

Run this minimal example either in pdftex or lualatex:

\documentclass{scrreprt}
\usepackage{luatex85}
\usepackage[tagged]{accessibility}

\begin{document}
Text.
\end{document}

Check the generated output in PAC version 3.0.7.0.

PAC reports the error "Page without StructParents".

Expected behaviour (correct)

The StructParents entry should be there.

AndyClifton commented 4 years ago

@viktoriasee I've invited you to join the project as a collaborator as you seem to have time to spend on it, and I'd appreciate some help! This might be easier than using forks and pull requests.

viktoriasee commented 4 years ago

I feel honoured, thanks. I indeed have some time but I am not a programmer so I need help.

\documentclass{scrreprt}
\usepackage{tagpdf}

\tagpdfsetup{activate-all}

\begin{document}
Text.
\end{document}

Compiled with pdftex, this example creates the StructParents entry. Is this a hint? Maybe we should bring Ulrike Fischer on board.

viktoriasee commented 4 years ago

A PDF without the error above will contain a page object like this:

<<
/Type /Page
/Contents 17 0 R
/Resources 16 0 R
/MediaBox [0 0 612 792]
/StructParents 0/Tabs/S
/Parent 21 0 R
>>

A PDF as it's produced by accessibility right now looks like this:

<<
/Type /Page
/Contents 17 0 R
/Resources 16 0 R
/MediaBox [0 0 612 792]
/Parent 21 0 R
>>

One can use tagpdf with the uncompress option to create a human-readable PDF.
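As a minimal sketch of that, the tagpdf example from earlier in this thread can be extended with the `uncompress` key. This assumes a tagpdf version that still accepts `uncompress` in `\tagpdfsetup`; newer releases may expect it elsewhere, so check the tagpdf documentation for your installation.

```latex
\documentclass{scrreprt}
\usepackage{tagpdf}

% activate-all switches tagging on; uncompress (assumed to be
% supported by the installed tagpdf version) leaves the PDF
% objects uncompressed so they can be read in a text editor.
\tagpdfsetup{activate-all,uncompress}

\begin{document}
Text.
\end{document}
```

For a document that loads accessibility instead of tagpdf, setting the pdfTeX parameters `\pdfcompresslevel=0` and `\pdfobjcompresslevel=0` in the preamble should give a similarly readable file.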

viktoriasee commented 4 years ago

I've learned from the reference (p. 147) that StructParents entries for page objects are mandatory in a tagged PDF. They may be needed for other objects, such as images, too.

AndyClifton commented 4 years ago

Source of error

It looks like the structure tree root object is written to the PDF in accessibility.sty, lines 568 to 575:

\immediate \pdfobj useobjnum \theStructTree{%
    <</Type /StructTreeRoot %
        /RoleMap \theObjHelp \space 0 R %
        /ClassMap \theClassMap \space 0 R %
        /ParentTree <</Nums [0 [\Karray]]>> % TODO: needs a much more
        /ParentTreeNextKey 1 % involved computation
        /K [\Karray] %
    >>}\pdfrefobj\pdflastobj%

(and lines 1032 to 1039 of the .dtx file, which is where the changes need to be made so that they propagate correctly; changing the .sty file in tests/article is fine for testing)

Mitigation

If I understand this right, it means that each page object needs an entry such as /StructParents 0/Tabs/S, where the value is:

the integer key of the page's entry in the structural parent tree

That value is described in "finding structural elements from content items" on page 868 of the manual.

How to proceed

Suggested approximate steps to correct this:

N.B. I think I understand why this was left as a "TODO"....

viktoriasee commented 4 years ago

Is it really that complicated? When I add \pdfpageattr{/StructParents 0/Tabs/S} to my document preamble the error is gone.
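For reference, here is a minimal sketch of the original example with that line added. Only the `\pdfpageattr` line is new; under lualatex it relies on luatex85 to provide the pdfTeX-style command name.

```latex
\documentclass{scrreprt}
\usepackage{luatex85}
\usepackage[tagged]{accessibility}

% Workaround: add /StructParents and /Tabs to every page object.
% The same key (0) is written on every page.
\pdfpageattr{/StructParents 0/Tabs/S}

\begin{document}
Text.
\end{document}
```

Note that this writes the same key, 0, on every page. It silences the PAC error, but whether it matches the page's actual entry in the structural parent tree has not been checked here.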

AndyClifton commented 4 years ago

Ok, this could be a solution.

Could you extend the MWE with a page break and see if this fix still works, please?
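Something along these lines would do as a test. This is only a sketch: the `\newpage` and the second line of text are illustrative additions, not taken from any test file in the repository.

```latex
\documentclass{scrreprt}
\usepackage{luatex85}
\usepackage[tagged]{accessibility}

% Proposed fix under test
\pdfpageattr{/StructParents 0/Tabs/S}

\begin{document}
Text on the first page.
\newpage % force a second page so that two page objects are written
Text on the second page.
\end{document}
```

Both page objects in the resulting PDF can then be checked for the /StructParents entry.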

viktoriasee commented 4 years ago

The MWE already has a page break, and the fix works: https://github.com/AndyClifton/accessibility/blob/master/tests/article/minimal-pdftex.tex

AndyClifton commented 4 years ago

@viktoriasee, when you have a chance, could you try one thing for me, please?

Try adding the option tagged and either flatstructure or highstructure to the call to accessibility, i.e.,

\usepackage[tagged, flatstructure]{accessibility}

and see if that changes anything?

viktoriasee commented 4 years ago

tagged was there already, and neither highstructure nor flatstructure makes a difference. But \pdfpageattr{/StructParents 0/Tabs/S} does.