Closed DavidEGx closed 2 weeks ago
a portable version (which works with lualatex which is generally preferred for tagging) is
\DocumentMetadata{
lang = en,
pdfversion = 2.0,
pdfstandard = ua-2,
testphase = {phase-III, title, table, math, firstaid}
}
\documentclass[10pt,a4paper,notitlepage,twoside,openright]{report}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages={-},nup=1x1,frame=true]{example-image-a4-numbered.pdf}
\end{document}
it does seem to loop with xelatex on the first run (it works with xelatex if luatex has written an aux file previously)
You shouldn't use xelatex for tagging. It can't handle real space chars properly. Use lualatex. But besides this, I can reproduce the bug too; something probably goes wrong in writing the position of the graphic. But it works with \includegraphics, so I wonder what pdfpages is doing here.
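For comparison, a minimal sketch of the \includegraphics variant mentioned above, which reportedly does not loop. The alt text and the page/width options are illustrative assumptions and not part of the original report; the file comes from the mwe package.

```latex
\DocumentMetadata{
  lang        = en,
  pdfversion  = 2.0,
  pdfstandard = ua-2,
  testphase   = {phase-III, firstaid}
}
\documentclass[10pt,a4paper]{report}
\usepackage{graphicx}
\begin{document}
% Include a single page of the PDF as an ordinary graphic.
% The alt text is a placeholder chosen for this sketch.
\includegraphics[page=1, width=\textwidth,
  alt={Numbered A4 test page}]{example-image-a4-numbered.pdf}
\end{document}
```

Compiling this with both lualatex and xelatex should show whether the loop is specific to \includepdf.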
I fixed the bug in tagpdf which led to the loop.
But besides that, \includepdf is problematic. First, it calls \includegraphics more than once, so e.g. a simple \includepdf{example-image.pdf} leads to seven figure structures:
I can get rid of one of them by adapting the page count command:
\usepackage{l3graphics}
\ExplSyntaxOn
\makeatletter
% redefine the internal pdfpages page count command to use l3graphics
\def\AM@getpagecount{\graphics_get_pagecount:nN{\AM@currentdocname}\AM@pagecount}
\makeatother
\ExplSyntaxOff
But for the others the help of the pdfpages maintainer is needed ...
The second problem is that if you include a larger document with text in it, you should consider what that means for accessibility: such a document has no structure, it is only a number of larger pictures.
Thanks for the answers and the work.
I have a nice pile of legacy code that generate PDFs in a myriad of ways. Dunno if replacing xelatex with lualatex is feasible.
Not all the generated PDFs use \includepdf, so we'd be making some progress here. Even if we don't make it all accessible in the first go, we can iterate later. What we cannot have is servers melting down 😁.
Anyway, couple of questions:
I have a nice pile of legacy code that generate PDFs in a myriad of ways. Dunno if replacing xelatex with lualatex is feasible.
Try it out. Normally it should not be a problem, and xelatex is really not suited for tagging.
This here e.g. is some simple text with real space chars in lualatex:
and here the same with xelatex:
I will try to make a tagpdf update this week.
I used adobe pro to check the tags. You can also use pdf Xchange, or the newest pac 2024: https://pac.pdf-accessibility.org/de/herunterladen.
I uploaded the fix to ctan.
I have a nice pile of legacy code that generate PDFs in a myriad of ways. Dunno if replacing xelatex with lualatex is feasible.
Try it out. Normally it should not be a problem, and xelatex is really not suited for tagging.
Thanks. I'll give it a go.
I used adobe pro to check the tags. You can also use pdf Xchange, or the newest pac 2024: https://pac.pdf-accessibility.org/de/herunterladen.
I was thinking on Linux. PAC 2024 seems to work ok(ish) with wine
$ WINEPREFIX=~/.wine32 winetricks dotnet48
$ WINEPREFIX=~/.wine32 wine PAC.exe
However, I get "PDF Header not found" when opening a PDF generated by myself:
Changed pdfversion to 1.7, generated the file again, and it looks good:
Tried PAC 2024.2.1 BETA and now the 2.0 opens:
Then it looks fine.
I wonder if I should stick to 1.7 because it is more widely supported or go to 2.0 because it introduces enhancements for accessibility. :thinking:
Tried PAC 2024.2.1 BETA and now the 2.0 opens:
yes after some pushing they now just started to add support for PDF 2.0. (But they do not get all tests correct.)
I wonder If I should stick to 1.7 because it is more widely supported or go to 2.0 because it introduces enhancements for accessibility
We push and promote 2.0 as it is really needed if you have math in your document (and also for some other things). So the more PDF 2.0 files are around (and the more people complain if tools do not handle them correctly), the better imho. So I would produce PDF 2.0 unless someone/something forces you to fall back to 1.7.
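A sketch of how the two choices could look in \DocumentMetadata. The commented fallback is an assumption for illustration: it pairs PDF 1.7 with ua-1 (PDF/UA-1 is based on PDF 1.7) and drops math from the testphase list, since math tagging is said above to need 2.0.

```latex
% Default, as recommended in this thread: PDF 2.0 with PDF/UA-2.
\DocumentMetadata{
  lang        = en,
  pdfversion  = 2.0,
  pdfstandard = ua-2,
  testphase   = {phase-III, table, math, firstaid}
}
% Hypothetical fallback if a consumer cannot read PDF 2.0
% (swap it in for the block above; only one \DocumentMetadata is allowed):
% \DocumentMetadata{
%   lang        = en,
%   pdfversion  = 1.7,
%   pdfstandard = ua-1,
%   testphase   = {phase-III, table, firstaid}
% }
\documentclass{article}
\begin{document}
Some text.
\end{document}
```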
you shouldn't use xelatex for tagging. It can't handle real space chars properly. Use lualatex.
Tried to switch to lualatex... Found some issues, the main one at the moment is that we use:
\usepackage{xeCJK}
That doesn't seem to work with lualatex. This answer recommends using luatexja-fontspec instead. But that seems to clash with other packages we are using (tabularx) 😥.
Can you explain a bit more on why we shouldn't use xelatex. Is this a you shouldn't use but it is ok ish OR more like you must absolutely avoid xelatex?
Can you explain a bit more on why we shouldn't use xelatex. Is this a you shouldn't use but it is ok ish
AsIwrotewithXeLaTeXonecan'tcurrentlyinsertrealspacechars,sofromtheperspectiveofaccessibilitytherearenowordspaces.Decideyourselfifyouwanttoinflictthisonyourusers.
Besides the problem of the spaces: regarding tagging, xelatex is quite similar to pdflatex. You have to insert literals/specials everywhere and keep track of the state with labels. That is much less flexible than lualatex, where one can use attributes and callbacks to change stuff after the typesetting.
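To illustrate what "attributes and callbacks" means in practice, here is a small lualatex-only sketch. It is not how tagpdf is implemented; the attribute name demoattr and the callback label demo.inspect are made up for this example. Text is marked with an attribute during typesetting, and a callback later inspects the finished node list.

```latex
\documentclass{article}
% \newattribute is provided by the LaTeX kernel under LuaTeX
\newattribute\demoattr
\directlua{
  local attr = luatexbase.registernumber("demoattr")
  luatexbase.add_to_callback("pre_linebreak_filter",
    function(head)
      for n in node.traverse_id(node.id("glyph"), head) do
        if node.has_attribute(n, attr, 1) then
          texio.write_nl("log", "marked glyph: " .. n.char)
        end
      end
      return head
    end,
    "demo.inspect")
}
\begin{document}
plain {\demoattr=1 marked} plain
\end{document}
```

The callback runs after the paragraph material has been collected, so the marking and the processing are decoupled. With xelatex or pdflatex the equivalent information has to be written into the output stream as literals/specials at the point of use.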
I'm not aware of a clash of luatexja with tabularx, but this can probably be resolved. Make a minimal example that demonstrates the issue and ask e.g. on tex.stackexchange.
Can you explain a bit more on why we shouldn't use xelatex. Is this a you shouldn't use but it is ok ish
AsIwrotewithXeLaTeXonecan'tcurrentlyinsertrealspacechars,sofromtheperspectiveofaccessibilitytherearenowordspaces.Decideyourselfifyouwanttoinflictthisonyourusers.
Not just "inflict": it simply means you can't produce a valid PDF/UA file, can you? Because explicit spaces are a requirement.
Thanks, I guess I was confused because in the pdf I saw the spaces. But I see the spaces within the tags are broken.
I'm not aware of a clash of luatexja with tabularx, but this can probably be resolved. Make a minimal example that demonstrates the issue and ask e.g. on tex.stackexchange.
It actually looks related to tagging, maybe a new issue here?
\DocumentMetadata{
lang = en,
pdfversion = 2.0,
pdfstandard = ua-2,
testphase =
{phase-III,
table,
math,
firstaid}
}
\documentclass[10pt,a4paper]{report}
\usepackage{luatexja-fontspec}
\usepackage{tabularx}
\begin{document}
\begin{tabularx}{\textwidth}{|X|X|}
hello & hola
\end{tabularx}
\end{document}
$ lualatex latexdoc.tex
(./latexdoc.aux) (/opt/texlive/2024/texmf-dist/tex/latex/base/ts1cmr.fd)
Info: mathml file latexdoc-mathml does not exist
! Argument of \__math_grab_dollar:w has an extra }.
<inserted text>
\par
l.18 \end{tabularx}
?
Any of these fixes it:
removing math, in line 8.
removing \usepackage{luatexja-fontspec} in line 12.
(I can probably carry on removing "math" myself)
Removing the math will avoid the error, but the tagging of the table is broken nevertheless. (You get warnings like "Package tagpdf Warning: Parent-Child 'P/pdf2' --> 'TR/pdf2'." in the log, and that means something is not right.)
Basically luatexja is currently not compatible as it overwrites internal tabular commands and so removes the tagging code. They should either remove the patches or adapt them to the new kernel code. I will open a new issue to track that.
I have this latex code:
Notice /home/david/toinclude.pdf is a very simple document. Then run:
It never seems to stop and my CPU goes to 100%.
If I remove phase-III and run it, it works just fine. But obviously no tags.