latex3 / tagpdf

Tagging support code for LaTeX

Problems with mcids when using \maketitle and sections #24

Closed: AlexandrKozlovskiy closed this issue 3 years ago

AlexandrKozlovskiy commented 4 years ago

When I compile this document, I expect to get structure elements with one distinct mcid each. In lualatex, however, the first structure element gets two mcids, 0 and 1 (with the package from the issue23 branch I get only mcid 0), the second element gets two mcids, 0 and 0, and the third element gets null, i.e. an empty mc. In pdflatex I get three structure elements with mc 0, but with the package from the issue23 branch I get a structure tree with mcids 0, 1 and 2, i.e. what I expect. issue_sect.txt

u-fischer commented 4 years ago

You have three pages, and the numbering of the mcids restarts on every page. So with pdflatex you should get mcid 0 everywhere (if you compile often enough).
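To illustrate the per-page numbering, here is a minimal Python sketch. It is not tagpdf code: the function name and the input data are made up for illustration, and the only thing it models is that the mcid counter restarts at 0 on every page.

```python
from collections import defaultdict


def number_mcids_per_page(chunks):
    """Assign an mcid to each (page, tag) chunk, restarting at 0 on every page.

    Illustrative model only; it does not reproduce tagpdf's implementation.
    """
    next_id = defaultdict(int)  # page number -> next free mcid on that page
    numbered = []
    for page, tag in chunks:
        numbered.append((page, tag, next_id[page]))
        next_id[page] += 1
    return numbered


# Hypothetical input: three chunks, each the first one on its own page.
three_pages = [(1, "H1"), (2, "H1"), (3, "H2")]
print(number_mcids_per_page(three_pages))
```

Because every chunk in this input is the first on its page, each one is numbered 0. Two chunks on the same page would get 0 and 1.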

With luatex it can happen that empty nodes or glue create empty BDC-EMC pairs, e.g.

/H1<</MCID 0>> BDC
EMC
/Artifact BMC
EMC

I don't think that they do harm (apart from enlarging the pdf a bit), but I will consider later if one can avoid at least some of them.
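To check an output file for such pairs, one could scan an uncompressed content stream for an opening operator followed directly by EMC. The Python sketch below is only an illustration under that assumption (one operator per line, as in the excerpt above); the function name is hypothetical and nothing here is part of tagpdf.

```python
def find_empty_marked_content(stream_text):
    """Return the opening lines of marked-content sequences that contain nothing.

    Assumes an uncompressed content stream with one operator per line.
    """
    lines = [ln.strip() for ln in stream_text.splitlines() if ln.strip()]
    empty = []
    # Compare each line with the one that follows it.
    for current, following in zip(lines, lines[1:]):
        opens = current.endswith(" BDC") or current.endswith(" BMC")
        if opens and following == "EMC":
            empty.append(current)
    return empty


sample = """\
/H1<</MCID 0>> BDC
EMC
/Artifact BMC
EMC
"""
print(find_empty_marked_content(sample))
```

For the excerpt above, both the /H1 sequence and the /Artifact sequence are reported as empty.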

AlexandrKozlovskiy commented 4 years ago

But my document has only one page. As for the insertion of extra nodes, it is, in my opinion, a very critical issue, because in Adobe Reader, for example, I see only the first title; at least that is all NVDA reads to me. An empty node is inserted instead of the third H2, but in generic mode I don't have this problem. IMHO you should rewrite the function l3kernel.__tag.func.mark_page_elements (box,mcpagecnt,mccntprev,mcopen,name,mctypeprev). Please test my document in lualatex to confirm or deny this problem.

u-fischer commented 4 years ago

Your document uses the book class and has a \maketitle and a \chapter command. This creates three pages.

Besides this, you get the additional empty BDC/EMC pairs because your tagging is "lazy". If you patched \maketitle to add the tagging commands only around the text, the marked content wouldn't include the footer and header. The same holds for the \chapter command.

I will nevertheless look into whether there is a way to avoid or at least reduce such unwanted artifacts, but not today.

AlexandrKozlovskiy commented 4 years ago

But why do I have an empty mc instead of H2 in the structure tree for \paragraph? OK, how can I solve this issue now? IMHO, at the moment you could solve it by simply closing every mc tag the user opened on the previous page and reopening it on the next page, without inserting any extra nodes. I don't understand why you insert pdf literals for each vlist or hlist on the page. You could, for example, simply add a literal to a table when it is opened and remove it from the table when it is closed. If, after traversing all nodes on the page, some literals remain in the table, it means that you have unclosed literals, so you must insert emc nodes on the previous page and reopen all mcs on the next page.

u-fischer commented 4 years ago

You don't have an empty mc from \paragraph but from \chapter.

And the problem is not finding out whether one or more emc or bdc are missing, but where.
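The bookkeeping proposed earlier in the thread can be sketched in a few lines of Python. This is a hypothetical model, not tagpdf's Lua code: the event tuples stand in for the literals found while traversing a page, and the function name is made up. It reports which marked-content sequences are still open at the end of a page, which is the "if" part.

```python
def unclosed_at_page_end(events):
    """Return the tags still open after walking one page's events.

    `events` is a flat list of ("open", tag) and ("close", None) tuples,
    a hypothetical stand-in for the literals seen on one page.
    """
    open_tags = []  # stack of tags opened and not yet closed
    for kind, tag in events:
        if kind == "open":
            open_tags.append(tag)
        elif kind == "close":
            if not open_tags:
                raise ValueError("close without a matching open")
            open_tags.pop()
    return open_tags


# Hypothetical page: H1 is opened and closed, H2 is opened but never closed.
page = [("open", "H1"), ("close", None), ("open", "H2")]
print(unclosed_at_page_end(page))
```

The flat list deliberately throws away the nesting of hlists and vlists, so the result says that H2 is unclosed but gives no hint about where in the box structure the matching emc would have to be inserted.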

AlexandrKozlovskiy commented 4 years ago

No, the problem comes from \paragraph too. You can verify this with my new test. The question now is how to tag sections so that this issue goes away. Yes, I can use generic mode, but I want to solve this problem in lua mode too, so I hope that you will fix this issue if it cannot be solved right now, because without a fix the pdf is incorrect. issue_sect_new.txt

u-fischer commented 4 years ago

Oh, this. \paragraph doesn't start horizontal mode directly. Use \paragraph{test of paragraph}\leavevmode, or patch the command so that the tagging commands are around the text.

AlexandrKozlovskiy commented 4 years ago

Yes, the situation has improved now, but in lua mode in lualatex I have 2 mcs for H1, and for the paragraph I have only one non-empty mc, as I expected. However, for the paragraph I get only text 2 in Adobe Reader, but the text "test of paragraph" should also be there. issue_par.txt

u-fischer commented 4 years ago

We are back to the start: the tagging commands are quite low-level commands. There is no guarantee that they do what you expect if you put them around complicated document commands like \maketitle with lots of internal structure.

If you don't get the correct structure in such cases it is not a bug, it indicates that you didn't put the tagging commands in the correct place.

I wrote the package to investigate the correct places for tagging commands so that we can add useful hooks to the LaTeX kernel and relevant packages. It is not meant as a package for the casual user who wants tagging to simply work.

AlexandrKozlovskiy commented 4 years ago

In my opinion it's a bug, because it happens only in lualatex in lua mode, i.e. all nodes, including the extra nodes, are inserted by you, so you can improve your lua algorithm and it will work OK.