gusbrs / postnotes

Endnotes for LaTeX
LaTeX Project Public License v1.3c

`style=endnotes` and tagging error #8

Closed gusbrs closed 3 weeks ago

gusbrs commented 4 weeks ago

Consider the following document:

\DocumentMetadata{
  testphase={phase-III} ,
  pdfversion=2.0,
  pdfstandard=ua-2,
  lang=en,
}

\documentclass{book}

\usepackage{postnotes}

\postnotesetup{style=endnotes}
% \postnotesetup{
%   format =
%     {
%       % \footnotesize
%       \setlength { \rightskip } { 0pt   }
%       \setlength { \leftskip  } { 0pt   }
%       \setlength { \parindent } { 1.8em }
%     } ,
% }

\usepackage{lipsum}
\usepackage{hyperref}
\hypersetup{hidelinks}

\title{Title}

\begin{document}

\chapter{Chapter 1}

\lipsum[1]\postnote[label=en:mark:1]{\label{en:text:1}\lipsum[2]}

\postnoteref{en:text:1}\par
\postnoteref{en:mark:1}

\lipsum[3]\postnote{\lipsum[4]}

\lipsum[3]\postnote{\lipsum[4]}

\chapter{Chapter 2}

\lipsum[5]\postnote[label=en:mark:3]{\label{en:text:3}\lipsum[6]}

\postnoteref{en:text:3}\par
\postnoteref{en:mark:3}

\lipsum[7]\postnote{\lipsum[8]}

\lipsum[7]\postnote{\lipsum[8]}

\chapter{Chapter 3}

\lipsum[1]\postnote{\lipsum[2]}

\lipsum[3]\postnote{\lipsum[4]}

\lipsum[5]\postnote{\lipsum[6]}

\printpostnotes

\end{document}

Compiling with pdflatex, this document errors with:

[7]

./document.tex:61: Package tagpdf Error: required tag missing - mcid 117

Type <return> to continue.
 ...                                              

l.61 \printpostnotes

Bisecting the code shows that the trigger for this mc mismatch is the call to \footnotesize in the format option of the endnotes style. Indeed, uncommenting the format block while leaving \footnotesize commented out:

\postnotesetup{
  format =
    {
      % \footnotesize
      \setlength { \rightskip } { 0pt   }
      \setlength { \leftskip  } { 0pt   }
      \setlength { \parindent } { 1.8em }
    } ,
}

brings the document back to working condition.

The format option value is called right after the postnotes/printlist/begin tagging plug:

https://github.com/gusbrs/postnotes/blob/394fdf1cff4e07c942f11d7e04e09eebf596f38f/postnotes.dtx#L1581-L1582

which currently essentially disables para/tagging.

I played a little with some alternative orderings, but I don't really understand why the call to \footnotesize has any bearing on things, so I'm not sure how to proceed. If anyone has any thoughts on this, they'd be most welcome.

gusbrs commented 4 weeks ago

Further investigation shows that changing \footnotesize to \small brings the document back to working:

\postnotesetup{
  format =
    {
      % \footnotesize
      \small
      \setlength { \rightskip } { 0pt   }
      \setlength { \leftskip  } { 0pt   }
      \setlength { \parindent } { 1.8em }
    } ,
}

Also if, instead of doing anything in format (comment the block again), we simply make the first note of chapter 1 longer:

\lipsum[1]\postnote[label=en:mark:1]{\label{en:text:1}\lipsum[2-3]}

the error goes away.

This leads me to conclude that the problem is not related to the content of format, but rather to some page-crossing condition reached by (un)luck in the first document. The fact that only one mcid goes wrong for multiple notes also points to a page boundary issue. This may indicate I'm not handling the mcs as I should (in general).

gusbrs commented 4 weeks ago

Further test of interest: no error with lualatex. With xelatex the problem is also present.

gusbrs commented 4 weeks ago

Tracking the code path.

The error is mc-tag-missing, issued by \__tag_check_mc_tag:N, presumably from the call in tagpdf-mc-code-generic.sty.

So I tried to follow the mcs. I included calls to \iow_term:e { OI!~\__tag_get_mc_abs_cnt: } in the postnotes/printmark/begin plug and to \iow_term:e { OI!!~\__tag_get_mc_abs_cnt: } in the postnotes/printtext/begin plug. As far as the \tag_mc_begin:n calls issued by postnotes go, these come in immediate sequence. And I get in the log:

OI! 114
[... other stuff "not OI"]
OI!! 119

In other words, as far as I can tell, whoever is issuing the \tag_mc_begin:n with id 117, it is not postnotes.

Considering the postnotes/printmark/end closes the mc and the struct postnotes/printmark/begin has opened, I'm not sure what postnotes could do differently.

In other words, I'm still at a loss.
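
The counter inspection described in this comment can be sketched as a small helper. This is only an illustration: \showmcabs is a hypothetical name that is not part of postnotes or tagpdf, \__tag_get_mc_abs_cnt: is an internal tagpdf function (so it may change without notice), and the sketch assumes tagging is active via \DocumentMetadata as in the example document. In the investigation above the calls were placed inside the postnotes plugs rather than in the document body.

```latex
\ExplSyntaxOn
% \showmcabs is a hypothetical debugging helper (an assumption of this
% sketch). It writes its argument followed by the current absolute mc
% counter to the terminal and the log.
% \__tag_get_mc_abs_cnt: is internal to tagpdf: use for debugging only.
\NewDocumentCommand \showmcabs { m }
  { \iow_term:e { #1 ~ \__tag_get_mc_abs_cnt: } }
\ExplSyntaxOff

% Example use at a point of interest in the document body:
\showmcabs{OI!}
```

Comparing the values printed at two such points shows how many mcs were opened in between, which is how the gap around id 117 was noticed.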

gusbrs commented 4 weeks ago

The page boundary condition seems to be that note number 7 (the first \postnote in chapter 3) has only one line remaining to be typeset on page 8.

If we have the first note of chapter 3 be:

\lipsum[1]\postnote{\lipsum[2] aaa}

we still get only one dangling line on page 8, and the error still occurs. However, if we do:

\lipsum[1]\postnote{\lipsum[2] aaa aaa}

we get a second line, and the error is gone.

gusbrs commented 3 weeks ago

More testing.

First, the following document:

\DocumentMetadata{
  testphase={phase-III} ,
  pdfversion=2.0,
  pdfstandard=ua-2,
  lang=en,
  debug={log=v,tagpdf},
}

\documentclass{book}

\usepackage{lipsum}
\usepackage{hyperref}
\hypersetup{hidelinks}
\usepackage{indentfirst}

\title{Title}

\begin{document}

% This generates the same situation but, alas, no mc error...
\chapter*{Notes}

\begingroup
\footnotesize
\setlength{\rightskip}{0pt}
\setlength{\leftskip}{0pt}
\setlength{\parindent}{1.8em}

\lipsum[2]\par
\lipsum[4]\par
\lipsum[4]\par
\lipsum[6]\par
\lipsum[8]\par
\lipsum[8]\par
\lipsum[2]\par
\lipsum[4]\par
\lipsum[6]

\endgroup

\end{document}

generates the same paragraphs, with the same page break. Since this document runs without problems, there must be some interaction.

On the other hand, simply disabling hyperref in the problematic document also brings it to working condition. Indeed, enabling debug={log=all,tagpdf}, we get the following log snippet:

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{Link}{Link}}
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

tagpdf DEBUG Info: MC begin inserted with options: tag=Link [on line 63]

Package tagpdf Info: Parent-Child 'Link' --> 'MC'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Link/pdf2' --> 'MC' on line 63

tagpdf DEBUG Info: MC end inserted [on line 63]

Package tagpdf Info: closing structure 116 tagged /Link

tagpdf DEBUG Info: Struct end inserted [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

Package tagpdf Info:  has been removed from the mc stack

The sequence \g__tag_mc_stack_seq is empty
> .

tagpdf DEBUG Info: MC begin inserted with options: tag=\l__tag_tmpa_tl , [on
(tagpdf DEBUG)     line 63]

./document.tex:63: Package tagpdf Error: required tag missing - mcid 117

Type <return> to continue.
 ...                                              

l.63 \printpostnotes

LaTeX does not know anything more about this error, sorry.

Try typing <return> to proceed.
If that doesn't work, type X <return> to quit.

Package tagpdf Warning: tag  is not known

I'm not sure I can read this well. But, as far as I can tell, after Link is closed, one mc too many gets removed from the stack. For some reason, at this point, in this case...

gusbrs commented 3 weeks ago

More info. Since this is happening because of some condition triggered by the note content of note 7, I added an inspection point for \seq_show:N \g__tag_mc_stack_seq at the end of the tagsupport/postnotes/printtext/end. And I think I found the empty tag.

The sequence \g__tag_mc_stack_seq is empty
> .
<recently read> }

l.63 \printpostnotes

Package tagpdf Info: closing structure 109 tagged /endnote

tagpdf DEBUG Info: Struct end inserted [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

Package tagpdf Info: Parent-Child 'Sect' --> 'FENote'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Sect/pdf2' --> 'endnote/user' on line 63

tagpdf DEBUG Info: Struct 114 begin inserted with options:
(tagpdf DEBUG)     tag=endnote,attribute-class=EndnoteType,label={postnote.\l_p
ostnotes_print_note_id_tl
(tagpdf DEBUG)     },ref={postnotemark.\l_postnotes_print_note_id_tl }, 
(tagpdf DEBUG)     [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

Package tagpdf Info: Parent-Child 'FENote' --> 'Lbl'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'FENote/pdf2' --> 'endnotelabel/user' on
(tagpdf)             line 63

tagpdf DEBUG Info: Struct 115 begin inserted with options: tag=endnotelabel 
(tagpdf DEBUG)     [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

tagpdf DEBUG Info: MC begin inserted with options: tag=Lbl [on line 63]

Package tagpdf Info: Parent-Child 'Lbl' --> 'MC'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Lbl/pdf2' --> 'MC' on line 63

The sequence \g__tag_mc_main_marks_seq contains the items (without outer
braces):
>  {b+}
>  {113}
>  {113}
>  {text}.

tagpdf DEBUG Info: MC begin inserted with options: artifact [on line 63]

tagpdf DEBUG Info: Tagging suspended
(tagpdf DEBUG)     level: 0 ==> 1, label: headfoot [on line 63]

tagpdf DEBUG Info: Tagging resumed
(tagpdf DEBUG)     level: 1 ==> 0, label: headfoot [on line 63]

tagpdf DEBUG Info: MC end inserted [on line 63]

tagpdf DEBUG Info: MC begin inserted with options: artifact [on line 63]

tagpdf DEBUG Info: Tagging suspended
(tagpdf DEBUG)     level: 0 ==> 1, label: headfoot [on line 63]

tagpdf DEBUG Info: Tagging resumed
(tagpdf DEBUG)     level: 1 ==> 0, label: headfoot [on line 63]

tagpdf DEBUG Info: MC end inserted [on line 63]

[7]

Package tagpdf Info:  has been pushed to the mc stack

The sequence \g__tag_mc_stack_seq contains the items (without outer braces):
>  {}.

tagpdf DEBUG Info: MC end inserted [on line 63]

Package tagpdf Info: Parent-Child 'Lbl' --> 'Link'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Lbl/pdf2' --> 'Link/pdf2' on line 63

tagpdf DEBUG Info: Struct 116 begin inserted with options: tag=Link 
(tagpdf DEBUG)     [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{Link}{Link}}
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

tagpdf DEBUG Info: MC begin inserted with options: tag=Link [on line 63]

Package tagpdf Info: Parent-Child 'Link' --> 'MC'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Link/pdf2' --> 'MC' on line 63

tagpdf DEBUG Info: MC end inserted [on line 63]

Package tagpdf Info: closing structure 116 tagged /Link

tagpdf DEBUG Info: Struct end inserted [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

Package tagpdf Info:  has been removed from the mc stack

The sequence \g__tag_mc_stack_seq is empty
> .

tagpdf DEBUG Info: MC begin inserted with options: tag=\l__tag_tmpa_tl , [on
(tagpdf DEBUG)     line 63]

./document.tex:63: Package tagpdf Error: required tag missing - mcid 117

Type <return> to continue.
 ...                                              

l.63 \printpostnotes

LaTeX does not know anything more about this error, sorry.

Try typing <return> to proceed.
If that doesn't work, type X <return> to quit.

It occurs immediately after the page is shipped out. A selection from the log above:

[7]

Package tagpdf Info:  has been pushed to the mc stack

The sequence \g__tag_mc_stack_seq contains the items (without outer braces):
>  {}.

That's the only place in the document where an empty value ends up in the mc stack. And it seems to be the one.

I still don't know why the document leads to this result. But, at this point, the accumulated information suggests that this mc is really not in postnotes' control...

gusbrs commented 3 weeks ago

A little more, in line with the previous comment. I examined the logs from compiling with lualatex (with debug={log=vv,tagpdf}) which, as previously reported, succeeds. At the shipout of page 7 we find:

tagpdf DEBUG Info: Struct 115 begin inserted with options: tag=endnotelabel 
(tagpdf DEBUG)     [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

Package tagpdf Info: Parent-Child 'Lbl' --> 'MC'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Lbl/pdf2' --> 'MC' on line 63

Package tagpdf Info: Lbl has been pushed to the mc stack

tagpdf DEBUG Info: Tagging suspended
(tagpdf DEBUG)     level: 0 ==> 1, label: headfoot [on line 63]

tagpdf DEBUG Info: Tagging resumed
(tagpdf DEBUG)     level: 1 ==> 0, label: headfoot [on line 63]

Package tagpdf Info: Lbl has been removed from the mc stack

Package tagpdf Info: Parent-Child 'Lbl' --> 'MC'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Lbl/pdf2' --> 'MC' on line 63

Package tagpdf Info: Lbl has been pushed to the mc stack

tagpdf DEBUG Info: Tagging suspended
(tagpdf DEBUG)     level: 0 ==> 1, label: headfoot [on line 63]

tagpdf DEBUG Info: Tagging resumed
(tagpdf DEBUG)     level: 1 ==> 0, label: headfoot [on line 63]

Package tagpdf Info: Lbl has been removed from the mc stack

Package tagpdf Info: Parent-Child 'Lbl' --> 'MC'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Lbl/pdf2' --> 'MC' on line 63

tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact
tagpdf: INFO TAG-NOT-TAGGED: this has not been tagged, using artifact [7]

Package tagpdf Info: Lbl has been pushed to the mc stack

Package tagpdf Info: Parent-Child 'Lbl' --> 'Link'.
(tagpdf)             Relation is 1 (='0..n')
(tagpdf)             Rolemapped from 'Lbl/pdf2' --> 'Link/pdf2' on line 63

tagpdf DEBUG Info: Struct 116 begin inserted with options: tag=Link 
(tagpdf DEBUG)     [on line 63]

The sequence \g__tag_struct_tag_stack_seq contains the items (without outer
braces):
>  {{Link}{Link}}
>  {{endnotelabel}{Lbl}}
>  {{endnote}{FENote}}
>  {{Sect}{Sect}}
>  {{Document}{Document}}
>  {{Root}{StructTreeRoot}}.

With debug={log=vvv,tagpdf}, it is even clearer, but there is too much between the opening of the endnotelabel struct and the restoring of the mc after the shipout to grasp the sequence of events. After the shipout, however, we find:

tagpdf: INFO SHIPOUT-INSERT-LAST-EMC: MCOPEN 0 [7]

Package tagpdf Info: Lbl has been pushed to the mc stack

The sequence \g__tag_mc_stack_seq contains the items (without outer braces):
>  {Lbl}.

As far as I understand, what is going on is the following: when shipping out page 7, the struct with tag endnotelabel (role Lbl) for note 8 is opened, and the mc explicitly tagged with tag=Lbl is then opened too. At this point, the output routine (OR) decides to cut the page. And somehow, for reasons I do not understand, when the mc is restored after the shipout, the mc itself is not lost, but its tag is. Well, that's my best shot at interpreting this issue so far.

gusbrs commented 3 weeks ago

Ha! A M(non)WE without postnotes:

\DocumentMetadata{
  testphase={phase-III} ,
  pdfversion=2.0,
  pdfstandard=ua-2,
  lang=en,
  debug={log=all,tagpdf},
}

\documentclass{book}

\usepackage{lipsum}
\usepackage{hyperref}
\usepackage{indentfirst}

\ExplSyntaxOn
\newcommand{\makemark}[1]{
  \tag_struct_begin:n { tag=Lbl }
  \tag_mc_begin:n { tag=Lbl }
  \makebox[0em][r]{\hyperlink{myanchor}{\textsuperscript{#1}}}
  \tag_mc_end:
  \tag_struct_end:
}
\newcommand{\makelipsum}[1]{
  \tag_tool:n { para/tagging=true }
  \lipsum[#1]
  \tag_tool:n { para/tagging=false }
}
\ExplSyntaxOff

\title{Title}

\begin{document}

\chapter*{Notes}
\MakeLinkTarget*{myanchor}

\begingroup
\tagtool{para/tagging=false}
\footnotesize
\setlength{\rightskip}{0pt}
\setlength{\leftskip}{0pt}
\setlength{\parindent}{1.8em}

\makemark{1}\makelipsum{2}\par
\makemark{2}\makelipsum{4}\par
\makemark{3}\makelipsum{4}\par
\makemark{4}\makelipsum{6}\par
\makemark{5}\makelipsum{8}\par
\makemark{6}\makelipsum{8}\par
\makemark{7}\makelipsum{2}\par
\makemark{8}\makelipsum{4}\par
\makemark{9}\makelipsum{6}

\endgroup

\end{document}

With pdflatex, that document errors with:

Package tagpdf Info:  has been removed from the mc stack

The sequence \g__tag_mc_stack_seq is empty
> .

tagpdf DEBUG Info: MC begin inserted with options: tag=\l__tag_tmpa_tl , [on
(tagpdf DEBUG)     line 52]

./document3.tex:52: Package tagpdf Error: required tag missing - mcid 26

Type <return> to continue.
 ...                                              

l.52 \makemark{8}
                 \makelipsum{4}\par

LaTeX does not know anything more about this error, sorry.

Try typing <return> to proceed.
If that doesn't work, type X <return> to quit.

Package tagpdf Warning: tag  is not known

u-fischer commented 3 weeks ago

The problem is this here

\tag_mc_begin:n { tag=Lbl }
  \makebox[0em][r]{\hyperlink{myanchor}{\textsuperscript{#1}}}
 \tag_mc_end:

You are issuing the mc-begin command in vertical mode (vmode). I'm not sure yet whether tagpdf could or should catch this case, but it is generally not a good idea to put whatsits there (a hyperref anchor, for example, would probably end up on the wrong page). So either put the mc commands inside the box, or add a \leavevmode before the \tag_mc_begin:n.
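
Applied to the \makemark macro from the example document, the two suggestions might look like the following untested sketch. The first definition adds \leavevmode right before \tag_mc_begin:n, and the second moves the mc commands inside the box. The name \makemarkinbox is hypothetical and only serves to keep the two variants apart; everything else is taken from the example above.

```latex
\ExplSyntaxOn
% Variant 1: leave vertical mode before opening the mc, so that the
% mc-begin whatsit becomes part of the paragraph and not of the
% vertical list.
\renewcommand{\makemark}[1]{
  \tag_struct_begin:n { tag=Lbl }
  \leavevmode
  \tag_mc_begin:n { tag=Lbl }
  \makebox[0em][r]{\hyperlink{myanchor}{\textsuperscript{#1}}}
  \tag_mc_end:
  \tag_struct_end:
}
% Variant 2 (hypothetical name): open and close the mc inside the box,
% so that its whatsits travel with the box content.
\newcommand{\makemarkinbox}[1]{
  \tag_struct_begin:n { tag=Lbl }
  \makebox[0em][r]{
    \tag_mc_begin:n { tag=Lbl }
    \hyperlink{myanchor}{\textsuperscript{#1}}
    \tag_mc_end:
  }
  \tag_struct_end:
}
\ExplSyntaxOff
```

The \renewcommand must come after the original \makemark definition in the preamble. Only one of the two variants is needed. In expl3 code, \mode_leave_vertical: plays the role of \leavevmode.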

gusbrs commented 3 weeks ago

@u-fischer Thank you very much for taking a look at this.

As usual, spot on. :wink:

I followed your suggestion and added \mode_leave_vertical: before starting the note in case of no list environment. Given the handling of the different cases, it seemed the best option. As far as I can tell, it's all looking good now. Thank you!