Problem with new LaTeX marks with \twocolumn

pietvo commented 1 month ago

Brief outline of the bug

I am using the new LaTeX marks mechanism with \twocolumn, but I don't get the expected result. I want the first section title and the first subsection title of each page in the header, but this doesn't work if they appear in the second column. It does work when they appear in the first column.

Minimal example showing the bug

\RequirePackage{latexbug} 
\documentclass[12pt]{article}
\usepackage{lipsum}

% begin header configuration
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{}
\fancyhead[L]{\FirstMark{2e-left}}  % first section
\fancyhead[R]{\FirstMark{2e-right-nonempty}}  % first subsection
\fancyfoot[C]{\thepage} 
% End fancyhdr settings

\renewcommand{\sectionmark}[1]{%
    \markboth{\thesection. #1}{}%
  }
\renewcommand{\subsectionmark}[1]{%
    \markright{\thesubsection. #1}%
  }

% End header configuration

\begin{document}
\twocolumn

\section{First section}

\subsection{First subsection}

\lipsum[1-7]

\section{Second section}

Introduction

\subsection{Second subsection}

\lipsum[10-12]

\end{document}

Log file (required) and possibly PDF file

x2dblmarks1M.log

pietvo commented 1 month ago

I have been looking in source2e/ltmarks.dtx and I think the problem is here:

What now remains doing is to update the page and previous-page regions. For this we have to copy the settings in page into previous-page and then update page such that the top and first marks are taken from the first-column region and the last marks are taken from the last-column region.

          \tl_gset_eq:cc { g__mark_page_top_           ##1 _tl }
                         { g__mark_first-column_top_   ##1 _tl }
          \tl_gset_eq:cc { g__mark_page_first_         ##1 _tl }
                         { g__mark_first-column_first_ ##1 _tl }
          \tl_gset_eq:cc { g__mark_page_last_          ##1 _tl }
                         { g__mark_last-column_last_   ##1 _tl }

I think this is incorrect. If there is no mark in the first column then the first marks must be taken from the second column; similarly if there is no mark in the second column, the last marks must be taken from the first column. The decision is for each mark separately.

FrankMittelbach commented 1 month ago

Hi Pieter, unrelated to the issue at hand, you can of course submit a bug report even if it loads a package not maintained by us: while latexbug suggests removing such packages or reporting the error elsewhere, it also says

(latexbug)                If you think the bug is in core LaTeX
(latexbug)                (as maintained by the LaTeX Team) but
(latexbug)                these files are needed to demonstrate
(latexbug)                the problem, please continue and mention
(latexbug)                this explicitly in your bug report
(latexbug)                (with an explanation why you think so).

which would be the situation in your case (even though you can also produce the bug of course just using a definition for \@oddhead and thus avoiding fancyhdr).

Concerning the issue: yes, a clear bug on my part and in fact visible in 2 test files that are incorrectly certified as correct. And yes, the issue is where you point to. However, if you think about it, all that needs correcting is the logic for first, not for last (or top), because if the second column has no marks at all, its last will still reflect the last from the first column.
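
A minimal sketch of the corrected logic for the first mark, not the actual kernel patch. It assumes that a first column without a new mark is detected by its top and first marks being equal (the test discussed further down in this thread). The helper name \__mymark_update_page_first:n is hypothetical; the variable names follow the ltmarks.dtx excerpt quoted above, with #1 standing for the mark class.

```latex
\ExplSyntaxOn
% Hypothetical helper, for illustration only: #1 is a mark class name.
\cs_new_protected:Npn \__mymark_update_page_first:n #1
  {
    \tl_if_eq:ccTF { g__mark_first-column_top_   #1 _tl }
                   { g__mark_first-column_first_ #1 _tl }
      {
        % top = first: assume no new mark in the first column,
        % so the page's first mark comes from the last column
        \tl_gset_eq:cc { g__mark_page_first_        #1 _tl }
                       { g__mark_last-column_first_ #1 _tl }
      }
      {
        % the first column has a mark of its own: use it
        \tl_gset_eq:cc { g__mark_page_first_         #1 _tl }
                       { g__mark_first-column_first_ #1 _tl }
      }
  }
\ExplSyntaxOff
```

If neither column contains a mark, the last column's first mark equals its own top mark, so the page's first mark still ends up equal to the page's top mark.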

josephwright commented 1 month ago

@FrankMittelbach Do you think we can fix this for the release tomorrow/Saturday?

FrankMittelbach commented 1 month ago

Already done on my machine. But should also get an entry in ltnews I guess.

car222222 commented 1 month ago

@josephwright That seems to be mainly a question for you to answer! Since Frank has the fix, and we can I guess easily do something for ltnews.

pietvo commented 1 month ago

@FrankMittelbach I have a question about the test to see if there is a mark in the column. In TLC you suggest comparing the topmark and firstmark, and if they are equal then there is no mark (of that specific class) in the page/column. But wouldn't that fail if the first mark happens to be the same as the last mark of the previous page? Probably a rare event, but not impossible. Or do you have a better test?

I now see that you applied the same test in the patch. So the question remains.
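
For reference, the top/first comparison in question can be written at the document level roughly as follows. This is only a sketch: it assumes a LaTeX release that provides the new mark mechanism with \IfMarksEqualTF and \TopMark, and the "(continued)" text is purely illustrative.

```latex
% Preamble fragment: decide in the header whether the page has a new subsection mark.
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{}
\fancyhead[R]{%
  \IfMarksEqualTF{2e-right-nonempty}{top}{first}%
    {\TopMark{2e-right-nonempty} (continued)}% top = first: no new mark detected
    {\FirstMark{2e-right-nonempty}}%           top <> first: a new mark on this page
}
```

If the comparison only looks at the mark text, the true branch is also taken when a new mark on the page has the same text as the one carried over from the previous page, which is exactly the case asked about here.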

pietvo commented 1 month ago

I tried Frank's new code with a new test file that shows the problem in my previous comment. On page 2 the right header should say 'Introduction'. Instead it says 'Second subsection'. That is, the new code skips the subsection in column 1 because its mark is the same as the one on the previous page, just as I suspected.

\RequirePackage{latexbug} 
\documentclass[12pt]{article}
%\usepackage{markpatch} % the new code for the page/column marks.
\usepackage{lipsum}

\newcommand\mysubsection[1]{\subsection*{#1\markright{#1}}}

% begin header configuration
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{}
\fancyhead[L]{\FirstMark{2e-left}}  % first section
\fancyhead[R]{\FirstMark{2e-right-nonempty}}  % first subsection
\fancyfoot[C]{\thepage} 
% End fancyhdr settings

\renewcommand{\sectionmark}[1]{%
    \markboth{\thesection. #1}{}%
  }
\renewcommand{\subsectionmark}[1]{%
    \markright{\thesubsection. #1}%
  }

% End header configuration

\begin{document}
\twocolumn

\section{First section}

\lipsum[1-2]
\mysubsection{Introduction}

\lipsum[3-5]

\section{Second section}

\mysubsection{Introduction}

\lipsum[6-7]

\mysubsection{Second subsection}

\lipsum[10-12]

\end{document}

FrankMittelbach commented 1 month ago

@pietvo That's an interesting question. I guess if one has \InsertMark{class}{foo} twice in successive regions then it should be recognized that the second one does contain an explicit mark even if its "text" is equal to the previous one.

That could be easily enough repaired, by counting each mark and adding a hidden marker at the start of the text, e.g. \use_none:n{\int_use:N\g__mark_int}.

The question is: is it worth the complication? I mean, what kind of use cases would there be to have identical marks that are meaningful? Not saying there couldn't be, but I can't think of any off hand.
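
To make the idea concrete, here is a rough user-level sketch, not the kernel implementation. All names in the mymark module are hypothetical. Instead of \use_none:n it uses a protected no-op command to carry the id, an assumption made so that the id survives any expansion applied while the mark is inserted.

```latex
\ExplSyntaxOn
\int_new:N \g__mymark_id_int                 % hypothetical running mark counter
\cs_new_protected:Npn \__mymark_id:n #1 { }  % carries the id, typesets nothing

% #1 = mark class (declared beforehand), #2 = mark text
\cs_new_protected:Npn \mymark_insert_unique:nn #1#2
  {
    \int_gincr:N \g__mymark_id_int
    \use:e
      {
        \mark_insert:nn { \exp_not:n {#1} }
          {
            % the current counter value is expanded to digits right here
            \exp_not:N \__mymark_id:n { \int_use:N \g__mymark_id_int }
            \exp_not:n {#2}
          }
      }
  }
\ExplSyntaxOff
```

Two marks inserted this way with identical text still differ in their hidden id, so a token-by-token comparison of top and first marks can tell them apart.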

FrankMittelbach commented 1 month ago

@pietvo guess you made your point with the second example :-) saw that only after I wrote my reply.

pietvo commented 1 month ago

@FrankMittelbach I think the test for no marks in a region by comparing top and first marks is not robust. I have a suggestion how to do it, but it needs a small extension to the marks system.

Basically, for each mark class/region combination, besides the top, first and last variables, you would need a boolean. Something like { g__mark_#1_present_ ##1_bool } if you want it to express the presence of the mark in that region. Or, if you prefer the negative sense, you could call it empty or so.

Then in the function \__mark_update_structure:nn #1#2

% NOTE: Set a boolean { g__mark_#1_present_ ##1_bool } to the INVERSE of \tl_if_empty:NTF \g__mark_tmp_tl next to the following code

\tl_if_empty:NTF \g__mark_tmp_tl
  {
    \tl_gset_eq:cN { g__mark_#1_last_ ##1_tl }
      \g__mark_new_top_tl
    \tl_gset_eq:cN { g__mark_#1_first_##1_tl }
      \g__mark_new_top_tl
  }

and then also make a function to use this in packages/documents.

File ltmarks.dtx (source2e, page 889).

FrankMittelbach commented 1 month ago

@pietvo guess I'm not that keen on this approach, because the same issue happens if you compare first and last mark to figure out if there is more than one mark on a page. I think it is better if \IfMarkEqual... is defined (and documented) as testing whether two retrieved marks come from the same \InsertMark, regardless of whether they show the same content.

pietvo commented 1 month ago

@FrankMittelbach Why would you need to know if there is more than one mark on a page? If that is important, what about the question if there are more than n marks for any n? That isn't possible now for n>1. Would you then also support that?

But anyway, a robust test for the presence of marks would really be appreciated.

FrankMittelbach commented 1 month ago

For example, you might want to answer: "am I in the same subsection at the beginning of the page compared to the end of the page". If both subsections say "Introduction" as in your example I wouldn't find that out if all marks are on a single page, but I would know if I know that there is more than one mark on the page.

Supporting n>1, I guess the answer is no (not out of the box), but for special applications you can easily arrange for that using suitable mark content.

pietvo commented 1 month ago

@FrankMittelbach The point with my solution is that the information (is there a mark in the region?) is available at that moment. So catch it while you can. And moreover, in case the region is the first column, you need the information in the kernel to construct the firstmark for the page.

FrankMittelbach commented 1 month ago

@pietvo I think it would get more complicated compared to ensuring that all marks are unique. For example, with the "page" region you would still need extra processing to look at the individual columns to determine if the region has marks and update the boolean accordingly. What I actually also like about the mark counter is that it aids debugging, as you can then follow the generated marks much more easily. But I agree it could be implemented in that way.

We can, of course, decide that this isn't ripe to be fixed in the upcoming release, but in my opinion the fix works well and doesn't require documentation changes or new interfaces to interrogate the state of the marks.

pietvo commented 1 month ago

@FrankMittelbach Are you planning to provide a function to extract the original text (i.e. without the sequence number) from a mark? For example in case you want to do some extra processing in a header.

FrankMittelbach commented 1 month ago

@pietvo what kind of extra processing would you have in mind where a nonprinting id actually does any harm?

I had not planned anything for this release, but I agree it would be nicer if \FirstMark and friends drop the id when extracting the data. That will come in a followup, but most likely not right now (unless Joseph gives me the green light to push another change).

pietvo commented 1 month ago

> @pietvo what kind of extra processing would you have in mind where a nonprinting id actually does any harm?

I have an example where I want to know if the text at the top of the page is at the section or subsection level. I do this by having an extra mark which is set to ‘section’ by the \sectionmark, and similar for subsection. Or use different numbers. In the header/footer I can then check that value, but then I need the original text without any implementation addons.

I can also think of cases where you want to do some calculations with the values of the marks.
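
A minimal sketch of that first use case, under some assumptions: the mark class name level and the command \IfTopLevelSectionTF are made up for this illustration, and the comparison only works if retrieving the mark yields exactly the text that was inserted (or something that expands to it).

```latex
% Preamble fragment.
\NewMarkClass{level}   % hypothetical extra mark class recording the heading level

\renewcommand{\sectionmark}[1]{%
  \markboth{\thesection. #1}{}%
  \InsertMark{level}{section}}
\renewcommand{\subsectionmark}[1]{%
  \markright{\thesubsection. #1}%
  \InsertMark{level}{subsection}}

\ExplSyntaxOn
% Hypothetical helper: #1 if the mark in effect at the top of the page
% says "section", otherwise #2
\NewExpandableDocumentCommand \IfTopLevelSectionTF { m m }
  {
    \str_if_eq:eeTF
      { \mark_use_top:nn { page } { level } } { section }
      {#1} {#2}
  }
\ExplSyntaxOff
```

In a header this could be used as \IfTopLevelSectionTF{...}{...} to choose between two layouts.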

FrankMittelbach commented 1 month ago

> @pietvo what kind of extra processing would you have in mind where a nonprinting id actually does any harm?
>
> I have an example where I want to know if the text at the top of the page is at the section or subsection level. I do this by having an extra mark which is set to ‘section’ by the \sectionmark, and similar for subsection. Or use different numbers. In the header/footer I can then check that value, but then I need the original text without any implementation addons.
>
> I can also think of cases where you want to do some calculations with the values of the marks.

I think both cases would also work if there is something expandable at the beginning that vanishes, e.g., assigning the mark value to a counter would just ignore the id. But you are right it complicates things and we have already included a change to \FirstMark and friends so that they already remove the id when retrieving the mark data.

pietvo commented 1 month ago

@FrankMittelbach Great! Implementation details shouldn't leak to the users.

Background info: A few years ago I implemented my extramarks package with new marks based on the primitive etex marks, before the new LaTeX marks mechanism was released. No regions, only page. I never released this. Of course it had to patch kernel code (with etoolbox's \patchcmd). It also patched packages like multicol and paracol if they were loaded. I had several test documents and it worked great.

A few weeks ago I decided that this wasn't future-proof, so I reimplemented it with the new LaTeX marks mechanism. And then some tests failed, and this is just one of them.

The next one will be multicol but obviously that will have to wait...

latex3 / latex2e