michal-h21 / make4ht

Build system for tex4ht

Exact same LaTeX macro results in different HTML output #151

Closed by rhelder 1 month ago

rhelder commented 2 months ago

Suppose you were writing a very simple class based on the article class (let's call it dummy.cls). The main point of the class is to redefine \@maketitle:

\LoadClass{article}

\RenewDocumentCommand{\title}{m}{\renewcommand*{\@title}{#1}}

\RenewDocumentCommand{\@maketitle}{}{%
    \begin{center}
        \LARGE
        \@title
    \end{center}

    \begin{center}
        \begin{tabular}{lcr}
            3--4 p.m. & Russell Wright Helder & UC Berkeley \\
            Room 94 & (You can call me Russ) & Fall 2024 \\
        \end{tabular}
    \end{center}
}

(As you might have guessed, this is for the syllabus of a class. The real version uses macros for the class name, time, etc., but I've removed them here for simplicity.)

Consider the following two examples (note that the same result would occur if we used article as our document class and just included the body of dummy.cls in our preamble):

Example 1

\documentclass{dummy}

\title{Name of Class}

\begin{document}
\maketitle
\end{document}

Example 2

\documentclass{dummy}

\title{Name of Class}

\NewCommandCopy{\makeinfo}{\maketitle}

\begin{document}
\makeinfo
\end{document}

These two examples produce very different html output. Given the first example, make4ht doesn't render the text in the tabular environment as a table:

<!DOCTYPE html> 
<html lang='en-US' xml:lang='en-US'> 
<head><title>Name of Class</title> 
<meta charset='utf-8' /> 
<meta content='TeX4ht (https://tug.org/tex4ht/)' name='generator' /> 
<meta content='width=device-width,initial-scale=1' name='viewport' /> 
<link href='minimal.css' rel='stylesheet' type='text/css' /> 
<meta content='minimal.tex' name='src' /> 
</head><body>
   <div class='maketitle'>

<h2 class='titleHead'>Name of Class</h2>
3–4 p.m. Russell Wright Helder UC Berkeley
<br />Room 94(You can call me Russ)   Fall 2024<br /><br />
   </div>

</body> 
</html>

Given the second example, make4ht renders the tabular environment correctly:

<!DOCTYPE html> 
<html lang='en-US' xml:lang='en-US'> 
<head><title></title> 
<meta charset='utf-8' /> 
<meta content='TeX4ht (https://tug.org/tex4ht/)' name='generator' /> 
<meta content='width=device-width,initial-scale=1' name='viewport' /> 
<link href='minimal2.css' rel='stylesheet' type='text/css' /> 
<meta content='minimal2.tex' name='src' /> 
</head><body>
<!-- l. 8 --><p class='noindent'>

</p>
<div class='center'>
<!-- l. 8 --><p class='noindent'>
</p><!-- l. 8 --><p class='noindent'><span class='cmr-17'>Name of Class</span></p></div>
<div class='center'>
<!-- l. 8 --><p class='noindent'>
</p>
<div class='tabular'> <table class='tabular' id='TBL-1'><colgroup id='TBL-1-1g'><col id='TBL-1-1' /><col id='TBL-1-2' /><col id='TBL-1-3' /></colgroup><tr id='TBL-1-1-' style='vertical-align:baseline;'><td class='td11' id='TBL-1-1-1' style='white-space:nowrap; text-align:left;'>3–4 p.m.</td><td class='td11' id='TBL-1-1-2' style='white-space:nowrap; text-align:center;'> Russell Wright Helder </td><td class='td11' id='TBL-1-1-3' style='white-space:nowrap; text-align:right;'>UC Berkeley</td>
</tr><tr id='TBL-1-2-' style='vertical-align:baseline;'><td class='td11' id='TBL-1-2-1' style='white-space:nowrap; text-align:left;'>Room 94 </td><td class='td11' id='TBL-1-2-2' style='white-space:nowrap; text-align:center;'>(You can call me Russ)</td><td class='td11' id='TBL-1-2-3' style='white-space:nowrap; text-align:right;'>   Fall 2024</td>
</tr></table>                                                        </div></div>

</body> 
</html>

This is unexpected, first of all, because make4ht is producing different output from the exact same input: whether the macro is named \maketitle or \makeinfo, the definition is the same (\makeinfo is literally a copy of \maketitle). Clearly there is some special handling of \maketitle (I notice that 'maketitle' appears as a class attribute in the first HTML output), but that shouldn't mean that we get fundamentally different output from the same input.

More practically, this is a headache if you're writing a document class and you've already defined \maketitle to look the way you want. In that case you don't want make4ht to do any special handling (unless something doesn't translate into HTML/CSS as well as you'd like, in which case you can always use a configuration file to touch things up).

At the moment, I cannot find a way to get make4ht to recognize the tabular environment as a table as long as the macro is named \maketitle (I'm a new user, so I may have missed something – but I've been looking through docs all day). Since the reason that I use and love LaTeX is that I love logical structure, changing the name of \maketitle on an ad hoc basis to get the result I want is abhorrent to me, so it would be great if there were a better solution.

Thank you for your work. I know I've inconveniently made two bug reports in a day, but this program does a really hard job very well.

michal-h21 commented 2 months ago

TeX4ht needs to redefine \maketitle and many other basic commands to produce correctly tagged HTML. In particular, the configuration for \maketitle locally configures tables to produce almost no formatting, so this is probably the reason why you cannot get the result you want.
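This also suggests why the two examples differ: \NewCommandCopy captures the definition a command has at the moment of the copy, so a redefinition applied afterwards changes the original name but not the copy. The following is a minimal sketch in plain LaTeX, not TeX4ht's actual code. \greet and \greetcopy are hypothetical names, and the \RenewDocumentCommand line only stands in for a later redefinition such as the one a .4ht file applies to \maketitle (assuming, as the output of Example 2 suggests, that this redefinition happens after the copy in the preamble).

```latex
\documentclass{article}

% \greet and \greetcopy are hypothetical names used only for illustration.
\NewDocumentCommand{\greet}{}{original}

% The copy captures the definition of \greet as it is at this point.
\NewCommandCopy{\greetcopy}{\greet}

% A later redefinition (standing in for what a configuration file might do)
% changes \greet but leaves the copy untouched.
\RenewDocumentCommand{\greet}{}{redefined}

\begin{document}
\greet      % typesets "redefined"
\greetcopy  % typesets "original"
\end{document}
```

In the same way, \makeinfo in Example 2 keeps the class's own title layout, while \maketitle picks up the TeX4ht version.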

Fortunately, it is usually possible to revert changes to the original definition of the commands.

In your case, you can write a configuration file for your class, dummy.4ht:

\AtBeginDocument{
\let\maketitle\o:maketitle:
}
\Hinput{dummy}
\endinput

The line \let\maketitle\o:maketitle: reverts to the original version of \maketitle. It has to go inside \AtBeginDocument because otherwise \maketitle would be redefined again by the version from the configuration file for the article class, article.4ht. Configuration files are loaded in the order in which they are included in the source file, so dummy.4ht is called before article.4ht.
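A slightly more defensive variant of the same file could look like the sketch below. The \ifdefined guard is an addition that does not appear in the configuration above: it restores \maketitle only if the saved original \o:maketitle: actually exists, so the file does nothing when no saved copy is available. Like the file above, it assumes that : may be used in control sequence names inside .4ht files.

```latex
% dummy.4ht: sketch of a guarded variant. The \ifdefined test is an
% assumption added for illustration, not part of the configuration above.
\AtBeginDocument{%
  % \o:maketitle: is the saved original definition of \maketitle.
  \ifdefined\o:maketitle:
    \let\maketitle\o:maketitle:
  \fi
}
\Hinput{dummy}
\endinput
```

As with the original file, this would be placed next to the document (or elsewhere on the TeX search path) under the name dummy.4ht so that it is found when the dummy class is loaded.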

rhelder commented 2 months ago

Thank you, this does solve my problem exactly! The redefinition of \maketitle back to \o:maketitle: is clear (you can see where \o:maketitle: is defined in the default article.4ht file), but may I ask what role \Hinput{dummy} and \endinput are playing? I can't find documentation on \Hinput, except at https://www.kodymirus.cz/tex4ht-doc/ForDevelopers.html, where the author says 'The \Hinput expects package name as it’s argument. It registers it for the latter processing in the output format files'. I don't quite understand what this means. In any event, if I remove \Hinput{dummy} and \endinput, I get the exact same html output. I do see that e.g. article.4ht is ended by \Hinput{article} and \endinput as well, so I see that your code is in keeping with that pattern. If you're willing to explain, I would be even more grateful than I already am! Thanks again.

michal-h21 commented 2 months ago

\endinput is commonly used at the end of LaTeX packages, so we just keep the pattern here. \Hinput{dummy} would load configurations in the output format .4ht file (like html4.4ht). It is used in all \ConfigureHinput{packagename} commands. This isn't important or useful in this simple case, but it would be important if dummy.4ht were incorporated into the TeX4ht sources.

rhelder commented 1 month ago

Thank you, that's a helpful explanation - and thanks again for helping me solve my problem.