Open u-fischer opened 4 months ago
The build file expects only one output file, but with footnotes we get a separate HTML file for each footnote by default, so each footnote overwrites the .mml file. Footnotes can be put into the main HTML file using the "fn-in" option, but it can be a good idea to support multiple files anyway.
This build file appends MathML code from all generated HTML files:
local domfilter = require "make4ht-domfilter"
local mkutils = require "mkutils"

local process = domfilter {
  "mathmlfixes", -- fix mathml first
  function(dom, par)
    -- if we output to several HTML files, we want to overwrite the mml file in the first processed
    -- file, in the following files, we will append MathML to the already existing file
    local current_name = mkutils.remove_extension(par.filename)
    local mode = "a"
    if current_name == par.input then mode = "w" end
    local filename = par.input .. "-mathml.mml"
    local f, err = io.open(filename, mode)
    -- io.open returns nil plus a message on failure; leave the DOM untouched in that case
    if not f then
      print("cannot open " .. filename .. ": " .. tostring(err))
      return dom
    end
    for count, math in ipairs(dom:query_selector "math") do
      f:write("\n" .. count .. "\n")
      f:write(math:serialize())
    end
    f:close()
    return dom
  end
}

Make:match("html", process)
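The overwrite-then-append logic can be tried outside make4ht. The following is only a minimal sketch in plain Lua: remove_extension is a hypothetical stand-in for mkutils.remove_extension, the file names test.html and test2.html are made up, and the formulas are passed in as ready-made strings instead of being queried from a DOM.

```lua
-- Stand-alone sketch (plain Lua, no make4ht): simulate the filter being
-- called once per generated HTML file.

-- hypothetical stand-in for mkutils.remove_extension
local function remove_extension(name)
  return (name:gsub("%.[^./\\]+$", ""))
end

-- write the formulas of one HTML file; overwrite for the main file, append otherwise
local function write_math(input, html_name, formulas, out_name)
  local mode = "a"
  if remove_extension(html_name) == input then mode = "w" end
  local f = assert(io.open(out_name, mode))
  for count, math in ipairs(formulas) do
    f:write("\n" .. count .. "\n")
    f:write(math)
  end
  f:close()
end

local out = os.tmpname()
-- main file first (mode "w"), then a footnote file (mode "a")
write_math("test", "test.html", {"<math>a=1</math>", "<math>x=2</math>"}, out)
write_math("test", "test2.html", {"<math>f=3</math>"}, out)

local f = assert(io.open(out))
local content = f:read("*a")
f:close()
os.remove(out)
print(content)
```

Note that the counter comes from ipairs, so the numbering starts again at 1 for each HTML file. The sketch also assumes, like the build file, that the main file is processed before the footnote files.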
Sorry for the late feedback, I got busy and it slipped my mind. The suggested change works fine, but sadly I can't use it for the intended use case as tex4ht doesn't work with the latex-lab code ;-(.
How can I try the latex-lab code? What is broken?
Well basically I'm trying to inject the hash we calculate into the output, but tex4ht disables the latex-lab-code (as it ignores \DocumentMetadata) and if I load it manually it errors. As an example:
\RequirePackage{tagpdf-base}
\RequirePackage{latex-lab-testphase-math}
\documentclass{article}
\AtBeginDocument{\Configure{math-xmlns}
{xmlns="http://www.w3.org/1998/Math/MathML" hash="abc" source="blub"}}
\begin{document}
$a=1$ and
$x=2$
a\footnote{blub $f = 3$}
\end{document}
compiled with make4ht test "mathml" gives
! You can't use `\unless' before `\relax'.
<to be read again>
\ifmeasuring@
l.1468 ...64748494A4B4C4D4E4F505152535455565758595A}
and various follow up errors.
I see. I was able to limit the number of errors to just one with these config files.
usepackage.4ht:
% usepackage.4ht (2024-04-18-14:01), generated from tex4ht-4ht.tex
% Copyright 2003-2009 Eitan M. Gurari
% Copyright 2009-2024 TeX Users Group
%
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either
% version 1.3c of this license or (at your option) any
% later version. The latest version of this license is in
% http://www.latex-project.org/lppl.txt
% and version 1.3c or later is part of all distributions
% of LaTeX version 2005/12/01 or later.
%
% This work has the LPPL maintenance status "maintained".
%
% The Current Maintainer of this work
% is the TeX4ht Project <http://tug.org/tex4ht>.
%
% If you modify this program, changing the
% version identification would be appreciated.
\immediate\write-1{version 2024-04-18-14:01}
\def\:temp{tex4ht}\ifx \:temp\@currname
\:warning{\string\usepackage{tex4ht} again?}
\def\:temp#1htex4ht.def,tex4ht.sty#2!*?: {\def\:temp{#2}}
\expandafter\:temp \@filelist htex4ht.def,tex4ht.sty!*?: %
\ifx \:temp\empty \else
\:warning{if
\string\RequirePackage[tex4ht]{hyperref} or
\string\usepackage[tex4ht]{hyperref} was
used try instead, repectively,
\string\RequirePackage{hyperref} or
\string\usepackage{hyperref}}
\fi
\fi
\gdef\a:usepackage{\use:package ,!*?: }
\gdef\use:package#1,{%
\if :#1:\def\:temp##1!*?: {}\else
\def\:temp{#1}\ifx \@currname\:temp
\def\:temp##1!*?: {\input usepackage.4ht }%
\else \let\:temp=\use:package \fi
\fi \:temp}
\Configure{PackageHooks}{titlesec.sty}{titlesec-hooks.4ht}
\Configure{PackageHooks}{multibib.sty}{multibib-hooks.4ht}
\Configure{PackageHooks}{biblatex-chicago.sty}{biblatex-chicago-hooks.4ht}
\Configure{PackageHooks}{cleveref.sty}{cleveref-hooks.4ht}
\Configure{PackageHooks}{xr.sty}{xr-hooks.4ht}
\Configure{PackageHooks}{xr-hyper.sty}{xrhyper-hooks.4ht}
\Configure{PackageHooks}{eso-pic.sty}{esopic-hooks.4ht}
\Configure{PackageHooks}{showframe.sty}{showframe-hooks.4ht}
\Configure{PackageHooks}{expl3.sty}{expl3-hooks.4ht}
\Configure{PackageHooks}{savetrees.sty}{savetrees-hooks.4ht}
\Configure{PackageHooks}{newcomputermodern.sty}{newcomputermodern-hooks.4ht}
\Configure{PackageHooks}{newcomputermodern.sty}{newcomputermodern-hooks.4ht}
\Configure{PackageHooks}{fontawesome5-utex-helper.sty}%
{fontawesome5-utex-helper-hooks.4ht}
\Configure{PackageHooks}{fontawesome5.sty}{fontawesome5-hooks.4ht}
\Configure{PackageHooks}{biblatex.sty}{biblatex-hooks.4ht}
\Configure{PackageHooks}{xeCJK.sty}{xecjk-hooks.4ht}
\Configure{PackageHooks}{unicode-math.sty}{unicode-math-hooks.4ht}
\Configure{PackageHooks}{ctex.sty}{ctex-hooks.4ht}
\AddToHook{class/ctexart/before}{\input{ctexart-hooks.4ht}}
\Configure{PackageHooks}{luatexja.sty}{luatexja-hooks.4ht}
\Configure{PackageHooks}{luatexja-fontspec.sty}{luatexja-hooks.4ht}
\Configure{PackageHooks}{polyglossia.sty}{polyglossia-hooks.4ht}
\Configure{PackageHooks}{fontspec.sty}{fontspec-hooks.4ht}
\Configure{PackageHooks}{tikz.sty}{tikz-hooks.4ht}
\Configure{PackageHooks}{pgf.sty}{pgf-hooks.4ht}
\Configure{PackageHooks}{pdfbase.sty}{pdfbase-hooks.4ht}
\Configure{PackageHooks}{pdfx.sty}{pdfx-hooks.4ht}
\Configure{PackageHooks}{lua-widow-control.sty}{lua-widow-control-hooks.4ht}
\Configure{PackageHooks}{tagpdf.sty}{tagpdf-hooks.4ht}
\Configure{PackageHooks}{accessibility.sty}{accessibility-hooks.4ht}
\Configure{PackageHooks}{embedfile.sty}{embedfile-hooks.4ht}
\Configure{PackageHooks}{breakurl.sty}{breakurl-hooks.4ht}
\Configure{PackageHooks}{hyperref.sty}{hyperref-hooks.4ht}
\Configure{PackageHooks}{bookmark.sty}{bookmark-hooks.4ht}
\Configure{PackageHooks}{draftwatermark.sty}{draftwatermark-hooks.4ht}
\AddToHook{package/tabu/before}{\RequirePackage{tabularx}}
\Configure{PackageHooks}{caption.sty}{caption-hooks.4ht}
\Configure{PackageHooks}{footnotebackref.sty}{footnotebackref-hooks.4ht}
\AddToHook{package/doc/before}{\SUPOff}
\AddToHook{package/doc/after}{\SUPOn}
\AddToHook{package/hypdoc/before}{\SUPOff}
\AddToHook{package/hypdoc/after}{\SUPOn}
\Configure{PackageHooks}{mathtools.sty}{mathtools-hooks.4ht}
\Configure{PackageHooks}{babel.sty}{babel-sty-hooks.4ht}
\Configure{PackageHooks}{minted.sty}{minted-sty-hooks.4ht}
\Configure{PackageHooks}{xyling.sty}{xyling-hooks.4ht}
\Configure{PackageHooks}{graphics.sty}{graphics-hooks.4ht}
\Configure{PackageHooks}{graphbox.sty}{graphbox-hooks.4ht}
\Configure{PackageHooks}{xcolor.sty}{xcolor-hooks.4ht}
\Configure{PackageHooks}{imakeidx.sty}{imakeidx-hooks.4ht}
\Configure{PackageHooks}{fancyhdr.sty}{fancyhdr-hooks.4ht}
\Configure{PackageHooks}{exerquiz.sty}{exerquiz-hooks.4ht}
\Configure{PackageHooks}{hyperxmp.sty}{hyperxmp-hooks.4ht}
\Configure{PackageHooks}{datetime2.sty}{datetime2-hooks.4ht}
\Configure{PackageHooks}{latex-lab-testphase-math.sty}{latex-lab-testphase-math-hooks.4ht}
\endinput
It just registers the following file, latex-lab-testphase-math-hooks.4ht, to be loaded once the latex-lab-testphase-math package is loaded:
\ExplSyntaxOn
\:AtEndOfPackage{
\RequirePackage{amsmath}
\cs_set_protected:Npn \__tag_whatsits: {}
}
\ExplSyntaxOff
Just requiring amsmath before \begin{document} fixed most errors. The rest were fixed by redefining \__tag_whatsits:, except this one:
! LaTeX Error: Control sequence \__tag_whatsits: already defined.
The resulting HTML code looks fine; math has the source attribute.
OK, loading amsmath earlier clearly helped ;-). That is something that we could imho do rather easily in the latex-lab code.
I took care of \__tag_whatsits: directly.
I edited mkparams.lua and changed "dvilualatex" in line 224 into "dvilualatex-dev" so that I can compile with the dev code (side remark: I think it would be good if make4ht had an option to compile with the -dev formats, that could prevent surprises at the next release).
I disabled the definition \def\DocumentMetadata#1{\def\:DocumentMetadata{#1}}. Loading bits and pieces from the latex-lab code didn't work well. If something breaks with \DocumentMetadata with tex4ht then that should imho not be resolved by suppressing the code, but by correcting the code.
With all these changes, the following compiles with make4ht:
\DocumentMetadata{testphase={phase-III,math}}
\RequirePackage{amsmath}
\documentclass[12pt]{article}
\ExplSyntaxOn
\socket_new_plug:nnn{tagsupport/math/inline/formula/begin}{make4ht}
{#1\tl_show:e{???\detokenize{#1}???}}
\cs_if_exist:NT\HCode
{\AssignSocketPlug{tagsupport/math/inline/formula/begin}{make4ht}}
\ExplSyntaxOff
\DebugSocketsOn
\begin{document}
some math $a=\int f(x)$
\end{document}
But if I add display math or an amsmath environment like align, it errors:
! Extra }, or forgotten $.
\endequation* ...:endequation*:\endcsname \egroup
\csname b:equation*\endcsn...
l.16 \[a=\int f(x)\]
A second problem is how to access the math content. If I use make4ht -l test, then the socket shows ???a=\int f(x)??? in the log and that is fine. But with make4ht -l test "mathml" I get something like this (besides lots of other math grabbing output):
???\aftergroup \b:mth \c:mth \fi \bool_if:NF \l__math_collected_bool
{\bool_set_true:N \l__math_collected_bool \__math_grab_dollar:w }a=\int
f(x)???.
And it is not trivial to get the real math content from it. I wonder whether make4ht could make use of the math grabbing code if it is there, instead of patching the math?
Instead of redefining mkparams.lua, you can add the following line to the .mk4 file to request the lualatex-dev engine:
Make:htlatex {htlatex = "dvilualatex-dev"}
The htlatex option can take any value, as long as it produces a DVI file. You can put multiple Make:htlatex calls to require multiple compilations, but I don't think it is necessary in this case.
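As a rough sketch of what such a build file could look like if several compilations were wanted after all (only the Make:htlatex call and the "dvilualatex-dev" value come from this thread; the number of runs is an arbitrary assumption):

```lua
-- .mk4 build file fragment (sketch): each Make:htlatex call requests one
-- compilation; here both runs use the development format.
Make:htlatex {htlatex = "dvilualatex-dev"}
Make:htlatex {htlatex = "dvilualatex-dev"}
```

This fragment is meant to be loaded by make4ht with the -e option; it is not a stand-alone Lua script, since the Make object is provided by make4ht.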
I will add a declaration of \__tag_whatsits: {}, but I am not sure about \DocumentMetadata, or rather the packages loaded by phase-III-latex-lab-testphase.ltx. Most command and environment redefinitions are done in the begindocument/before hook, but this happens after the original commands were redefined in the .4ht files, which are loaded just before \begin{document}. So all hooks for inserting HTML tags are lost in the commands and environments redefined by these packages. This happens, for example, to footnotes, itemize, etc.
We would need to redefine them again in phase-III-latex-lab-testphase.4ht, if we used \AtBeginDocument:
\AtBeginDocument{
\catcode`\:=11
\makeatletter
%redefinitions here
\catcode`\:=12
\makeatother
}
I even tried to \input{latex.4ht} and other basic files here, but it only led to errors, and itemize or footnotes didn't work anyway. So I am not sure what a good solution is here :/
> Instead of redefining mkparams.lua, you can add the following line to the .mk4 file to request the lualatex-dev engine:
You mean the mk4 for the extraction of the mathml? My remark was more on the general side: imho all users should be able to test with an upcoming latex.
> I will add declaration of \__tag_whatsits: {}
That should not be needed, the next tagpdf update will handle that.
> but I am not sure about \DocumentMetadata, resp packages loaded by phase-III-latex-lab-testphase.ltx. Most of command and environment redefinitions are done in the begindocument/before hook, but this happens after original commands were redefined in .4ht files.
I quite understand that there will be problems. But we are putting all this new code in testphase packages and latex-lab style files so that we can test it and identify problems and then find suitable solutions. All this is not possible with tex4ht if you simply disable \DocumentMetadata and the loading of the latex-lab code. The main goal of the code currently is to enable tagging but we also have in mind to simplify the html output, after all the structures are quite similar, and for this it is important to understand if and how the code that we add can be reused by tex4ht and others.
So please enable \DocumentMetadata, and if you see an error, report it at the tagging-project github.
OK, I've enabled \DocumentMetadata in the TeX4ht sources. It doesn't cause fatal errors anymore, just the clash between macros redefined by both TeX4ht and tagpdf. Hopefully, we will be able to fix that.
I'm trying to extract all the mathml into an extra file. Based on https://chat.stackexchange.com/transcript/41?m=65070731#65070731 I tried with this extract-math.mk4:
I then call it for test-utf8.tex with
make4ht -l -e extract-math.mk4 test-utf8 "mathml"
This works fine for
and the file contains
But as soon as I uncomment the footnote in the example above the file is empty.