bdtc / lwarp

The LaTeX lwarp package — Convert LaTeX to HTML.
https://ctan.org/pkg/lwarp

Suggestion: put MathJax customizations into script tag #17

Closed DominikPeters closed 6 months ago

DominikPeters commented 7 months ago

lwarp adds many MathJax command definitions in a hidden div at the start of the document:

<!--MathJax customizations:-->
<div class="hidden">
\(\newcommand{\footnotename}{footnote}\)
\(\def \LWRfootnote {1}\)
\(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\let \LWRorighspace \hspace \)
\(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\)
\(\newcommand {\mathnormal }[1]{{#1}}\)
\(\newcommand \ensuremath [1]{#1}\)
\(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \)
...

I've had search engines (Bing, I believe) get confused by this and use these definitions as text snippets in search results. Based on the MathJax documentation, I believe that most of these definitions could instead go in the "Lwarp MathJax emulation code" inside the <script> tag that specifies the MathJax options, along the following lines (written with the help of ChatGPT-4):

MathJax = {
  tex: {
    macros: {
      footnotename: "footnote",
      LWRfootnote: "1",
      footnote: ["{}^{\\mathrm{#1}}", 2, "\\LWRfootnote"], // Two arguments; the optional first one defaults to \LWRfootnote
      footnotemark: ["{}^{\\mathrm{#1}}", 1, "\\LWRfootnote"], // Like footnote, but without the second argument
      LWRorighspace: "\\hspace", // Not a true \let: this expands to whatever \hspace means at the point of use
      hspace: "\\ifstar \\LWRorighspace \\LWRorighspace", // Probably recurses through \LWRorighspace as written; \ifstar also isn't directly supported in MathJax
      mathnormal: ["{#1}", 1],
      ensuremath: ["#1", 1],
      LWRframebox: ["\\fbox{#2}", 2, ""], // Default value for the first optional argument is handled as empty
      framebox: "\\LWRframebox", // This assumes LWRframebox is properly defined to handle its arguments
      setlength: ["{}", 2],
      addtolength: ["{}", 2],
      setcounter: ["{}", 2],
      addtocounter: ["{}", 2],
      arabic: ["{}", 1],
      number: ["{}", 1],
      noalign: ["\\text{#1}\\notag\\\\", 1], // Ends with the TeX line break \\, so four backslashes in the JS string
      cline: ["{}", 1],
      directlua: "\\text{(directlua)}",
      luatexdirectlua: "\\text{(directlua)}",
      protect: "{}",
      // The commands related to \mathchar, \mathcode, \delcode, \delimiter might not be directly convertible
      oe: "\\unicode{x0153}",
      OE: "\\unicode{x0152}",
      ae: "\\unicode{x00E6}",
      AE: "\\unicode{x00C6}",
      aa: "\\unicode{x00E5}",
      AA: "\\unicode{x00C5}",
      o: "\\unicode{x00F8}",
      O: "\\unicode{x00D8}",
      l: "\\unicode{x0142}",
      L: "\\unicode{x0141}",
      ss: "\\unicode{x00DF}",
      SS: "\\unicode{x1E9E}",
      dag: "\\unicode{x2020}",
      ddag: "\\unicode{x2021}",
      P: "\\unicode{x00B6}",
      copyright: "\\unicode{x00A9}",
      pounds: "\\unicode{x00A3}",
      LWRref: "\\ref", // Not a true \let, so this has the same recursion concern as LWRorighspace above
      ref: "\\ifstar \\LWRref\\LWRref", // Again, \ifstar might not be supported as expected
      multicolumn: ["#3", 3], // Assuming only the third argument is of interest
      // \require{textcomp} doesn't have a direct equivalent, but MathJax might already support the required symbols
      meta: ["\\langle \\textit{#1}\\rangle", 1],
      intertext: ["\\text{#1}\\notag\\\\", 1], // Ends with the TeX line break \\, so four backslashes in the JS string
      Hat: "\\hat",
      Check: "\\check",
      Tilde: "\\tilde",
      Acute: "\\acute",
      Grave: "\\grave",
      Dot: "\\dot",
      Ddot: "\\ddot",
      Breve: "\\breve",
      Bar: "\\bar",
      Vec: "\\vec"
    }
  }
};
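
To make the mapping from the hidden-div definitions to `macros` entries concrete, here is a minimal sketch of a converter for the simplest case only: a single `\newcommand{\name}[n][default]{body}` line. The function names (`readGroup`, `skipSpaces`, `newcommandToMacro`) are hypothetical and not part of lwarp or MathJax. The sketch assumes the MathJax v3 array form `[definition, numArgs, optionalDefault]` for macros with arguments. It does not handle `\def`, `\let`, `\renewcommand`, or several definitions on one line.

```javascript
// Hypothetical helper: read a balanced group that starts at src[pos].
// Returns the inner text and the index just past the closing delimiter.
function readGroup(src, pos, open, close) {
  if (src[pos] !== open) return null;
  let depth = 0;
  for (let i = pos; i < src.length; i++) {
    if (src[i] === "\\") { i++; continue; } // skip the character after a backslash
    if (src[i] === open) depth++;
    else if (src[i] === close) {
      depth--;
      if (depth === 0) return { text: src.slice(pos + 1, i), next: i + 1 };
    }
  }
  return null; // unbalanced input
}

function skipSpaces(src, pos) {
  while (pos < src.length && src[pos] === " ") pos++;
  return pos;
}

// Hypothetical converter: returns [name, value] suitable for the `macros`
// object, or null if the line is not a simple \newcommand.
function newcommandToMacro(src) {
  const prefix = "\\newcommand";
  if (!src.startsWith(prefix)) return null;
  let pos = skipSpaces(src, prefix.length);

  // The macro name is either {\name} or a bare \name.
  let name;
  if (src[pos] === "{") {
    const g = readGroup(src, pos, "{", "}");
    if (!g) return null;
    name = g.text.trim();
    pos = g.next;
  } else {
    const m = /^\\[A-Za-z]+/.exec(src.slice(pos));
    if (!m) return null;
    name = m[0];
    pos += m[0].length;
  }
  name = name.replace(/^\\/, "");

  // Optional [n] argument count, then optional [default] for argument #1.
  pos = skipSpaces(src, pos);
  let numArgs = 0;
  let optDefault;
  const n = readGroup(src, pos, "[", "]");
  if (n) {
    numArgs = Number(n.text);
    pos = skipSpaces(src, n.next);
    const d = readGroup(src, pos, "[", "]");
    if (d) {
      optDefault = d.text.trim();
      pos = skipSpaces(src, d.next);
    }
  }

  const body = readGroup(src, pos, "{", "}");
  if (!body) return null;

  if (numArgs === 0) return [name, body.text];
  if (optDefault === undefined) return [name, [body.text, numArgs]];
  return [name, [body.text, numArgs, optDefault]];
}
```

Feeding it the `\footnote` definition quoted at the top of this issue yields a three-element array with 2 arguments and `\LWRfootnote` as the optional default, the same shape as the `LWRframebox` entry above.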

bdtc commented 7 months ago

It looks like there is a new attribute "data-nosnippet" which can be added to a div to prevent the contents of the div from appearing in a snippet. I shall add this to the MathJax customization div. But it appears that Bing may not yet support this.

I shall also change the div to display:none instead of class="hidden":

<div class="hidden">

becomes

<div display:none data-nosnippet>

I shall leave this issue open until this change has been made and you see if it works.


One concern I have with converting everything to the MathJax script is that when MathJax changed from v2 to v3 a lot of their LaTeX extensions did not port over immediately. I haven't looked into whether this is likely to affect Lwarp customizations, but we sure don't want a situation where MathJax changes and it takes months to get the Lwarp customizations working again. There are more than 50 LaTeX packages with MathJax support.

In a few cases Lwarp is using the MathJax extensions which became available in MathJax v3, but I have the original Lwarp emulations as backup, stored in comments in the .dtx file, in case MathJax 4 breaks these extensions.

DominikPeters commented 7 months ago

Thanks, that's a good solution. I'll keep an eye out. \def \LWRfootnote {1}\) is an example search term that surfaces such pages. site:https://tikz.dev/pgfplots also has examples. I'll update the latter manually to use style="display:none" data-nosnippet to see what will happen.

bdtc commented 6 months ago

The tidy HTML checker complains about my proposal:

<div display:none data-nosnippet>

but is happy with your corrected improvement:

<div style="display:none;" data-nosnippet>

MathJax works as expected either way. I don't know if either will help with Bing. Let me know what you find.

Whether the style should be a CSS class or an explicit style probably won't matter. Either way, I shall add data-nosnippet.
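
For pages that were generated before this change, the same edit can be applied as a post-processing step. The following is a minimal sketch, assuming the marker comment and the `<div class="hidden">` tag appear exactly as quoted at the top of this issue; `patchCustomizationDiv` is a hypothetical name, not something lwarp provides.

```javascript
// Hypothetical post-processing step for already-generated HTML:
// rewrite only the div that directly follows the marker comment.
function patchCustomizationDiv(html) {
  const marker = "<!--MathJax customizations:-->";
  const oldTag = '<div class="hidden">';
  const newTag = '<div style="display:none;" data-nosnippet>';

  const at = html.indexOf(marker);
  if (at === -1) return html; // no customization block on this page

  const tagAt = html.indexOf(oldTag, at + marker.length);
  // Bail out unless only whitespace separates the marker from the div.
  if (tagAt === -1 || html.slice(at + marker.length, tagAt).trim() !== "") {
    return html;
  }
  return html.slice(0, tagAt) + newTag + html.slice(tagAt + oldTag.length);
}
```

Other `class="hidden"` elements on the page are left untouched, since only the div immediately after the marker is rewritten.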

bdtc commented 6 months ago

Now in v0.915.