jupyter-book / mystmd

Command line tools for working with MyST Markdown.
https://mystmd.org/guide
MIT License

`mathPlugin` adds `.html` attribute to node, but seems to be ignored in `mystToHast` #859

Open sglyon opened 9 months ago

sglyon commented 9 months ago

Description

Hi team!

I'm working on integrating MyST into an existing web application using myst-parser, myst-transforms, and myst-to-html.

I eventually want to use a pipeline like this:

    const pipeline = unified()
      .use(mystParser)
      .use(mathPlugin)
      .use(transform, new State())
      .use(mystToHast)
      .use(formatHtml)
      .use(rehypeStringify);

However, when I do that, any inline math or display math that I use ends up not being KaTeXified in the resulting HTML.

Here are some screenshots that demonstrate this with the sandbox here: https://mystmd.org/sandbox

First, the source and DEMO tab of the sandbox. Notice the $\LaTeX$ in the source and that it is rendered properly on the right:

Screenshot 2024-01-18 at 13 29 18

Then let's look at the AST before plugins (AST -> pre tab). Here we have an inlineMath node with value: \LaTeX

Screenshot 2024-01-18 at 13 28 58

Then if we look at the AST after plugins (AST -> post tab) we have a new html attribute on that node. This html attribute contains the katex output and is what I want in my output:

Screenshot 2024-01-18 at 13 29 04

Finally, if we look at the HTML tab we see that we don't get that nice KaTeX output, but rather a bland <span class="math inline">...</span> element:

Screenshot 2024-01-18 at 13 28 46

I've done some digging and can see that the .html attribute is added in the mathPlugin from myst-transforms here: https://github.com/executablebooks/mystmd/blob/2be01f12907dacc37c326949c2f2e7b8e95a1701/packages/myst-transforms/src/math.ts#L191

I also looked into myst-to-html and found that it (1) does not check for the .html attribute on any of the nodes and (2) is responsible for adding the <span class="math inline">...</span> element here: https://github.com/executablebooks/mystmd/blob/2be01f12907dacc37c326949c2f2e7b8e95a1701/packages/myst-to-html/src/renderer.ts#L14-L29

In my digging I ran the following in a node console session:

> txt = "This is an equation: \$V(a) = \\max_{a'} u(a' - r a) + \\beta E[V(a')]\$"
"This is an equation: $V(a) = \\max_{a'} u(a' - r a) + \\beta E[V(a')]$"
> mystast = unified().use(mystParser).parse(txt)
{
  type: 'root',
  children: [ { type: 'paragraph', position: [Object], children: [Array] } ]
}
> mystast.children[0].children[1]
{
  type: 'inlineMath',
  value: "V(a) = \\max_{a'} u(a' - r a) + \\beta E[V(a')]",
  position: { start: { line: 1, column: 1 }, end: { line: 1, column: 1 } }
}
> mystast2 = unified().use(mathPlugin).runSync(mystast)
{
  type: 'root',
  children: [ { type: 'paragraph', position: [Object], children: [Array] } ]
}
> mystast2.children[0].children[1]
{
  type: 'inlineMath',
  value: "V(a) = \\max_{a'} u(a' - r a) + \\beta E[V(a')]",
  position: { start: { line: 1, column: 1 }, end: { line: 1, column: 1 } },
  html: '<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>V</mi><mo stretchy="false">(</mo><mi>a</mi><mo stretchy="false">)</mo><mo>=</mo><msub><mrow><mi>max</mi><mo>⁡</mo></mrow><msup><mi>a</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msup></msub><mi>u</mi><mo stretchy="false">(</mo><msup><mi>a</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msup><mo>−</mo><mi>r</mi><mi>a</mi><mo stretchy="false">)</mo><mo>+</mo><mi>β</mi><mi>E</mi><mo stretchy="false">[</mo><mi>V</mi><mo stretchy="false">(</mo><msup><mi>a</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msup><mo stretchy="false">)</mo><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">V(a) = \\max_{a&#x27;} u(a&#x27; - r a) + \\beta E[V(a&#x27;)]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.22222em;">V</span><span class="mopen">(</span><span class="mord mathnormal">a</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0019em;vertical-align:-0.25em;"></span><span class="mop"><span class="mop">max</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.328em;"><span style="top:-2.55em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">a</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6828em;"><span 
style="top:-2.786em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">u</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">a</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7519em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">r</span><span class="mord mathnormal">a</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0019em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">βE</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.22222em;">V</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">a</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7519em;"><span 
style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span><span class="mclose">)]</span></span></span></span>'
}
> hast = unified().use(mystToHast).runSync(mystast2)
{
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [Array],
      position: [Object]
    }
  ]
}
> hast.children[0].children[1]
{
  type: 'element',
  tagName: 'span',
  properties: { class: 'math inline' },
  children: [
    {
      type: 'text',
      value: "V(a) = \\max_{a'} u(a' - r a) + \\beta E[V(a')]"
    }
  ],
  position: {
    start: { line: 1, column: 1, offset: null },
    end: { line: 1, column: 1, offset: null }
  }
}

Note that when we first get mystast we have an inlineMath node. After running mathPlugin the node gains the .html attribute, but after running mystToHast that information is lost and we are left with a single span containing the raw LaTeX.

sglyon commented 9 months ago

I think that the issue might be fixed by modifying the math and inlineMath handlers here: https://github.com/executablebooks/mystmd/blob/2be01f12907dacc37c326949c2f2e7b8e95a1701/packages/myst-to-html/src/schema.ts#L58-L71

I'm thinking we could modify them to check for the presence of an html attribute on the node. I'm not sure how to cram raw HTML into the AST though...
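
For illustration only, here is a rough sketch of what such a check might look like. `mathToHast` is a hypothetical helper, not the actual handler in myst-to-html, and the `div` tag for display math is an assumption. It relies on the `raw` node type that hast serializers such as rehype-stringify emit only when `allowDangerousHtml: true` is set.

```javascript
// Hypothetical helper (not part of myst-to-html): builds the hast node for a
// `math` / `inlineMath` node, preferring pre-rendered HTML when it is present.
function mathToHast(node) {
  const inline = node.type === 'inlineMath';
  const element = {
    type: 'element',
    // Assumption: display math is wrapped in a <div>, inline math in a <span>.
    tagName: inline ? 'span' : 'div',
    properties: { class: inline ? 'math inline' : 'math block' },
    children: [],
  };
  if (typeof node.html === 'string' && node.html.length > 0) {
    // `raw` carries an HTML string through the tree untouched. It is only
    // written to the output when the serializer allows dangerous HTML.
    element.children.push({ type: 'raw', value: node.html });
  } else {
    // No pre-rendered HTML: keep the LaTeX source as plain text.
    element.children.push({ type: 'text', value: node.value });
  }
  return element;
}
```

With something along these lines, the pipeline would presumably also need `.use(rehypeStringify, { allowDangerousHtml: true })` for the KaTeX markup to survive serialization.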

agoose77 commented 9 months ago

The rest of the MyST team have a better idea of what should happen here w.r.t myst-to-html.

But, you can also choose to render the math client-side, e.g. https://github.com/executablebooks/mystmd/issues/824

sglyon commented 9 months ago

Wow fast response here!

Thanks for the example. I am running everything client side in this project.

I think the issue illustrated above highlights problems with either how the mathPlugin from myst-transforms works (by adding an .html attribute to the node) or how myst-to-html works (by ignoring the presence of the .html attribute)

I'd love to see if we can get that resolved so we can have a smooth experience using the plugins/transforms.

agoose77 commented 9 months ago

@sglyon my point is rather that the mathPlugin itself just performs server-side rendering of the math. Obviously, in your case, "server" and "client" are the same thing, so the distinction is less obvious! :smile:

I'm deliberately not speaking to whether the HTML generated by the math transform should end up in the HTML rendered by myst-to-html; that's something that @rowanc1 can perhaps answer. But, you could also choose to create your own simple plugin that just engages KaTeX on the frontend rather than using SSR, e.g. https://github.com/executablebooks/mystmd/issues/824#issuecomment-1883058035

sglyon commented 9 months ago

Haha yes, I see what you are saying. I'm not using React or Next or anything like that. I'm in the old school (but becoming new again?? ;)) world of server-based web apps with bits of JS in the client to provide interactive features. So, my use of anything in the MyST ecosystem is running entirely in the browser. Because of this I've already invoked KaTeX in the browser (by running the mathPlugin), but that work is being thrown away.

Perhaps @rowanc1 could help validate if the intent behind the mathPlugin is to enable/handle rendering math in any of the supported outputs (e.g. html, typst, jats, etc.) or if users need to support math rendering on their own as in your example (which is very nice btw)

I'm guessing/hoping it is the former and that there is just a disconnect in how this is currently happening between mathPlugin and myst-to-html.

If users are expected to handle this on their own I'll definitely be taking your example as inspiration!

rowanc1 commented 9 months ago

Hi @sglyon - thanks for the question, sorry for the mismatch between these two plugins.

When I was working through the basic HTML rendering, the intention was to leave the LaTeX directly in a math node rather than convert it to HTML at that point. This is similar to how MathJax and KaTeX generally interact with the DOM: they get triggered after the render.

parse -> stringify -> trigger MathJax/KaTeX as "normal"

The last step would be something similar to:

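function onRender() {
  // myst-to-html renders math with the classes "math inline" and "math block"
  const elements = document.querySelectorAll('.math');
  for (const element of elements) {
    // the `inline` class marks inline math; anything else is display mode
    const displayMode = !element.classList.contains('inline');
    // using katex here; mathjax would follow the same pattern
    katex.render(element.textContent, element, { displayMode, throwOnError: false });
  }
}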

This seemed to be easier to integrate with plain html when I was thinking through it, but that might not be the case given the questions we have had around it recently. This is also how JupyterLab works.

In the server-side implementation, we pre-render the math because the client is very light and doesn't have a KaTeX/MathJax JavaScript dependency, so we just serve up the HTML directly and integrate the stylesheet.

Let us know if the potential solution above makes sense and/or works for you. If that is the case, then you wouldn't have to run the mathPlugin transform at all. Otherwise we can talk through how to change it -- maybe at the least renaming or adding documentation to that mathPlugin to say this..!

sglyon commented 9 months ago

OK thanks for clarifying @rowanc1

Would it be possible to adjust the css class names that get applied to math nodes inside myst-to-html to match the ones used in remark-math?

Right now myst-to-html is using math block for display math and math inline for inline math.

remark-math expects math-display for display math and math-inline for inline math. (ref https://github.com/remarkjs/remark-math/blob/e99b9d088709d743adf6a43551fd416d7e0014ed/packages/rehype-katex/lib/index.js)

If we make this change then we can use .use(mystToHast).use(rehypeKatex) to get math rendered entirely within the unified pipeline.
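
Until such a change lands, a small rehype-style plugin could do the renaming in user code. The sketch below is only an illustration: `renameMathClasses` is a hypothetical name, and the target classes `math-inline` / `math-display` are taken from the rehype-katex version linked above. It handles both the `class` string that myst-to-html emits and a `className` array.

```javascript
// Hypothetical plugin (not part of myst-to-html or rehype-katex): rewrites
// "math inline" / "math block" into the class names mentioned above.
function renameMathClasses() {
  const rename = new Map([
    ['inline', 'math-inline'],
    ['block', 'math-display'],
  ]);

  const visit = (node) => {
    if (node.type === 'element' && node.properties) {
      const raw = node.properties.className ?? node.properties.class ?? [];
      const classes = Array.isArray(raw)
        ? raw
        : String(raw).split(/\s+/).filter(Boolean);
      // Only touch elements that are marked as math.
      if (classes.includes('math')) {
        delete node.properties.class;
        node.properties.className = classes.map((c) => rename.get(c) ?? c);
      }
    }
    // Recurse into children, if any.
    for (const child of node.children ?? []) visit(child);
  };

  // unified transformer: mutates the tree in place.
  return (tree) => {
    visit(tree);
  };
}
```

The idea would be to slot it in as `.use(mystToHast).use(renameMathClasses).use(rehypeKatex)`; that combination is untested here and depends on the rehype-katex version in use.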

rowanc1 commented 9 months ago

That sounds like a really good change to make, thanks for looking into that. Happy to take a PR on this if you are game!