mathjax / MathJax-node

MathJax for Node
Apache License 2.0

Typesetting and tex2jax question #416

Closed nick-xie closed 6 years ago

nick-xie commented 6 years ago

Hi, I am trying to use mathjax-node to implement server side typesetting of math notes. However, it seems that when passing in a string such as: "Hello, $\mathbb{N}$", the entire string is being typeset resulting in 'Hello' being italicized (as one would see with $Hello$) and the '$' signs remaining in the output.

After some troubleshooting, I came across a previous issue, #367, that sounds like the exact problem I'm having. From that thread, I gathered that this is the intended result and for my purposes, mathjax-node should only be used on the math text as designated by the delimiters and that another service such as mathjax-node-page should be used to parse and pull out the math text to be typeset.

I'm wondering then, what the purpose of the tex2jax property in the MathJax option in the config for mjAPI is if

"this kind of configuration of tex2jax will be ignored as the input is passed without delimiters anyway"

As well, I'm still a bit lost on exactly how to use mathjax-node-page with mathjax-node to achieve my desired result so any guidance on this would be greatly appreciated. Thanks!!

pkra commented 6 years ago

As well, I'm still a bit lost on exactly how to use mathjax-node-page with mathjax-node to achieve my desired result so any guidance on this would be greatly appreciated.

To reply to the actual question: use mathjax-node-page if you need its more involved, full-document processing. I'd suggest checking that you really need its overhead; in my experience it's almost always more sensible to write a custom wrapper around mathjax-node.

I'm wondering then, what the purpose of the tex2jax property in the MathJax option in the config for mjAPI is

The confusion is very understandable and due to the history of MathJax and MathJax-node. MathJax (<3) was designed exclusively for client-side rendering (thus for full documents); its tex2jax pre-processor is needed to extract TeX strings (by delimiters) from general text flow (plus some tolerable HTML tags such as wbr).
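To illustrate the kind of delimiter extraction described above, here is a minimal sketch. It is not MathJax's tex2jax, only a simplified stand-in: the function name `splitMath` is hypothetical, and it assumes `$...$` marks inline math and `$$...$$` marks display math, with no support for escaped dollar signs, nesting, or HTML tags.

```javascript
// Simplified illustration of delimiter-based extraction (NOT tex2jax itself).
// Assumption: $...$ is inline math, $$...$$ is display math, no escaping.
function splitMath(text) {
  const segments = [];
  // Try display math first so "$$" is not read as two inline delimiters.
  const pattern = /\$\$([\s\S]+?)\$\$|\$([^$]+?)\$/g;
  let last = 0;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    // Plain text between the previous match and this one.
    if (match.index > last) {
      segments.push({ type: 'text', value: text.slice(last, match.index) });
    }
    if (match[1] !== undefined) {
      segments.push({ type: 'math', format: 'TeX', value: match[1] });
    } else {
      segments.push({ type: 'math', format: 'inline-TeX', value: match[2] });
    }
    last = pattern.lastIndex;
  }
  // Trailing plain text after the last math segment.
  if (last < text.length) {
    segments.push({ type: 'text', value: text.slice(last) });
  }
  return segments;
}
```

Only the `math` segments, already stripped of their delimiters, would then be handed to mathjax-node; the `format` values mirror the 'TeX' and 'inline-TeX' formats used elsewhere in this thread.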

MathJax-node came years later to enable server-side rendering using NodeJS (note: MathJax is actually older than NodeJS). Some of MathJax's APIs do not make sense in a NodeJS setting, especially where only a single expression is processed. Conversely, some of MathJax's work cannot be (efficiently) replicated in a NodeJS setting.

Initially (0.x), MathJax-node offered two APIs, one for individual expressions (mj-single) and one for full documents (mj-page). At 1.0.0, MathJax-node was split into several packages: MathJax-node focused on the central use case, and other, community-owned modules could handle other use cases. This way, developers could pull in the parts they need with minimal overhead, and each library could develop individually, improving its APIs as needed by the community maintaining it.

I hope that sheds some light on it.

Note: MathJax v3 (currently in beta) changes everything again. It is a complete redesign and in particular works on NodeJS directly (making MathJax-node superfluous).

nick-xie commented 6 years ago

First of all, thanks so much for such an informative and timely response, it's much appreciated!

write a custom wrapper around mathjax-node

I was wondering if you know of any resources/examples of such a wrapper built around mathjax-node. I've been trying to see how mathjax-node-page could be used but am still a bit lost.

Regarding MathJax v3, do you know approximately when to expect a stable release? And with that release, whether or not a wrapper that works for mathjax-node would work with v3? I'm currently working on a project that has a soft deadline in the fall so trying to figure out my best approach.

Once again, thanks so much for the response!

pkra commented 6 years ago

I was wondering if you know of any resources/examples of such a wrapper built around mathjax-node.

What I meant was that whenever there's some structure in the content (as is often the case when people consider server-side rendering), you can use that structure instead of needing the full power of MathJax's tex2jax pre-processor. For example, if the TeX content is wrapped in elements with fixed class names, then something like the following could be sufficient.

```javascript
const fs = require('fs');
const mjnode = require('mathjax-node-sre');
mjnode.start();
const mj = mjnode.typeset;
const jsdom = require("jsdom");
const { JSDOM } = jsdom;

// read the input HTML file named by the first command-line argument
const input = fs.readFileSync(process.argv[2]).toString();

const dom = new JSDOM(input);
const document = dom.window.document;
for (let math of document.querySelectorAll(".math--block .math--inline")){
    mj({
        math: math.innerHTML, // strip delimiters if needed
        format: math.classList.contains('math--block') ? 'TeX' : 'inline-TeX',
        html: true
    }, function(result){
        // replace the TeX source with the rendered output
        math.innerHTML = result.html;
    });
}
// add the stylesheet (usually handled separately)
mj({
    css: true
}, function(result){
    const styles = document.createElement('style');
    styles.textContent = result.css;
    document.head.appendChild(styles);

    // write synchronously: fs.writeFile without a callback throws on current Node versions
    fs.writeFileSync(process.argv[3], dom.serialize());
});
```

pkra commented 6 years ago

Regarding MathJax v3, do you know approximately when to expect a stable release?

The team would have to comment on that.

And with that release, whether or not a wrapper that works for mathjax-node would work with v3?

Extremely unlikely. But it will be easier with v3.

nick-xie commented 6 years ago

Thanks again for the response, it's extremely helpful. I'm going to try to work with this and see if I can solve my issue in my project and I'll let you know how it goes!

pkra commented 6 years ago

I just realized I forgot to mention: there's now https://github.com/mathjax/mj3-demos-node as well (though I haven't played around with it yet).

nick-xie commented 6 years ago

Update: Just got it all working! Thanks so much for all your help, I definitely wouldn't have been able to do it without the code you posted above. I ended up using the svg type instead of html since including the css proved to be a little tricky in my particular setting.

For anyone else curious, I used the above code to end up with a JSDOM object that represented my desired text, and I then returned its serialize() value (which basically converts the document to a string). From there, my front end rendered the string itself. An example input would be something like <p>This is <span class="math--inline">\\mathbb{N}</span> some math</p>

There was some finicky promise stuff, since the typeset method is promise based and the for loop needs to finish before returning, so I used the Q npm package and its .all() method to resolve this.

Just a note, there's also a typo in the above code: document.querySelectorAll(".math--block .math--inline") should be document.querySelectorAll(".math--block, .math--inline"). The comma is needed because without it the selector only matches .math--inline elements nested inside a .math--block element, so elements with just one of the listed classes are skipped and nothing is returned.
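For readers who want to see the "wait for every typeset call before serializing" step spelled out, here is a rough sketch using the built-in Promise.all rather than Q. The helper name `typesetAll` and the `{ tex, block }` item shape are made up for illustration. The `typeset` argument is assumed to be a callback-style function shaped like mathjax-node's typeset(options, callback), and the `result.errors` check assumes errors arrive as an array of strings.

```javascript
// Sketch: run all typeset calls and resolve once every one has finished.
// `typeset` is passed in by the caller, e.g. require('mathjax-node').typeset
// (assumed callback style); `items` is a hypothetical [{ tex, block }] list.
function typesetAll(items, typeset) {
  const jobs = items.map((item) => new Promise((resolve, reject) => {
    typeset({
      math: item.tex,
      format: item.block ? 'TeX' : 'inline-TeX',
      svg: true // svg output, as used in this thread, instead of html plus css
    }, (result) => {
      if (result.errors) {
        // assumption: errors is an array of message strings
        reject(new Error(result.errors.join('; ')));
      } else {
        resolve(result.svg);
      }
    });
  }));
  // Resolves with the outputs in the same order as `items`.
  return Promise.all(jobs);
}
```

In the wrapper above, one would collect the matched elements into `items`, await `typesetAll(...)`, write each returned string back into the matching element, and only then call dom.serialize().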

Thanks again for all your help @pkra!!

pkra commented 6 years ago

Thanks for sharing. Good to hear you could work something out.