Open ljrk0 opened 2 years ago
I don't think getting the content between the $ signs will always work, as it can also be server-side-rendered. Luckily it seems like both MathJax and KaTeX (also) support the <math> tag.
So a math plugin would need to support both methods:
it will typically have a $\lambda$-expression as argument.
<mjx-assistive-mml unselectable="on" display="inline">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>λ</mi>
</math>
</mjx-assistive-mml>
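To make the `<math>` side of this concrete, here is a minimal sketch in Python (not the library's own code) that pulls the text content out of every `<math>` element in a snippet. The names `MathTagCollector` and `collect_math` are made up for illustration, and it only gathers the raw text; it makes no attempt to convert MathML into TeX.

```python
from html.parser import HTMLParser


class MathTagCollector(HTMLParser):
    """Collects the text content of every <math> element in an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while we are inside a <math> element
        self.current = []   # text pieces of the <math> currently being read
        self.formulas = []  # one string per finished <math> element

    def handle_starttag(self, tag, attrs):
        if tag == "math":
            self.depth += 1
            if self.depth == 1:
                self.current = []

    def handle_endtag(self, tag):
        if tag == "math" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.formulas.append("".join(self.current))

    def handle_data(self, data):
        # Ignore indentation and newlines between the MathML tags.
        if self.depth > 0:
            self.current.append(data.strip())


def collect_math(html):
    """Hypothetical helper: return the text of each <math> element in `html`."""
    parser = MathTagCollector()
    parser.feed(html)
    parser.close()
    return parser.formulas
```

For the MathJax snippet above this yields a single entry containing `λ`. A real plugin would still have to map MathML elements such as `<mfrac>` or `<msup>` back to some Markdown-friendly notation, which is the laborious part.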
I won't add this plugin anytime soon, as it would be a lot of work. But this plugin should exist! Ideally maintained by someone better at math than me 😅
I'm planning a v2 of the library. Maybe I will add it then...
You could already help by collecting various snippets from websites you encounter. This should cover a variety of uses (e.g. client-side-rendering, server-side-rendering, different libraries, content that looks like math but is NOT, ...)
See this file as an example. It follows this pattern:
<!-- https://example.com/page1 -->
<div>snippet 1</div>
<hr />
<!-- https://example.com/page1 -->
<p>snippet 2</p>
<hr />
...
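If it helps when collecting, a snippet file in this pattern can be split back into (URL, snippet) pairs with a few lines of Python. This is only a sketch under the assumption that every entry starts with a single URL comment and that entries are separated by `<hr />`; `parse_snippets` is a made-up name, not something the library provides.

```python
import re

# A leading comment of the form <!-- https://example.com/page -->
URL_COMMENT = re.compile(r"^\s*<!--\s*(\S+)\s*-->\s*")


def parse_snippets(text):
    """Split a snippet file into a list of (source_url, html_snippet) tuples."""
    entries = []
    for chunk in re.split(r"<hr\s*/?>", text):
        chunk = chunk.strip()
        if not chunk:
            continue  # e.g. trailing whitespace after the last <hr />
        match = URL_COMMENT.match(chunk)
        if match:
            entries.append((match.group(1), chunk[match.end():].strip()))
        else:
            entries.append((None, chunk))  # entry without a source comment
    return entries
```

One caveat: a collected snippet that itself contains an `<hr>` element would be split in two, so such snippets would need a different separator.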
Thanks for implementing #49 so quickly!
Yeah, MathJax supports LaTeX-style input, MathML, as well as AsciiMath. Converting MathML to Markdown, however, is probably quite a lot of work. Simply "passing through" dollar signs, if so configured in the scripts, may work well enough for most use cases though?
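As a rough illustration of what "passing through" could mean, the Python sketch below escapes a handful of Markdown special characters everywhere except inside `$...$` and `$$...$$` spans. Both the function name `escape_keep_math` and the set of escaped characters are assumptions chosen for the example; the library's actual escaping rules are not reproduced here.

```python
import re

# $$...$$ (display, may span lines) or $...$ (inline, single line).
MATH_SPAN = re.compile(r"(\$\$.+?\$\$|\$[^$\n]+\$)", re.DOTALL)

# A small, illustrative subset of characters that Markdown treats specially.
MARKDOWN_SPECIALS = re.compile(r"([\\*_`\[\]])")


def escape_keep_math(text):
    """Escape Markdown specials outside math spans, leave math untouched."""
    # Because the pattern has one capturing group, re.split returns the
    # math spans at the odd indexes and the surrounding text at even ones.
    parts = MATH_SPAN.split(text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # math span: pass through as-is
        else:
            out.append(MARKDOWN_SPECIALS.sub(r"\\\1", part))
    return "".join(out)
```

The obvious weakness is text that merely looks like math: a sentence such as "costs $5 and $6" would be treated as one inline span, which is why examples of "content that looks like math but is NOT" are worth collecting.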
I've just noticed that pandoc can do just the thing:
pandoc --from=html+tex_math_dollars+tex_math_single_backslash+tex_math_double_backslash \
--to=markdown \
--output=foo.md \
input.html
You can also choose --to=html to convert, e.g., $\lambda. \dots$ to:
<span class="math inline"><em>λ</em><em>i</em>.…</span>
This works well enough for my use cases for now. Adding real $ support is quite tricky, especially when it comes to finding the closing delimiter etc.
Regardless, I will collect examples I stumble upon :)
Describe the bug
MathJax is a JavaScript library that allows adding "custom tags" such as $...$ to HTML, which are then turned into e.g. MathML or whatever the browser supports. Depending on the Markdown implementation, math is either not supported at all or supported directly through the same syntax. Either way, it'd probably make the most sense to simply keep $...$ expressions intact and not escape the strings contained therein.
While a simple filter for that would certainly work, MathJax allows configuring delimiters other than $...$ for inline math and $$...$$ for display math. Detecting those would necessitate parsing JS though ...
Example from the article https://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/:
HTML Input
Generated Markdown
Expected Markdown
Additional context
This filter (or "unfilter") could be activated only if MathJax is detected, and disabled otherwise. Further, as mentioned earlier, more sophisticated parsing of the HTML could be used to detect the precise math "tags" in use, or at the least to make them configurable.
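A possible detection heuristic, sketched in Python: treat a page as using MathJax if it loads a script whose URL mentions "mathjax" or if it already contains `mjx-` elements like the `<mjx-assistive-mml>` one quoted in this thread. Both checks are assumptions, `uses_mathjax` is a made-up name, and neither check inspects the configured delimiters.

```python
from html.parser import HTMLParser


class MathJaxDetector(HTMLParser):
    """Sets `found` when the page looks like it uses MathJax (heuristic)."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src") or ""
        if tag == "script" and "mathjax" in src.lower():
            self.found = True  # client-side rendering: the library is loaded
        elif tag.startswith("mjx-"):
            self.found = True  # already rendered output, e.g. server-side


def uses_mathjax(html):
    """Hypothetical helper: True if `html` appears to rely on MathJax."""
    detector = MathJaxDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```

A self-hosted or bundled copy of MathJax under a different file name would slip past the script check, so a manual configuration switch would still be useful.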