JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Markdown: $$ for block math needs to be on the same line #37334

Open fonsp opened 4 years ago

fonsp commented 4 years ago

Using $$ (double dollar sign) for block LaTeX only works if the equation starts on the same line.

Other Markdown interpreters, notably Jupyter (demo) and pandoc (demo), allow the equation to start on the next line.
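A minimal sketch of the two cases, assuming only the Markdown standard library. The variable names are illustrative, and no output is shown because the result of the next-line case may depend on the Julia version:

```julia
using Markdown

# Equation starts on the same line as the opening $$.
same_line = Markdown.parse(raw"$$\sqrt{1}$$")

# Equation starts on the line after the opening $$.
next_line = Markdown.parse("\$\$\n\\sqrt{1}\n\$\$\n")

# Print the type of every top-level node in each parsed document.
for (label, doc) in ("same line" => same_line, "next line" => next_line)
    println(label, ": ", [typeof(node) for node in doc.content])
end
```

If the next-line form is handled as block math, both lines should list a single Markdown.LaTeX node.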

fredrikekre commented 4 years ago

Just FYI, it tends to behave better with ```math instead

julia> md"""
       Hello
       ```math
       no math?
       ```
       """

fredrikekre commented 4 years ago

This seems to be the same as CommonMark.jl implements: https://github.com/MichaelHatherly/CommonMark.jl#math

MichaelHatherly commented 4 years ago

> This seems to be the same as CommonMark.jl implements: https://github.com/MichaelHatherly/CommonMark.jl#math

Yeah, I based CommonMark's on Markdown's, hence the similarities. My feeling is that $ math should just be used where compatibility with external programs, such as Jupyter, is required. Backtick math tends to fail more gracefully when a parser doesn't support maths, as opposed to $ math.
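For reference, a small sketch of the two backtick forms in the Markdown standard library: a fenced math block and double-backtick inline math. The fence is built with string repetition only to avoid nesting literal triple backticks inside this example.

```julia
using Markdown

fence = "`"^3  # three backticks

# Fenced math block: parsed as a top-level LaTeX node.
block = Markdown.parse("$(fence)math\nf(x) = x^2\n$(fence)\n")

# Double backticks: parsed as a LaTeX node inside a paragraph.
inline = Markdown.parse("The map ``f(x) = x^2`` is smooth.")

println(typeof(block.content[1]))
println([typeof(x) for x in inline.content[1].content])
```

A parser without math support would typically treat both forms as ordinary code, which is the graceful failure mentioned above.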

fonsp commented 4 years ago

Not sure I follow, but I found that $$ (or math in general) is not part of the CommonMark spec, but it looks like CommonMark ignores single line breaks, so

Hello
world

$$
\sqrt{1}
$$

(link on pandoc)

is the same as

Hello world

$$\sqrt{1}$$

(link on pandoc)

which means that MathJax (a popular LaTeX-inside-html renderer) will recognize it as a block equation.

So depending on your strictness, CommonMark does not support $$ because it does not support math, or you could say that it does support it, if the HTML output is passed through MathJax.

MichaelHatherly commented 4 years ago

> Not sure I follow, but I found that $$ (or math in general) is not part of the CommonMark spec, but it looks like CommonMark ignores single line breaks, so

Yeah, that's right. The spec does not dictate math syntax. Parsers can implement whatever they want, but if they truly support writing math, they should be outputting a representation that includes markup on the math, rather than just giving back $-wrapped text:

julia> using CommonMark, Markdown

julia> single_line = raw"$$\sqrt{1}$$";

julia> multi_line =
       raw"""
       $$
       \sqrt{1}
       $$
       """;

julia> no_maths = Parser();

julia> with_maths = enable!(Parser(), DollarMathRule());

When the parser doesn't handle $s, then it should pass through with no effect (unless the dollar-wrapped text contains parseable markdown):

julia> html(stdout, no_maths(single_line)); html(stdout, no_maths(multi_line));
<p>$$\sqrt{1}$$</p>
<p>$$
\sqrt{1}
$$</p>

When it does handle maths, it should really be giving it actual meaning in the output, rather than just printing out more $s:

julia> html(stdout, with_maths(single_line))
<div class="display-math">\[\sqrt{1}\]</div>

julia> html(stdout, with_maths(multi_line))
<div class="display-math">\[\sqrt{1}\]</div>

(Turns out I did implement multiline syntax, totally forgot that.) You'll find that this matches pretty closely the output you get from something like pandoc.

The current parser, ignoring the escapes, doesn't give the output any meaning, just returning $ wrapped text:

julia> Markdown.html(stdout, Markdown.parse(single_line))
&#36;\sqrt&#123;1&#125;&#36;

julia> Markdown.html(stdout, Markdown.parse(multi_line))
:&#36;
<p>\sqrt&#123;1&#125; &#36;</p>

When I said

> My feeling is that $ math should just be used where compatibility with external programs, such as Jupyter, is required. Backtick math tends to fail more gracefully when a parser doesn't support maths, as opposed to $ math.

What I mean is that if a Julia program is reading in a Jupyter notebook, it should be able to parse $ syntax so that the notebook can still be read correctly by other programs. But if it's just markdown that's only ever going to be read by Julia itself, then backtick syntax should be used, since it doesn't conflict with interpolation syntax in strings, and backtick math also tends to still look alright if it's parsed as normal "code".
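The interpolation conflict can be sketched with the md string macro from the standard library. The variable `x` is purely illustrative:

```julia
using Markdown

x = 3

# In an md string, `$x` is Julia interpolation, so the value of x is inserted.
interpolated = md"x is $x"

# Double backticks are not interpolated; they produce a LaTeX node instead.
as_math = md"x is ``x``"

println(Markdown.plain(interpolated))
println(typeof(as_math.content[1].content[end]))
```

With $ math, the same character has to serve both as the interpolation marker and as the math delimiter, which is where the ambiguity comes from.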

fonsp commented 4 years ago

Thanks, this is helpful!

One thing is that Markdown does give meaning to $ as a LaTeX equation:

julia> Markdown.parse(single_line) |> dump
Markdown.MD
  content: Array{Any}((1,))
    1: Markdown.LaTeX
      formula: String "\\sqrt{1}"
  ...

but the default Markdown.html methods turn it into (HTML escaped) $formula$:

julia> Markdown.html(stdout, Markdown.LaTeX("formula"))
&#36;formula&#36;

probably with MathJax in mind.
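Since the parsed tree already carries a Markdown.LaTeX node, a renderer could emit explicit math markup from it. A rough sketch follows; `display_math_html` and `inline_math_html` are hypothetical helpers, not part of the Markdown standard library, and they do no HTML escaping. The class names simply mirror the CommonMark.jl output quoted earlier in the thread.

```julia
using Markdown

# Hypothetical helpers (not stdlib API): wrap a formula in explicit math markup.
display_math_html(tex::Markdown.LaTeX) =
    string("<div class=\"display-math\">\\[", tex.formula, "\\]</div>")
inline_math_html(tex::Markdown.LaTeX) =
    string("<span class=\"math\">\\(", tex.formula, "\\)</span>")

doc = Markdown.parse(raw"$$\sqrt{1}$$")
println(display_math_html(doc.content[1]))
```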

fonsp commented 4 years ago

Side note: I found that $ and $$ are synonymous; the difference between inline and block math is determined by whether the Markdown.LaTeX is inside a Markdown.Paragraph (inline) or not (block).

julia> md"$hello$" == md"$$hello$$"
true

julia> md"hello $world$" == md"hello $$world$$"
true
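The same distinction can be made visible by walking the parsed tree. In this sketch, `math_kinds` is a hypothetical helper that labels top-level LaTeX nodes as display math and LaTeX nodes inside a paragraph as inline math; it only looks one level deep.

```julia
using Markdown

# Hypothetical helper (not stdlib API): classify LaTeX nodes by their position.
function math_kinds(doc::Markdown.MD)
    kinds = Pair{Symbol,String}[]
    for node in doc.content
        if node isa Markdown.LaTeX
            # A LaTeX node directly in the document is treated as block math.
            push!(kinds, :display => node.formula)
        elseif node isa Markdown.Paragraph
            # A LaTeX node inside a paragraph is treated as inline math.
            for child in node.content
                child isa Markdown.LaTeX && push!(kinds, :inline => child.formula)
            end
        end
    end
    return kinds
end

println(math_kinds(Markdown.parse(raw"$hello$")))
println(math_kinds(Markdown.parse(raw"hello $world$")))
```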

MichaelHatherly commented 4 years ago

> One thing is that Markdown does give meaning to $ as a LaTeX equation:

Yeah, it's all good internally, just gets unnecessarily lost in the HTML output.

> probably with MathJax in mind.

Correct.

> Side note: I found that $ and $$ are synonymous

So for the input

$hello$

$$hello$$

hello $world$.

hello $$world$$.

pandoc gives

<p><span class="math inline">\(hello\)</span></p>
<p><span class="math display">\[hello\]</span></p>
<p>hello <span class="math inline">\(world\)</span>.</p>
<p>hello <span class="math display">\[world\]</span>.</p>

since, for pandoc, $hello$ !== $$hello$$.

I'd argue that pandoc's insertion of display maths inside a paragraph is a bit odd, and also that wrapping a paragraph around the display math is probably not what should happen. CommonMark.jl gives

julia> html(stdout, with_maths(raw"""
       $hello$

       $$hello$$

       hello $world$.

       hello $$world$$.
       """))
<p><span class="math">\(hello\)</span></p>
<div class="display-math">\[hello\]</div>
<p>hello <span class="math">\(world\)</span>.</p>
<p>hello $$world$$.</p>

mortenpi commented 3 years ago

> I found that $ and $$ are synonymous

This leads to potentially unintuitive behavior in lists (https://github.com/JuliaDocs/Documenter.jl/issues/1483), where you end up with display equations even though you probably want inline equations (e.g. 1 and 2 look inline, but get interpreted as display equations):

julia> md"""
       1. $x^2$
       2. $x^2$ X
       3. X $x^2$
       4. ``x^2``
       5. ```math
          x^2
          ```
       """.content[1].items
5-element Array{Any,1}:
 Any[$x^2$]
 Any[$x^2$, Markdown.Paragraph(Any["X"])]
 Any[Markdown.Paragraph(Any["X ", $x^2$])]
 Any[Markdown.Paragraph(Any[$x^2$])]
 Any[$x^2$]
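A short sketch of how one might check list items for this, based on the structure shown in the output above (a bare LaTeX node for display math, a Paragraph wrapping the LaTeX node for inline math). It uses Markdown.parse instead of the md macro so the $ signs can be written in an ordinary string:

```julia
using Markdown

# Item 1 uses $...$ alone on the line; item 2 uses double backticks.
list = Markdown.parse("1. \$x^2\$\n2. ``x^2``\n").content[1]

for (i, item) in enumerate(list.items)
    # A bare LaTeX node as the first element means the item is display math.
    kind = first(item) isa Markdown.LaTeX ? "display" : "inline (inside a Paragraph)"
    println(i, ": ", kind)
end
```

Judging by the output above, writing inline math in list items with double backticks avoids the unintended display equation.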

torfjelde commented 3 years ago

Just to add to this discussion: a result of all this is that Markdown will now output a markdown format it cannot parse itself, e.g.

julia> s = raw"""
       ```math
       f(x) = x^2
       ```
       """
"```math\nf(x) = x^2\n```\n"

julia> Markdown.plain(Markdown.parse(s))
"\$\$\nf(x) = x^2\n\$\$\n"

julia> Markdown.parse(Markdown.plain(Markdown.parse(s)))
  :$

  f(x) = x^2 :$

Even if you change the parser to use, say, `:github`, it still doesn't preserve the format:

julia> s = raw"""
       ```math
       \begin{equation}
       f(x) = x^2
       \end{equation}
       ```
       """
"```math\n\\begin{equation}\nf(x) = x^2\n\\end{equation}\n```\n"

julia> Markdown.parse(Markdown.plain(Markdown.parse(s; flavor = :github)); flavor = :github)
  $$ \begin{equation} f(x) = x^2 \end{equation} $$

This seems a bit strange to me. Is it intentional?

MichaelHatherly commented 3 years ago

From what I recall we never really intended Markdown.plain to produce a "roundtripable" output that could be fed back into the parser even though in many simple cases it does an alright job of it. (The CommonMark.markdown output on the other hand does roundtrip without loss of document structure, and is intended to remain that way.)
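Whether a particular document survives the Markdown.plain round trip can be checked directly. In this sketch, `roundtrips` is a hypothetical helper, not a stdlib function, and it relies on the `==` comparison between parsed documents that is used earlier in the thread. The result for the math block is deliberately not stated, since it may differ between Julia versions:

```julia
using Markdown

# Hypothetical helper: does `s` survive parse -> plain -> parse unchanged?
roundtrips(s) = Markdown.parse(Markdown.plain(Markdown.parse(s))) == Markdown.parse(s)

fence = "`"^3  # three backticks, built this way to avoid nesting fences here

println(roundtrips("Hello world\n"))
println(roundtrips("$(fence)math\nf(x) = x^2\n$(fence)\n"))
```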