jgm / djot

A light markup language
https://djot.net
MIT License

Math using $ $ delimiter? #160

Open chaoxu opened 1 year ago

chaoxu commented 1 year ago

Is it possible to support $ $, $$ $$, \( \), and \[ \], just like pandoc markdown? As an extension of some sort?

I believe linear time parsing is still possible even if they are enabled. I just sampled one of my documents: there are 629 equations, and more than 2% of the characters are $. I think this is true for many mathematical documents: math appears more often than every other construction (*, _, etc.) combined! So optimizing for math input is crucial for my use case.

I do switch between writing markdown and latex a lot, so being able to use $ $ in djot would avoid having two sets of muscle memory.

jgm commented 1 year ago

As you probably know, djot supports math, but in a more principled way:

$`x=y^2`

Why not just $x=y^2$? Well, it turns out to be terribly difficult to get this right without writing a complete TeX tokenizer. For example, TeX math can contain \text{..}, which can itself contain $-delimited TeX math, e.g. $x = \text{my $y$}$. Last time I checked, this broke GitHub's implementation of the feature. Maybe they've fixed it; let's try again: $x = \text{my $y$}$. [Edit: guess not!]
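To make the failure concrete, here is a minimal sketch in Python. The "naive rule" (math is the shortest run between two `$` characters) is an assumption chosen for illustration; it is not GitHub's actual implementation. It shows how the nested `$y$` splits the formula in the wrong places.

```python
import re

# Hypothetical naive rule: inline math is the shortest run between two '$'.
NAIVE_INLINE_MATH = re.compile(r"\$(.+?)\$")

source = r"$x = \text{my $y$}$"

# The first '$' pairs with the '$' that opens the nested math,
# so the formula is cut into two meaningless fragments.
spans = NAIVE_INLINE_MATH.findall(source)
print(spans)
```

Neither fragment is valid TeX on its own, which is why a delimiter that cannot occur unescaped inside the content is easier to parse reliably.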

jgm commented 1 year ago

As for \(, \[, those have clearly defined meanings as escaped ( and [, so we can't use those for math.

chaoxu commented 1 year ago

> As you probably know, djot supports math, but in a more principled way:
>
> $`x=y^2`
>
> Why not just $x=y^2$? Well, it turns out to be terribly difficult to get this right without writing a complete TeX tokenizer. For example, TeX math can contain \text{..}, which can itself contain $-delimited TeX math, e.g. $x = \text{my $y$}$. Last time I checked, this broke GitHub's implementation of the feature. Maybe they've fixed it; let's try again: $x = \text{my $y$}$. [Edit: guess not!]

Oh I see, indeed a problem I never thought of. How is pandoc handling it, having a complete TeX tokenizer?

Maybe just require escaping it as \$ when it appears in the middle?

Since even for $`math`, we have to escape ` when ` appears.

jgm commented 1 year ago

> How is pandoc handling it, having a complete TeX tokenizer?

I think we skip content between matched {} or something, I can't remember now, and I don't think our current method is 100% accurate either.
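The brace-skipping idea can be sketched as follows. This is an illustrative Python sketch only, not pandoc's actual code; the function name `find_closing_dollar` and its exact rules (skip backslash-escaped characters, ignore `$` while inside `{}`) are assumptions.

```python
def find_closing_dollar(text: str, start: int) -> int:
    """Return the index of the '$' closing math whose content begins at
    `start`, ignoring any '$' nested inside {...}. Returns -1 if none."""
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)  # tolerate a stray closing brace
        elif ch == "$" and depth == 0:
            return i
        i += 1
    return -1


source = r"$x = \text{my $y$}$"
end = find_closing_dollar(source, 1)
print(source[1:end])
```

On the example from earlier in the thread this recovers the whole formula, but it is only a heuristic: TeX input whose braces are not balanced in this simple sense would still confuse it.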

> Since even for $`math`, we have to escape ` when ` appears.

Well no. It works like regular verbatim backticks. There are no backslash escapes. Instead, if your content contains a `, you can use $`` ... `` as delimiters.
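As a sketch of how a tool might pick delimiters automatically: choose a backtick run one longer than the longest run inside the content. The helper name `wrap_inline_math` is hypothetical, and the space-padding step assumes math follows the same rule as ordinary verbatim spans, where a single space next to the delimiters is stripped if the content starts or ends with a backtick.

```python
import re


def wrap_inline_math(content: str) -> str:
    """Wrap TeX source in $`...` delimiters that cannot clash with it."""
    # Longest run of consecutive backticks inside the content (0 if none).
    longest = max((len(run) for run in re.findall(r"`+", content)), default=0)
    fence = "`" * (longest + 1)
    # Assumption: as with verbatim spans, pad with a space so a leading or
    # trailing backtick in the content is not read as part of the fence.
    if content.startswith("`") or content.endswith("`"):
        content = f" {content} "
    return f"${fence}{content}{fence}"


print(wrap_inline_math("x=y^2"))
print(wrap_inline_math("a`b"))
```

No escaping pass over the content is needed; only the fence length changes.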

andersk commented 1 year ago

Having spent quite a while thinking about how to integrate math into CommonMark because some Zulip users really want to use $…$, I came to the same conclusion as @jgm: the only reasonable syntaxes that can be parsed unambiguously are $`…` (like Djot) and $`…`$ (like GitLab).

I wonder, though, if there might be room for a hybrid rule where $`…` always works and $…$ only works for “easy cases”, by loose analogy to Djot’s unambiguous {_…_} vs. restricted _…_ for emphasis? Maybe it suffices most of the time to have $…$ follow the same precedence rules and whitespace restrictions as _…_, and when it doesn’t, the user can add the backticks for disambiguation.

tmke8 commented 1 year ago

Somewhat related: shouldn't display-mode math be written with block syntax? So

$```
\int_0^\infty e^{-x}\mathrm{d}x=1
```

instead of $$`...`?

After all, if you write this in latex:

This is some text. \[\int_0^\infty e^{-x}\mathrm{d}x=1\]. Here the text continues.

then it gets displayed as a block: image

So, we may as well use block syntax. Or is there some use case where you wouldn't want to write display-mode math as a block?

jgm commented 1 year ago

I think a case could be made for using the block syntax, but this would make display math a block element instead of an inline element. In LaTeX display math is inline, in the sense that it can come in the middle of a paragraph -- though of course it displays set off in a block. You do sometimes see things like

In the equation $$e=mc^2$$ Einstein showed...

and this will render fine in LaTeX.

Conaclos commented 1 year ago

I quite like @andersk's idea of a restricted mode.

I could add two things:

```math
x^n + y^n = z^z
```

jgm commented 1 year ago

Compatibility with existing markdown variants isn't a goal of this project.

bpj commented 1 year ago

> Compatibility with existing markdown variants isn't a goal of this project.

But compatibility with LaTeX display math is? I'd say if it displays as a block then it is a block.

If it looks like a block in djot that is, never mind what crazy embedding stuff you can do in TeX, in case that wasn't clear.

jgm commented 1 year ago

@bpj I don't understand what you are getting at here. It is currently parsed as a block -- a code block. That's not at issue. At issue is whether it should be treated as a paragraph containing math rather than a code block.

Conaclos commented 1 year ago

I think that @bpj means that if it is a code block, then the syntax of a code block should be used?

bpj commented 1 year ago

> I think that @bpj means that if it is a code block, then the syntax of a code block should be used?

Exactly! Sorry for the confusion!