logseq / mldoc

Another Emacs Org-mode and Markdown parser.
GNU Affero General Public License v3.0

Improve math parsing #118

Open xxchan opened 2 years ago

xxchan commented 2 years ago

current situation

types:

syntax:

  1. $content$, TeX delimiters for inline math -> Latex_Fragment Inline
  2. $$content$$, TeX delimiters for displayed math
    • $$ appears at the start of a line -> Displayed_Math
    • otherwise -> Latex_Fragment Display
      • cannot be multi-line
  3. \( content \), LaTeX delimiters for inline math -> Latex_Fragment Inline
    • cannot be multi-line
  4. \[ content \], LaTeX delimiters for displayed math -> Latex_Fragment Display
    • cannot be multi-line
  5. \begin{env} content \end{env}, LaTeX environment -> Latex_Environment
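The mapping in the list above can be sketched as a small classifier. This is only an illustration in Python, not mldoc's OCaml implementation: the function name `classify_math` and the `at_line_start` flag are hypothetical, and the result strings simply reuse the type names from the list. The list does not say whether `$content$` may span lines, so the sketch leaves that case unrestricted.

```python
import re
from typing import Optional

# Matches \begin{env} ... \end{env} with the same environment name on both ends.
ENV = re.compile(r"\\begin\{([^}]+)\}.*\\end\{\1\}", re.DOTALL)


def classify_math(fragment: str, at_line_start: bool = False) -> Optional[str]:
    """Return the node type the list above assigns to an already-delimited fragment.

    Hypothetical helper: `at_line_start` says whether the opening `$$`
    appears at the start of a line.
    """
    multi_line = "\n" in fragment

    if ENV.fullmatch(fragment):
        return "Latex_Environment"

    if fragment.startswith("$$") and fragment.endswith("$$") and len(fragment) > 4:
        if at_line_start:
            return "Displayed_Math"
        # Mid-line $$...$$ cannot be multi-line.
        return None if multi_line else "Latex_Fragment Display"

    if fragment.startswith("\\[") and fragment.endswith("\\]"):
        return None if multi_line else "Latex_Fragment Display"

    if fragment.startswith("\\(") and fragment.endswith("\\)"):
        return None if multi_line else "Latex_Fragment Inline"

    if fragment.startswith("$") and fragment.endswith("$") and len(fragment) > 2:
        return "Latex_Fragment Inline"

    return None
```

For example, `classify_math("$$x$$", at_line_start=True)` gives `"Displayed_Math"`, while the same text in the middle of a line gives `"Latex_Fragment Display"`.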

My improvement ideas

i.e.,

$$a
b$$

and

$$a$$

are displayed math. But

$$a
$$b

and

$$a$$b

are equivalent to $a$b.

For reference (and also an alternative), Typora's behavior:

xxchan commented 2 years ago

By the way, should these be added to Logseq's docs?

garyo commented 2 years ago

I came across this because of seeing what I think is too-aggressive math parsing. In the US, we often need to write about finances, so:

- dinner was $41.56, and breakfast will be $12.00.

It would be great if logseq could somehow figure out that this should not be parsed as math -- maybe if the trailing (second) $ is immediately followed by digits? I know users can work around this by using \$ but that is not intuitive to me, and breaks the journaling "flow". What do you think?

xxchan commented 2 years ago

@garyo IIRC your example is already handled correctly in Logseq?

Because there is a space before the second $

garyo commented 2 years ago

Yes, you're right. I've gotten into the habit of prefixing all my $ with \ because it's hard to know exactly when it'll go into math mode, but if space before the "trailing" (every other) $ works, that is good to know! Here's a case that's unfortunately too close to math mode:

- 8 mo x $400 = $3200, vs $695+$850+$66x4x8=$3657

but if I add spaces, then it's OK (no math mode, the $ signs still show up):

8 mo x $400 = $3200, vs $695 + $850+ $66x4x8 = $3657

Was this improved some time recently? I find a lot of \$ in my md files that I can now remove and they look OK. So I guess this is less important now than I thought.

One case that could still be improved is if there is emphasis around the dollar figure:

result is $1000 or, over a longer time, **$5000**

That currently still switches into math mode in Logseq.
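The cases in this thread are consistent with a simple whitespace rule, sketched below in Python. This is a guess at the heuristic, not mldoc's actual grammar: the name `triggers_inline_math` and the regex are assumptions, it only looks at single-`$` inline math on one line, and it treats `\$` as escaped. The assumed rule is that an opening `$` must be followed by a non-space character and a closing `$` must be preceded by one.

```python
import re

# Assumed rule (not taken from mldoc's source):
#   - an unescaped `$` followed by a non-whitespace character may open math
#   - a later `$` preceded by a character that is neither whitespace nor a
#     backslash may close it
INLINE_MATH = re.compile(r"(?<!\\)\$(?!\s)(.+?)(?<![\s\\])\$")


def triggers_inline_math(line: str) -> bool:
    """Return True if the assumed rule would find an inline math span in `line`."""
    return INLINE_MATH.search(line) is not None


examples = [
    "- dinner was $41.56, and breakfast will be $12.00.",    # False
    "- 8 mo x $400 = $3200, vs $695+$850+$66x4x8=$3657",     # True
    "8 mo x $400 = $3200, vs $695 + $850+ $66x4x8 = $3657",  # False
    "result is $1000 or, over a longer time, **$5000**",     # True
]
for text in examples:
    print(triggers_inline_math(text), text)
```

Under this rule the emphasis case still matches because the second `$` is preceded by `*`, not by a space, which is why it is hard to exclude without also breaking real math such as `$a*$`.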

xxchan commented 2 years ago

> Was this improved some time recently?

This is the behavior since the very beginning, but I proposed to change it in this issue. Now I think this behavior makes some sense.

> One case that could still be improved is if there is emphasis around the dollar figure

IMHO it's almost impossible to handle this case (or other similarly complicated ones) correctly without breaking math mode. In theory, we could add a config option to opt out of math mode (or maybe accept only double $$ for math mode, but that's a big breaking change). That would add some complexity, though, and would probably have rather low priority for Logseq. So in practice, I would suggest always adding a space or a \ before $ in your use case :)