yzhang-gh / vscode-markdown

Markdown All in One
https://marketplace.visualstudio.com/items?itemName=yzhang.markdown-all-in-one
MIT License
2.93k stars, 325 forks

LaTeX is not preserved in TOC text #903

Open ComFreek opened 3 years ago

ComFreek commented 3 years ago

Problem

TOC entries for headings that contain inline LaTeX are generated incorrectly. The entry

1. [$\\mathrm{gcd}(a, b)$](#mathrmgcda-b)

is generated for the heading

# $\mathrm{gcd}(a, b)$

The backslash is doubled; the entry should have been 1. [$\mathrm{gcd}(a, b)$](#mathrmgcda-b). I am using Markdown All in One configured with "markdown.extension.math.enabled": false.

I am using the latest dev build as of 2021-02-17.

Lemmingh commented 3 years ago

This change is by design.

It would take too much effort to keep the result safe without aggressively escaping backslashes.

https://github.com/yzhang-gh/vscode-markdown/blob/c3595155f1d15943bd71effa42d959dd0bdbd7f9/src/toc.ts#L41-L45
https://github.com/yzhang-gh/vscode-markdown/blob/c3595155f1d15943bd71effa42d959dd0bdbd7f9/src/toc.ts#L447-L467

yzhang-gh commented 3 years ago

I guess we need to add a patch for this. In my experience, many users will be affected.

Lemmingh commented 3 years ago

Rechecked.

Unsolvable

Lemmingh commented 3 years ago

I understand the need to display pretty math in headings and the TOC, but I'm sorry: the behavior cannot be changed anymore, unless we take the risk of putting broken links into the TOC.


With #176, #194, #531, #540, #552, #570, #862, etc., we have eventually lost control of how users embed math in Markdown, and we have no reliable means of identifying a math area.

We have to perform CommonMark-only parsing when generating the visible link text for the TOC.

yzhang-gh commented 3 years ago

I mean we need to be consistent.

By a patch I mean, e.g., a regexp that replaces or "protects" the backslashes in $...$. That makes the solution less neat, but otherwise we will see more issues complaining about this after the v3.5.0 release.

BTW, do you have some examples where we must escape the \? I cannot think of any.
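
To make the idea concrete, here is a minimal sketch of such a "protecting" step. It assumes math is always written as a single-line `$...$` span, which may not hold for every math syntax users rely on, and `escapeOutsideMath` is a hypothetical name, not a function in the extension.

```typescript
// Hypothetical helper: escape backslashes in TOC link text,
// but leave anything inside single-dollar spans untouched.
function escapeOutsideMath(text: string): string {
  // Splitting on a capturing group keeps the matched $...$ spans in the
  // result, so odd indices are math spans and even indices are ordinary text.
  const parts = text.split(/(\$[^$\n]+\$)/);
  return parts
    .map((part, i) => (i % 2 === 1 ? part : part.replace(/\\/g, "\\\\")))
    .join("");
}
```

With this sketch, a heading text that consists only of `$\mathrm{gcd}(a, b)$` passes through unchanged, while backslashes outside the dollar signs are still doubled.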

Lemmingh commented 3 years ago

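The even-odd problem.

The backslash \ is itself the escape character, so whether the character after it is escaped depends on whether it is preceded by an odd or an even number of backslashes.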


https://github.com/yzhang-gh/vscode-markdown/blob/c3595155f1d15943bd71effa42d959dd0bdbd7f9/src/toc.ts#L467

Look at the characters escaped there. This is the minimal escaping set.

Their occurrences can change semantics and eventually break the link.

They have to be escaped. My comments in the code are clear enough.


A simple example:

If you're going to display

<a href="#uri">\[</a>

Then, you'll have to write

[\\\[](#uri)
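
The example above can be reproduced with a minimal sketch of this kind of escaping. `escapeLinkText` and its escaping set (the backslash and the square brackets) are assumptions for illustration, not the code at the toc.ts link above.

```typescript
// Hypothetical helper, not the extension's actual code: backslash-escape
// a small set of characters so the text is safe as CommonMark link text.
function escapeLinkText(text: string): string {
  // Assumed escaping set: the backslash itself and both square brackets.
  // "\\$&" inserts a literal backslash followed by the matched character.
  return text.replace(/[\\\[\]]/g, "\\$&");
}
```

With this sketch, the two-character text \[ becomes \\\[, matching the link text above, and $\mathrm{gcd}(a, b)$ becomes $\\mathrm{gcd}(a, b)$, which is the TOC text reported in this issue.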

Lemmingh commented 3 years ago

A math area is a special variant of code span/block, from the perspective of a parser.

Among the existing solutions, I think GitLab's is the wisest, and I use it in my daily work.

GitLab does not actually introduce new syntax; instead, it reuses the existing syntax in CommonMark. A math area on GitLab is exactly a code span/block.


On GitLab,

[[_TOC_]]

# $``[(a+b)!]^2``$

````math
\frac{1}{2}
````

gives you

````````html
<ul class="section-nav"><li><a href="#ab2">[(a+b)!]^2</a></li></ul>
<h1 data-sourcepos="3:1-3:18" dir="auto">
<a id="user-content-ab2" class="anchor" href="#ab2" aria-hidden="true"></a><span data-math-style="inline"><span class="katex"><span class="katex-mathml"><math><semantics><mrow><mo stretchy="false">[</mo><mo stretchy="false">(</mo><mi>a</mi><mo>+</mo><mi>b</mi><mo stretchy="false">)</mo><mo stretchy="false">!</mo><msup><mo stretchy="false">]</mo><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">[(a+b)!]^2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mopen">(</span><span class="mord mathdefault">a</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1.064108em;vertical-align:-0.25em;"></span><span class="mord mathdefault">b</span><span class="mclose">)</span><span class="mclose">!</span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141079999999999em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span></span></span></span>
</h1>
<span data-math-style="display"><span class="katex-display"><span class="katex"><span class="katex-mathml"><math><semantics><mrow><mfrac><mn>1</mn><mn>2</mn></mfrac></mrow><annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.00744em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.32144em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">2</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span></span></span>
````````

Lemmingh commented 3 years ago

> I prefer to at least leave $...$ untouched.

It is not possible.

# $[(a+b)!](c+d)^2$

With pandoc-syntax, this is a math area. Without, this is an inline link.
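
A rough way to see the ambiguity is to run a link-shaped pattern over the heading. The regex below is only a crude approximation of a CommonMark inline link (real parsers handle nesting, titles, and escapes), and the variable names are illustrative.

```typescript
// Crude approximation of a CommonMark inline link: [text](destination).
const inlineLink = /\[([^\]]*)\]\(([^)\s]*)\)/;

// The heading text from above, without the leading "# ".
const heading = "$[(a+b)!](c+d)^2$";
const match = inlineLink.exec(heading);

if (match) {
  // Without a math extension, the bracket/parenthesis pair reads as a link.
  console.log(`link text: ${match[1]}, destination: ${match[2]}`);
}
```

Under these assumptions, the pattern picks out `(a+b)!` as link text and `c+d` as the destination, even though the author meant the whole span as math.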

Lemmingh commented 3 years ago

https://github.com/yzhang-gh/vscode-markdown/blob/c3595155f1d15943bd71effa42d959dd0bdbd7f9/src/toc.ts#L41-L45

In my draft, it was

/**
 * The **single line plain text** representation of the rendering result (in CommonMark mode) of the heading.
 * This must be able to be safely put into a `[]` bracket pair as **link text** without breaking Markdown syntax.
 */
visibleText: string;

the same as GitLab's [[_TOC_]].

Later, I had a hard time in createLinkText() maintaining backward compatibility. I think the current implementation is the best we can do; it already offers the maximum backward compatibility.

yzhang-gh commented 3 years ago

I agree with you in theory (although I'm not very convinced by the \\\[ example which is a bit "unrealistic"...).

The problem then is that we can only encourage users to adopt the GitLab-style math syntax? (I'm not against it, but I'm not sure whether other users will like it.)


Let's keep the change and leave this issue open to collect more feedback.

coin8086 commented 1 year ago

It is now April 23, 2023. Is there any workaround for this problem? I'm using Markdown All in One v3.5.1 in VS Code. The generated TOC text is

[ABC $\\mathbf{A} \\mathbf{x}$](#abc-mathbfa-mathbfx)

for

### ABC $\mathbf{A} \mathbf{x}$

The escaped text in the TOC is ugly. I'd rather have the math removed from the TOC, like

[ABC](#abc)
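
Something along these lines could be applied by hand or by a script to a generated entry. This is only a sketch: `stripMathFromTocEntry` is a hypothetical name, not a feature of the extension, it assumes single-line `$...$` math, and it keeps the link target as generated on the assumption that the heading's anchor is still derived from the full heading text.

```typescript
// Hypothetical post-processing step, not part of the extension:
// drop single-dollar math spans from the visible text of one TOC entry.
function stripMathFromTocEntry(entry: string): string {
  // Expect an entry of the form [visible text](#anchor).
  return entry.replace(
    /^\[(.*)\]\((#[^)]*)\)$/,
    (_all: string, text: string, target: string) => {
      const plain = text
        .replace(/\$[^$\n]*\$/g, "") // remove the math spans
        .replace(/\s+/g, " ") // collapse leftover whitespace
        .trim();
      return `[${plain}](${target})`;
    }
  );
}
```

Applied to the entry above, this would give `[ABC](#abc-mathbfa-mathbfx)`, with the visible text shortened and the target left alone.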

yzhang-gh commented 1 year ago

> any workaround

Unfortunately no.

I don't have sufficient time to change this. Might need some help from the community.

coin8086 commented 1 year ago

> > any workaround
>
> Unfortunately no.
>
> I don't have sufficient time to change this. Might need some help from the community.

Sorry to hear that. Thanks for your effort anyway!