RickStrahl / MarkdownMonster

An extensible Markdown Editor, Viewer and Weblog Publisher for Windows
https://markdownmonster.west-wind.com

Feature Request: Support for `[[_TOC_]]` #1084

Closed jbridgy closed 7 months ago

jbridgy commented 8 months ago

GitLab, Azure DevOps and Typora support [[_TOC_]] to generate a table of contents dynamically. It would be nice if MarkdownMonster joined this league.

RickStrahl commented 8 months ago

Can you find me a reference to this in the GitHub docs somewhere? I can't seem to dig anything up on this.

That said, that should be an easy add - we can use a render extension and inject the TOC using the logic that already exists to generate a TOC (from the Bookmarks sidebar). The main issue I have with this is that it's going to slow down document rendering slightly, as the TOC has to be regenerated on every render.
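To make the idea of a render-time TOC injection concrete, here is a minimal sketch in Python. It is not Markdown Monster's actual render extension: the function names, the ATX-only heading regex, and the anchor rule in `slugify` are all assumptions chosen for illustration.

```python
import re

# Assumption: only ATX headings ("# Title") are considered, not Setext headings.
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")

def slugify(text: str) -> str:
    """Approximate an anchor id: lowercase, drop punctuation, spaces to hyphens."""
    text = re.sub(r"[^\w\s-]", "", text.lower())
    return re.sub(r"\s+", "-", text.strip())

def build_toc(markdown: str) -> str:
    """Build a nested Markdown list with one link per heading."""
    entries = []
    for line in markdown.splitlines():
        m = HEADING.match(line)
        if m:
            level, title = len(m.group(1)), m.group(2)
            indent = "  " * (level - 1)
            entries.append(f"{indent}- [{title}](#{slugify(title)})")
    return "\n".join(entries)

def expand_toc(markdown: str) -> str:
    """Replace every [[_TOC_]] tag with the generated list (hypothetical hook)."""
    return markdown.replace("[[_TOC_]]", build_toc(markdown))
```

Because `expand_toc` rescans the whole document on each call, it would run on every re-render, which is the performance cost mentioned above.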

jbridgy commented 8 months ago

Here is the reference to the TOC feature of GitLab Flavored Markdown (GLFM). GitHub does not support [[_TOC_]].

RickStrahl commented 8 months ago

Ok, I've added this feature:

image

TOC gets rendered into the document on every re-render so it's dynamic and updates with the document changes.

The document outline now also has a new button that inserts this tag into the page in addition to the existing behavior of embedding a static TOC.

It's important to understand, though, that this only works if you're using MM to preview or render HTML, or if the destination service that renders the Markdown supports this tag. Otherwise the tag ends up being displayed in the document (e.g. on GitHub, which doesn't support it).

jbridgy commented 8 months ago

Looks good! Is TOC also rendered in the following cases?

  1. Document is exported to a HTML file using the image button.
  2. Document is exported to a PDF file using the image button. In this case MM already offers the option "Generate Table of Contents" (which actually means "Generate Bookmarks"). So rendering the TOC (hyperlinked) in addition to or instead of the bookmarks may be another option.
  3. Document is printed (I guess after conversion to HTML).

RickStrahl commented 8 months ago

It'll work with anything that generates full HTML output, which includes PDF and print output, which just 'prints' the HTML content.

jbridgy commented 8 months ago

Thank you for the prompt implementation! It works as I expected except for headings containing inline code with a single \. EXAMPLE:

### Directories in `SomeProjectRoot\`

MM v3.2.8 generates the following invalid TOC entry for the heading above:

[Directories in SomeProjectRoot\](#directories-in-someprojectroot)

\] makes the link invalid.

Expected TOC entry:

[Directories in SomeProjectRoot\\](#directories-in-someprojectroot)

The rendering is correct in MM's Document Outline, GitLab, Azure DevOps and Typora.
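One way to avoid the broken link is to backslash-escape the characters that matter inside a link label before the entry is written. The sketch below assumes the TOC generator starts from the plain heading text; `escape_link_text` and `toc_entry` are hypothetical helper names, not part of MM.

```python
def escape_link_text(text: str) -> str:
    """Backslash-escape characters that would break a Markdown link label."""
    out = []
    for ch in text:
        if ch in "\\[]":
            out.append("\\")  # prefix backslashes and brackets with a backslash
        out.append(ch)
    return "".join(out)

def toc_entry(title: str, anchor: str) -> str:
    """Format one TOC link from plain heading text and a precomputed anchor."""
    return f"[{escape_link_text(title)}](#{anchor})"
```

For the heading above, `toc_entry("Directories in SomeProjectRoot\\", "directories-in-someprojectroot")` yields the expected entry with a doubled backslash before the closing bracket.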

BTW: It would be nice if MM's Document Outline allowed headings to be collapsed and expanded.

jbridgy commented 7 months ago

I encountered three other issues with the [[_TOC_]] tag using MM v3.2.9.2:

  1. The tag is not expanded correctly if the heading structure is not normalized, that is, if there is a heading that is more than one level lower than its context (parent).
  2. The headings on levels 5 and 6 are ignored, although the CommonMark Spec, GFM and GLFM allow six levels.
  3. The tag is also expanded if it occurs within code which admittedly is a rare case.

EXAMPLES: The following heading structure is normalized and the TOCs (static and dynamic) are generated as expected apart from the missing H5 and H6 headings:

image

The following heading structure is not normalized because it starts with a heading of level 2 (instead of 1) and because a heading of level 4 occurs in a context of level 2 (instead of 3):

image

The missing heading of level 3 may be considered a mistake that the document author should fix anyway. However, the missing heading of level 1 is often intentional. I usually skip level 1 in short documents to avoid the disproportion between huge headings and little content.
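A TOC generator can tolerate non-normalized structures by deriving the indent from the chain of open ancestor headings instead of from the absolute heading level. The following is a sketch of that idea only; it is an assumption about one possible approach, not how MM, GitLab, or Typora actually compute the nesting.

```python
def toc_depths(levels: list[int]) -> list[int]:
    """Map heading levels (1-6) to TOC indent depths starting at 0."""
    depths = []
    stack = []  # heading levels of the currently open ancestors
    for level in levels:
        # Close every ancestor that is at the same level or deeper.
        while stack and stack[-1] >= level:
            stack.pop()
        depths.append(len(stack))
        stack.append(level)
    return depths
```

With this rule a document that starts at level 2 gets depth 0 for its top entries, a level 4 heading directly under level 2 is indented by one step only, and levels 5 and 6 need no special handling.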

There is also a rare issue when you would like to write something about the [[_TOC_]] tag as I do in this sentence.

image
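A simple guard against expanding the tag inside code is to expand it only when it stands alone on a line outside a fenced block. The sketch below assumes that standalone-line rule and uses simplified fence detection (fence length is not tracked); the function name is hypothetical.

```python
import re

# Opening or closing fence: up to 3 spaces, then at least three backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
# The tag alone on a line; 4+ leading spaces would make it an indented code block.
TOC_LINE = re.compile(r" {0,3}\[\[_TOC_\]\]\s*")

def expand_standalone_toc(markdown: str, toc: str) -> str:
    """Replace [[_TOC_]] only where it is a line of its own outside fenced code."""
    out = []
    fence = None  # fence character while inside a fenced block, else None
    for line in markdown.splitlines():
        m = FENCE.match(line)
        if m:
            marker = m.group(1)[0]
            if fence is None:
                fence = marker      # entering a fenced block
            elif marker == fence:
                fence = None        # leaving it
            out.append(line)
        elif fence is None and TOC_LINE.fullmatch(line):
            out.append(toc)
        else:
            out.append(line)
    return "\n".join(out)
```

A sentence that merely mentions the tag, with or without inline code formatting, is left untouched because the line contains more than the tag itself.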

jbridgy commented 7 months ago

I encountered another issue with the TOC tag using MM v3.2.9.3.

The tag is not expanded correctly if the heading contains pairs of \++, +\+ or \+\+. MM renders text between pairs of ++ with underline, like text between <u> and </u>. To prevent the headings from being rendered wrongly at least one + of each ++ needs to be escaped. However, the TOC uses the unescaped headings.

EXAMPLE:

# Features of C\++17, C+\+20 and C\+\+23

MM generates the following invalid TOC entry for the heading above:

[Features of C++17, C++20 and C++23](#features-of-c17-c20-and-c23)

The rendered TOC entry reads "Features of C17, C20 and C++23", where "17, C" is underlined.

Expected TOC entry:

[Features of C\++17, C\++20 and C\++23](#features-of-c17-c20-and-c23)

RickStrahl commented 7 months ago

Thanks...

Fixed.

image

The fix was a bit tricky - the reason it didn't work is that I was parsing the HTML document (reusing what's used for the Doc outline). That doesn't work though, because we need to be able to retrieve the raw Markdown including the \ escape character.

So the code now uses the Markdown parser to parse the Markdown document to get headers and then explicitly checks each subblock for escape characters, which are injected into the generated output.

This fixes the [[_TOC_]] as well as the manually injected Toc from the bookmarks sidebar.
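The general idea of working from the raw Markdown rather than the rendered HTML can be sketched as follows. This is a regex-based illustration, not the parser-based code described above; the helper name and the anchor rule are assumptions.

```python
import re
from typing import Optional

HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
# A backslash followed by an ASCII punctuation character is a Markdown escape.
ESCAPE = re.compile(r"\\([!-/:-@\[-`{-~])")

def toc_entry_from_source(line: str) -> Optional[str]:
    """Build a TOC link from a raw ATX heading line, or None for other lines."""
    m = HEADING.match(line)
    if m is None:
        return None
    raw = m.group(2)                  # label keeps the author's escapes
    plain = ESCAPE.sub(r"\1", raw)    # unescaped text, used only for the anchor
    anchor = re.sub(r"[^\w\s-]", "", plain.lower())
    anchor = re.sub(r"\s+", "-", anchor.strip())
    return f"[{raw}](#{anchor})"
```

For the C++ heading from the earlier comment this keeps `C\++17`, `C+\+20` and `C\+\+23` in the link label exactly as written, while the anchor is still `features-of-c17-c20-and-c23`.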

RickStrahl commented 7 months ago

One thing that I noticed and forgot about:

The manual TOC generates a TOC that shows only headings that occur after the injected TOC, while the new [[_TOC_]] command injects everything....
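The difference between the two behaviors comes down to where heading collection starts. A minimal sketch, assuming a line-based scan and a hypothetical `start_line` parameter:

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")

def headings(markdown: str, start_line: int = 0) -> list[tuple[int, str]]:
    """Collect (level, title) pairs from start_line (0-based) to the end."""
    found = []
    for line in markdown.splitlines()[start_line:]:
        m = HEADING.match(line)
        if m:
            found.append((len(m.group(1)), m.group(2)))
    return found
```

Calling `headings(doc)` corresponds to the dynamic tag, which includes everything, while `headings(doc, toc_line + 1)` corresponds to the manual TOC, which only lists what follows the insertion point.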

jbridgy commented 7 months ago

Does the new implementation also fix the following issue with inline code ?

image

The static TOC is correct, while the dynamic TOC loses the inline code formatting and even generates rubbish when the inline code ends with \. See also the old comment on the same issue and the updated comment on related issues.

RickStrahl commented 7 months ago

Yeah looks like it:

image

I think the inlines were fixed for the document outline (which now uses the same mechanism).

The new code uses the Markdown document to find headers and then based on the type of text fixes it up.

There may still be edge cases I would think where this doesn't work. Markdown is not a perfect 2-way markup language - things can easily get lost in the back and forth. But I think most of the more common scenarios are addressed.

RickStrahl commented 7 months ago

Ok I was too quick on that - no that doesn't work if the code inside of the inline is 'escaped' text like a path in your example.

The only way this works is if the inline code is kept intact. Slashes would be easy to fix, but there are a million things that could be inside of an inline that don't render right as Markdown, like HTML markup, HTML entities etc.

So now I render the inline as an inline into the TOC:

image

I don't really like the way that looks, but there's really no reliable way to make that work without the inlines. Slashes are only one thing - it could be many, many other escaped characters, but the parser doesn't know about that because the text is an inline code block that's treated as, well, an inline code block.

The only way to produce reliable output here is to bring any Markdown from the header text into the TOC, or there could be rendering issues (like your C++ or underscores, or HTML entities or slashes etc.).

For the Document outline this is simpler because there's no escaping happening there, so using the plain text in the actual outline works (and that's what it's been doing).
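The contrast between the two cases can be sketched like this: the TOC label copies the heading's Markdown verbatim, inline code included, while the outline text unescapes ordinary text and takes code span content literally. The regexes are simplified assumptions (for example, an escaped backtick before a code span is not handled), and the function names are hypothetical.

```python
import re

CODE_SPAN = re.compile(r"(`+)(.+?)\1")      # simplified code span matcher
ESCAPE = re.compile(r"\\([!-/:-@\[-`{-~])")  # backslash + ASCII punctuation

def split_inlines(raw: str) -> list[tuple[str, str]]:
    """Split raw heading Markdown into ('text', ...) and ('code', ...) parts."""
    parts = []
    pos = 0
    for m in CODE_SPAN.finditer(raw):
        if m.start() > pos:
            parts.append(("text", raw[pos:m.start()]))
        parts.append(("code", m.group(0)))
        pos = m.end()
    if pos < len(raw):
        parts.append(("text", raw[pos:]))
    return parts

def toc_label(raw: str) -> str:
    """TOC link label: the heading's Markdown, copied without changes."""
    return "".join(part for _, part in split_inlines(raw))

def outline_text(raw: str) -> str:
    """Plain text for an outline: unescape text, keep code content literal."""
    out = []
    for kind, part in split_inlines(raw):
        if kind == "code":
            out.append(CODE_SPAN.fullmatch(part).group(2))  # backslashes stay
        else:
            out.append(ESCAPE.sub(r"\1", part))
    return "".join(out)
```

Inside a code span a backslash is an ordinary character, which is why the unescaping step is applied to the text parts only.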