vsch / flexmark-java

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
BSD 2-Clause "Simplified" License

Rendering Markdown nested within HTML tags #439

Open gdude2002 opened 3 years ago

gdude2002 commented 3 years ago

Flexmark Version: 0.62.2

I'm not certain whether this is a bug report or a feature request. It's probably closer to a question.

My setup is a little convoluted - I'm using Flexmark along with the Pebble template engine in a custom static site generator, where Markdown files can also be Pebble templates. As there are many things I'd like writers to have access to, I'm leveraging Pebble to add those things instead of trying to extend the Markdown parser into something that barely resembles Markdown.

The issue I have at the moment is that I can't seem to figure out how to get Markdown nested within HTML tags to be parsed and rendered into HTML. My parser options look like this right now:

private var settings: DataSet = MutableDataSet()
    .set(HtmlRenderer.GENERATE_HEADER_ID, true)

    .set(Parser.HTML_BLOCK_DEEP_PARSER, true)
    .set(Parser.HTML_BLOCK_DEEP_PARSE_NON_BLOCK, true)
    .set(Parser.HTML_BLOCK_START_ONLY_ON_BLOCK_TAGS, false)
    .set(Parser.HTML_BLOCK_DEEP_PARSE_MARKDOWN_INTERRUPTS_CLOSED, true)

    .toImmutable()

I've tried a bunch of variations of the above, but none of them gets the nested Markdown rendered.

For simplicity, we'll skip the templating and put the HTML directly into the Markdown document - let's take the following as an example:

Hello, this is the index page!

<article class="message is-info">
    <div class="message-header">
        <p>Info admonition title!</p>
    </div>
    <div class="message-body content">
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
    </div>
</article>

This becomes the following:

<p>Hello, this is the index page!</p>
<article class="message is-info">
    <div class="message-header">
        <p>Info admonition title!</p>
    </div>
    <div class="message-body content">
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
    </div>
</article>

Note: I am aware that an admonitions extension exists for Flexmark - this is just a simplified example of what I'm trying to achieve.

gdude2002 commented 3 years ago

I have discovered that placing a line of text before the list works. Sort of.

<article class="message is-info">
    <div class="message-header">
        <p>Info admonition title!</p>
    </div>
    <div class="message-body content">
Some text.

* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
    </div>
</article>

This becomes the following:

<article class="message is-info">
    <div class="message-header">
        <p>Info admonition title!</p>
    </div>
    <div class="message-body content">
Some text.
<ul>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.
  </div>
</li>
</ul>
</article>

You may notice that the HTML in this configuration is mismatched. Specifically, the closing </div> is placed within the final <li> element. What's up with that?

You may think that the issue is the list block, but placing some text after it just moves the </div> into another element.

<article class="message is-info">
    <div class="message-header">
        <p>Info admonition title!</p>
    </div>
    <div class="message-body content">
Some text.

* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.

Some text.
    </div>
</article>

This becomes the following:

<article class="message is-info">
    <div class="message-header">
        <p>Info admonition title!</p>
    </div>
    <div class="message-body content">
Some text.
<ul>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
</ul>
<p>Some text.
</div></p>
</article>

gdude2002 commented 3 years ago

The weird tag-wrapping can be avoided by not indenting the closing HTML tag, it turns out. This feels a touch janky, but it works.

<article class="message is-info">
<div class="message-header">
    <p>Info admonition title!</p>
</div>
<div class="message-body content">
Some text.

* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
* This is a Markdown list placed within a message.
</div>
</article>

This becomes the following:

<article class="message is-info">
<div class="message-header">
    <p>Info admonition title!</p>
</div>
<div class="message-body content">
Some text.
<ul>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
<li>This is a Markdown list placed within a message.</li>
</ul>
</div>
</article>

However, you still need to have something before the list for it to be rendered. A completely empty line works too.
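
For anyone who wants to apply both workarounds automatically, here is a minimal sketch of a pre-processing step that could run on the source before it is handed to the parser. It is plain Kotlin with no Flexmark calls. The function name `normalizeHtmlWrappedMarkdown` and both regular expressions are hypothetical, made up for this illustration, and I have not verified the result against every Flexmark option combination. It only encodes the two observations above: a closing tag on its own line should not be indented, and something (here an empty line) must sit between an opening tag and the Markdown that follows it.

```kotlin
// Hypothetical helper, not part of Flexmark: normalizes Markdown that is
// wrapped in HTML tags according to the two workarounds found in this thread.

// A line that holds nothing but an indented closing tag, e.g. "    </div>".
val closingTagOnly = Regex("""^\s+(</[A-Za-z][A-Za-z0-9-]*\s*>)\s*$""")

// A line that holds nothing but one opening tag, e.g. "<div class=\"x\">".
// Assumption: attribute values contain no '/', '<' or '>' characters.
val openingTagOnly = Regex("""^\s*<[A-Za-z][^<>/]*>\s*$""")

fun normalizeHtmlWrappedMarkdown(source: String): String {
    val lines = source.lines()
    val out = ArrayList<String>(lines.size + 4)

    for ((index, line) in lines.withIndex()) {
        // Workaround 1: move an indented, stand-alone closing tag to column 0.
        val closing = closingTagOnly.matchEntire(line)
        out += closing?.groupValues?.get(1) ?: line

        // Workaround 2: when a stand-alone opening tag is directly followed by
        // a non-blank line that is not HTML, separate the two with an empty line.
        val next = lines.getOrNull(index + 1)
        if (openingTagOnly.matches(line) &&
            next != null &&
            next.isNotBlank() &&
            !next.trimStart().startsWith("<")
        ) {
            out += ""
        }
    }
    return out.joinToString("\n")
}
```

The returned string would then be passed to the parser in place of the raw file contents. Opening tags keep their indentation in this sketch, since only the indented closing tag caused the mismatched output above.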