xoofx / markdig

A fast, powerful, CommonMark compliant, extensible Markdown processor for .NET

Bug? Inline HTML is converted to wrong result #423

Open Dixin opened 4 years ago

Dixin commented 4 years ago

I work in a complicated environment where we need to render many documents that mix Markdown and HTML. Some HTML <pre> blocks are converted incorrectly.

Problem

Take the following document as an example:

## MD Heading

MD Paragraph

<p>HTML Paragraph</p>
<pre class="code"><span style="color: blue;">public partial class </span><span style="color: rgb(43, 145, 175);">AdventureWorks </span><span style="color: black;">: </span><span style="color: rgb(43, 145, 175);">DbContext
</span><span style="color: black;">{
    </span><span style="color: blue;">protected override void </span><span style="color: black;">OnModelCreating(</span><span style="color: rgb(43, 145, 175);">DbModelBuilder </span><span style="color: black;">modelBuilder)
    {
        </span><span style="color: blue;">base</span><span style="color: black;">.OnModelCreating(modelBuilder);

        </span><span style="color: green;">// Add functions on AdventureWorks to entity model.
        </span><span style="color: black;">modelBuilder.Conventions.Add(</span><span style="color: blue;">new </span><span style="color: rgb(43, 145, 175);">FunctionConvention</span><span style="color: black;">&lt;</span><span style="color: rgb(43, 145, 175);">AdventureWorks</span><span style="color: black;">&gt;());

        </span><span style="color: green;">// Add all complex types used by functions.
        </span><span style="color: black;">modelBuilder.ComplexType&lt;</span><span style="color: rgb(43, 145, 175);">ContactInformation</span><span style="color: black;">&gt;();
        modelBuilder.ComplexType&lt;</span><span style="color: rgb(43, 145, 175);">ManagerEmployee</span><span style="color: black;">&gt;();
        </span><span style="color: green;">// ...
    </span><span style="color: black;">}
}</span></pre>

It includes a <pre> block, which is rendered correctly as:

(See: https://jsfiddle.net/dixin/0dj2b81x/)

Then I tried to process it with Markdig:

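using System.IO;
using Markdig;

internal static class Program
{
    static void Main()
    {
        // Build a pipeline with the advanced extensions, then convert the mixed Markdown/HTML file.
        MarkdownPipelineBuilder builder = new MarkdownPipelineBuilder().UseAdvancedExtensions();
        MarkdownPipeline pipeline = builder.Build();
        string html = Markdown.ToHtml(File.ReadAllText(@"d:\md.txt"), pipeline);
        File.WriteAllText(@"d:\html.txt", html);
    }
}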

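The resulting HTML is broken. After the first blank line inside the <pre> block, the remaining content is HTML-escaped and wrapped in a new <pre><code> block: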

<h2 id="md-heading">MD Heading</h2>
<p>MD Paragraph</p>
<p>HTML Paragraph</p>
<pre class="code"><span style="color: blue;">public partial class </span><span style="color: rgb(43, 145, 175);">AdventureWorks </span><span style="color: black;">: </span><span style="color: rgb(43, 145, 175);">DbContext
</span><span style="color: black;">{
    </span><span style="color: blue;">protected override void </span><span style="color: black;">OnModelCreating(</span><span style="color: rgb(43, 145, 175);">DbModelBuilder </span><span style="color: black;">modelBuilder)
    {
        </span><span style="color: blue;">base</span><span style="color: black;">.OnModelCreating(modelBuilder);
<pre><code>    &lt;/span&gt;&lt;span style=&quot;color: green;&quot;&gt;// Add functions on AdventureWorks to entity model.
    &lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;modelBuilder.Conventions.Add(&lt;/span&gt;&lt;span style=&quot;color: blue;&quot;&gt;new &lt;/span&gt;&lt;span style=&quot;color: rgb(43, 145, 175);&quot;&gt;FunctionConvention&lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&quot;color: rgb(43, 145, 175);&quot;&gt;AdventureWorks&lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;&amp;gt;());

    &lt;/span&gt;&lt;span style=&quot;color: green;&quot;&gt;// Add all complex types used by functions.
    &lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;modelBuilder.ComplexType&amp;lt;&lt;/span&gt;&lt;span style=&quot;color: rgb(43, 145, 175);&quot;&gt;ContactInformation&lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;&amp;gt;();
    modelBuilder.ComplexType&amp;lt;&lt;/span&gt;&lt;span style=&quot;color: rgb(43, 145, 175);&quot;&gt;ManagerEmployee&lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;&amp;gt;();
    &lt;/span&gt;&lt;span style=&quot;color: green;&quot;&gt;// ...
&lt;/span&gt;&lt;span style=&quot;color: black;&quot;&gt;}
</code></pre>
<p>}</span></pre></p>

So the document becomes unreadable:

(See https://jsfiddle.net/dixin/6j7yLx85/)

Partial solution

I found #348, and used its code:

builder.BlockParsers.TryRemove<IndentedCodeBlockParser>();

Now the resulting HTML is better:

<h2 id="md-heading">MD Heading</h2>
<p>MD Paragraph</p>
<p>HTML Paragraph</p>
<pre class="code"><span style="color: blue;">public partial class </span><span style="color: rgb(43, 145, 175);">AdventureWorks </span><span style="color: black;">: </span><span style="color: rgb(43, 145, 175);">DbContext
</span><span style="color: black;">{
    </span><span style="color: blue;">protected override void </span><span style="color: black;">OnModelCreating(</span><span style="color: rgb(43, 145, 175);">DbModelBuilder </span><span style="color: black;">modelBuilder)
    {
        </span><span style="color: blue;">base</span><span style="color: black;">.OnModelCreating(modelBuilder);
<p></span><span style="color: green;">// Add functions on AdventureWorks to entity model.
</span><span style="color: black;">modelBuilder.Conventions.Add(</span><span style="color: blue;">new </span><span style="color: rgb(43, 145, 175);">FunctionConvention</span><span style="color: black;">&lt;</span><span style="color: rgb(43, 145, 175);">AdventureWorks</span><span style="color: black;">&gt;());</p>
<p></span><span style="color: green;">// Add all complex types used by functions.
</span><span style="color: black;">modelBuilder.ComplexType&lt;</span><span style="color: rgb(43, 145, 175);">ContactInformation</span><span style="color: black;">&gt;();
modelBuilder.ComplexType&lt;</span><span style="color: rgb(43, 145, 175);">ManagerEmployee</span><span style="color: black;">&gt;();
</span><span style="color: green;">// ...
</span><span style="color: black;">}
}</span></pre></p>

The document becomes a little more readable:

(See: https://jsfiddle.net/dixin/tzkphu2m/1/)

It still has problems: the indentation inside the <pre> block is lost, and the <p> and <pre> tags are improperly nested.

Questions

What should I do to render the above example document correctly?

The resulting HTML has wrong indentation and is even malformed (<pre>...<p>...</pre></p>). Is this a bug?

Thank you for your help.

MihaZupan commented 4 years ago

It's unfortunate every line is in a separate pre region.

The output without the IndentedCodeBlockParser does seem a bit weird, I'll look into that.

Dixin commented 4 years ago

Thank you @MihaZupan for looking into this.

"It's unfortunate every line is in a separate pre region."

What should I do if I want to keep everything unchanged between <pre> and </pre>?

MihaZupan commented 4 years ago

I haven't looked into how we would correct/change the behavior here, but it looks like it's caused by how we handle paragraph continuations.
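
One possible explanation, offered as an assumption rather than a confirmed diagnosis: under the CommonMark rules for HTML blocks, a line that starts with a block-level tag such as <p> opens an HTML block that ends at the next blank line, while a line that starts with <pre opens a block that ends only at a line containing </pre>. If the <pre> line directly follows the <p> line, it is absorbed into the <p> block, which then stops at the first blank line inside the code. The sketch below models only these two cases. HtmlBlockSketch and FindHtmlBlockEnd are hypothetical names for illustration, not part of Markdig.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not Markdig code: a simplified model of two CommonMark
// HTML block kinds, to show where each one ends.
static class HtmlBlockSketch
{
    // Returns the index of the last line of the HTML block that starts at `start`.
    // Assumption: lines[start] really does open an HTML block.
    public static int FindHtmlBlockEnd(IReadOnlyList<string> lines, int start)
    {
        if (OpensPre(lines[start]))
        {
            // <pre> kind: runs until a line containing </pre>, blank lines included.
            for (int i = start; i < lines.Count; i++)
            {
                if (lines[i].IndexOf("</pre>", StringComparison.OrdinalIgnoreCase) >= 0)
                    return i;
            }
            return lines.Count - 1;
        }

        // Block-level tag kind (assumed for every other line here):
        // runs until the line just before the first blank line.
        for (int i = start + 1; i < lines.Count; i++)
        {
            if (lines[i].Trim().Length == 0)
                return i - 1;
        }
        return lines.Count - 1;
    }

    // True when the line starts with "<pre" followed by whitespace, '>' or end of line.
    private static bool OpensPre(string line)
    {
        string t = line.TrimStart();
        if (!t.StartsWith("<pre", StringComparison.OrdinalIgnoreCase)) return false;
        return t.Length == 4 || t[4] == '>' || char.IsWhiteSpace(t[4]);
    }
}
```

For a five-line input consisting of the <p> line, the opening <pre> line, a blank line, an indented line, and the closing </pre> line, a block starting at index 0 ends at index 1, whereas a block starting at index 1 ends at index 4. In the first case the indented line after the blank line is left outside the HTML block, which would be consistent with it being parsed as an indented code block.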

As a workaround for now, you can add a blank line after the paragraph (note the empty line after HTML Paragraph):

## MD Heading

MD Paragraph

<p>HTML Paragraph</p>

<pre class="code"><span style="color: blue;">public partial class </span><span style="color: rgb(43, 145, 175);">AdventureWorks </span><span style="color: black;">: </span><span style="color: rgb(43, 145, 175);">DbContext
</span><span style="color: black;">{
    </span><span style="color: blue;">protected override void </span><span style="color: black;">OnModelCreating(</span><span style="color: rgb(43, 145, 175);">DbModelBuilder </span><span style="color: black;">modelBuilder)
    {

...

This will render properly.
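
When many documents are affected, the same workaround could be applied automatically before the text is handed to the parser. The following is a minimal sketch under stated assumptions: PreBlockWorkaround and IsolatePreBlocks are hypothetical helpers, not part of Markdig, and the code does not try to detect <pre> tags that appear inside fenced code blocks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Markdig.
static class PreBlockWorkaround
{
    // Inserts a blank line before each line that opens a <pre> tag
    // when the preceding line is not already blank.
    public static string IsolatePreBlocks(string markdown)
    {
        string[] lines = markdown.Replace("\r\n", "\n").Split('\n');
        var output = new List<string>(lines.Length + 4);
        bool insidePre = false;

        foreach (string line in lines)
        {
            bool opensPre = !insidePre && OpensPre(line);
            bool previousIsBlank = output.Count == 0
                || output[output.Count - 1].Trim().Length == 0;

            if (opensPre && !previousIsBlank)
                output.Add(string.Empty);

            output.Add(line);

            // Track the open block so its content is never modified.
            if (opensPre)
                insidePre = true;
            if (insidePre && line.IndexOf("</pre>", StringComparison.OrdinalIgnoreCase) >= 0)
                insidePre = false;
        }

        return string.Join("\n", output);
    }

    // True when the line starts with "<pre" (indented by at most three spaces)
    // followed by whitespace, '>' or end of line.
    private static bool OpensPre(string line)
    {
        string t = line.TrimStart(' ');
        if (line.Length - t.Length > 3) return false; // four or more spaces: indented code
        if (!t.StartsWith("<pre", StringComparison.OrdinalIgnoreCase)) return false;
        return t.Length == 4 || t[4] == '>' || char.IsWhiteSpace(t[4]);
    }
}
```

The result can then be passed to Markdown.ToHtml with the same pipeline as before. Line endings are normalized to "\n", and text that already has a blank line before <pre> is returned unchanged apart from that normalization.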

Dixin commented 4 years ago

@MihaZupan Thank you for the reply. I have identified the problem and have a code fix. I also found that the root cause lies deeper, in the CommonMark spec. Later I will file one issue for Markdig and one for CommonMark, with all the details and the code fix.

MihaZupan commented 4 years ago

Thanks!