xoofx / markdig

A fast, powerful, CommonMark compliant, extensible Markdown processor for .NET
BSD 2-Clause "Simplified" License

Using the Same Source Markdown... Markdown.ToHtml() != document.ToHtml() #811

Open nepgituser opened 3 months ago

nepgituser commented 3 months ago

Hello. Yesterday, I tested converting an MD file to HTML (html1 below), and this worked fine. Then I decided to try parsing the Markdown so I could make changes to it. Before spending a bunch of time on that (I'm new to Markdig), I wanted to make sure I could get the parsed document to output correctly (html2 below). This did not work. Below is the code I'm using. I would expect html1 and html2 to be identical, but they are not.

string markdown = File.ReadAllText($@"{mdPath}\Documentation.md");

var document = Markdown.Parse(markdown);
var pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();

string html1 = Markdown.ToHtml(markdown, pipeline);
string html2 = document.ToHtml(pipeline);

If the Documentation.md file in the example above only contains "# General" (no quotes), html1 is <h1 id="general">General</h1> and html2 is <h1>General</h1>.

The ToHtml source formats are different (markdown vs MarkdownDocument), but both ultimately came from the same source markdown. And they run through the same pipeline, which I assume is where all the logic is. What am I missing? Why does one H1 have an ID but the other does not?

xoofx commented 3 months ago

You need to pass the pipeline to both Parse and ToHtml; you can see this by looking at what is behind ToHtml:

var pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();
var document = Markdown.Parse(markdown, pipeline);

string html1 = Markdown.ToHtml(markdown, pipeline);
string html2 = document.ToHtml(pipeline);

https://github.com/xoofx/markdig/blob/dfa2c94b88fdb36dd5446054b281d13bda5ef87b/src/Markdig/Markdown.cs#L95-L104
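
For readers who want to reproduce the difference, here is a minimal sketch that renders the same one-line input three ways. It assumes a console project referencing the Markdig NuGet package; the variable names (plain, extended, and so on) are illustrative and not from the thread. The comparison suggests that at least some extensions, such as the automatic heading identifiers seen above, do their work at parse time, so a document parsed without the pipeline never receives them.

```csharp
using System;
using Markdig;
using Markdig.Syntax;

// One pipeline instance, shared by parsing and rendering.
var pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();
const string markdown = "# General";

// Parsed WITHOUT the pipeline: only the default parsers run.
MarkdownDocument plain = Markdown.Parse(markdown);

// Parsed WITH the pipeline: the extensions can take part in parsing.
MarkdownDocument extended = Markdown.Parse(markdown, pipeline);

string html1 = Markdown.ToHtml(markdown, pipeline); // string overload
string htmlPlain = plain.ToHtml(pipeline);          // the html2 case from the question
string htmlExtended = extended.ToHtml(pipeline);    // the corrected version

Console.WriteLine($"string overload:         {html1.Trim()}");
Console.WriteLine($"parsed without pipeline: {htmlPlain.Trim()}");
Console.WriteLine($"parsed with pipeline:    {htmlExtended.Trim()}");
Console.WriteLine($"html1 == htmlExtended:   {html1 == htmlExtended}");
```

Going by the outputs reported in this thread, the first and third lines should carry the id attribute and the second should not.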

nepgituser commented 3 months ago

I would have expected one of the pipeline "passes" to be redundant, but as you showed, that is how the code works. I suppose that would come into play if, as I'm trying to do, the document is changed after the initial parse.

At any rate, thank you for the quick and detailed response.
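
The parse, modify, render workflow mentioned above can be sketched as follows. This is only an illustration under assumptions: the edit itself (demoting every heading by one level) is hypothetical and not something the thread describes, and the code assumes a console project referencing the Markdig NuGet package.

```csharp
using System;
using System.Linq;
using Markdig;
using Markdig.Syntax;

var pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();

// Parse with the pipeline so that extensions can take part in parsing.
MarkdownDocument document = Markdown.Parse("# General", pipeline);

// Hypothetical edit: demote every heading by one level (h1 -> h2).
// ToList() takes a snapshot so the tree is not being walked while it is edited.
foreach (HeadingBlock heading in document.Descendants<HeadingBlock>().ToList())
{
    heading.Level++;
}

// Render the modified document with the same pipeline.
string html = document.ToHtml(pipeline);
Console.WriteLine(html);
```

Reusing one pipeline instance for both steps keeps parsing and rendering configured consistently, which is the point of the answer above.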