vsch / flexmark-java

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
BSD 2-Clause "Simplified" License

Special characters are escaped in nested HTML elements of <pre><code> block when converting HTML to MD #545

Status: Closed (MekhailS closed this issue 1 year ago)

MekhailS commented 1 year ago

I am converting HTML to Markdown using FlexmarkHtmlConverter. Consider the following HTML input:

<html>
<body>
<pre><code>
  fun foo { <b>-></b> it }
</code></pre>
</body>
</html>

I expect it to be converted to the following markdown:

  fun foo { -> it }

However, the character ">" is escaped, producing the output:

  fun foo { -\> it }

For conversion, I'm using the default FlexmarkHtmlConverter: FlexmarkHtmlConverter.builder().build().convert(html)
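
A possible stopgap, sketched here under assumptions rather than taken from flexmark-java itself: since the escaping is reported only for text inside elements nested in <pre><code>, the HTML can be pre-processed so that those inline tags are removed before conversion. The class name PreCodeFlattener and its flatten method are hypothetical, the list of inline tags is an arbitrary choice, and whether the converter then leaves ">" unescaped has not been verified here. The sketch uses only the Java standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; NOT part of flexmark-java. */
final class PreCodeFlattener {
    // Matches one <pre><code ...> ... </code></pre> region; group 2 is the code body.
    private static final Pattern PRE_CODE = Pattern.compile(
            "(<pre(?:\\s[^>]*)?>\\s*<code(?:\\s[^>]*)?>)(.*?)(</code>\\s*</pre>)",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Matches opening and closing tags of a few inline elements only (assumed list).
    private static final Pattern INLINE_TAG = Pattern.compile(
            "</?(?:b|i|em|strong|span)(?:\\s[^>]*)?>",
            Pattern.CASE_INSENSITIVE);

    /** Returns the HTML with inline tags removed inside <pre><code> blocks; text is kept. */
    static String flatten(String html) {
        Matcher m = PRE_CODE.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Drop only the tags; the characters between them stay in place.
            String body = INLINE_TAG.matcher(m.group(2)).replaceAll("");
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + body + m.group(3)));
        }
        m.appendTail(out); // copy everything after the last match unchanged
        return out.toString();
    }
}
```

With this helper, the call from the report would become FlexmarkHtmlConverter.builder().build().convert(PreCodeFlattener.flatten(html)). The bold markup is lost, which matches the expected output above; markup outside <pre><code> blocks is left untouched.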