vsch / flexmark-java

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
BSD 2-Clause "Simplified" License

Level-1 heading not parsed if preceded by UTF-8 BOM #601

Closed: shardulc closed this issue 7 months ago

shardulc commented 7 months ago

Description

If a Markdown file is saved as UTF-8 with a BOM (byte order mark) at the beginning, and the first element in the file is a level-1 heading, the rendered HTML contains the heading's literal text in a <p> instead of an <h1>.

Steps to reproduce

Sample Markdown input (save as UTF-8 with BOM):

# Hello world

Expected behavior

Output HTML contains:

<h1>Hello world</h1>

Observed behavior

Output HTML contains:

<p># Hello world</p>

shardulc commented 7 months ago

I now realize that this is a client-side issue, not an issue with flexmark: flexmark is justified in assuming that the input string has been decoded correctly, with any BOM already removed. Apologies for the noise.
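
For anyone who hits the same symptom, here is a minimal client-side sketch, assuming the file is read as raw bytes and decoded with Java's standard UTF-8 charset. That decoder keeps the BOM as a leading U+FEFF character, so the `#` is no longer the first character of the line. The names `BomStripper`, `stripBom`, and `decodeUtf8` are hypothetical helpers for illustration and are not part of flexmark.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of flexmark): removes a leading BOM
// before the text is handed to a Markdown parser.
final class BomStripper {
    private static final char BOM = '\uFEFF';

    // Returns the text without a leading U+FEFF, if one is present.
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    // Decodes UTF-8 bytes and drops the BOM that Java's decoder leaves in place.
    static String decodeUtf8(byte[] bytes) {
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Build "# Hello world" preceded by the UTF-8 BOM bytes EF BB BF.
        byte[] body = "# Hello world".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[body.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(body, 0, withBom, 3, body.length);

        String raw = new String(withBom, StandardCharsets.UTF_8);
        System.out.println(raw.startsWith("#"));                 // false
        System.out.println(decodeUtf8(withBom).startsWith("#")); // true
    }
}
```

The string returned by `decodeUtf8` can then be passed to the parser in place of the raw decoded text; whether this fits depends on how the client reads its input.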