tommoor / slate-md-serializer

A Markdown serializer for the Slate editor framework
MIT License

Wrong parsing of paragraph #30

Open scottge opened 5 years ago

scottge commented 5 years ago

Let's say that the input string is

paragraph 1
<blank line>
paragraph 2

(Between paragraph 1 and 2, there is a blank line.)

Your deserializer would output 3 'paragraph' blocks.

However, according to the markdown spec, shouldn't it output only 2 blocks of 'paragraph'?
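To make the expected behavior concrete, here is a minimal sketch of paragraph splitting in the CommonMark sense, where blank lines only separate paragraphs and never become blocks themselves. `splitParagraphs` is a hypothetical helper written for illustration; it is not part of slate-md-serializer and it ignores every other block type (lists, code blocks, headings).

```typescript
// Hypothetical helper for illustration only; not the library's API.
// Splits markdown text into paragraphs, treating one or more blank lines
// as a separator rather than as content.
function splitParagraphs(markdown: string): string[] {
  return markdown
    // A newline followed by one or more blank (whitespace-only) lines.
    .split(/\n(?:[ \t]*\n)+/)
    .map((chunk) => chunk.trim())
    // Drop empty chunks produced by leading or trailing blank lines.
    .filter((chunk) => chunk.length > 0);
}

// splitParagraphs("paragraph 1\n\nparagraph 2")
// → ["paragraph 1", "paragraph 2"]
```

With the input above this yields 2 paragraphs, not 3.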

scottge commented 5 years ago

Does this problem exist in the parent project? netlify/slate-markdown-serializer

Maybe I should replace this deserializer in the rich-markdown-editor

tommoor commented 5 years ago

> Does this problem exist in the parent project?

You should try it out; it will definitely behave more like the original Markdown spec. However, there has been over a year of bugfixes since this was forked from that project too…

Honestly, if you're looking for a generic markdown editor, you might be best off not using this component, as it's very much designed only to consume markdown that it produces. In the wild there are a lot of wacky nested markdown combinations that could cause it to crash.

olipo186 commented 4 years ago

I believe this problem originates from the fact that "hard line breaks" in the editor (rich-markdown-editor) are serialized as "blank lines" (\n) when converting to markdown. When the lib then runs the inverse function, parsing the markdown back into a document model, the "blank lines" (\n) need to be interpreted as empty paragraphs to produce a document model equal to the original.

I recently read up on the CommonMark spec v0.29 and found a quotation that is related:

A line break (not in a code span or HTML tag) that is preceded by two or more spaces and does not occur at the end of a block is parsed as a hard line break (rendered in HTML as a <br /> tag)

Maybe slate-md-serializer could be changed to adopt this behavior and render "hard line breaks" as "  \n" (preceded by two spaces) instead of just "\n" - and then apply the same inverse behavior to the parser?
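A rough sketch of what that round trip could look like, under a deliberately simplified document model. The `Paragraph` type and the `serialize` / `deserialize` functions are assumptions made up for this example and are not slate-md-serializer's actual API; a real change would have to work on Slate nodes and handle code spans and other block types.

```typescript
// Assumed toy model: a paragraph is a list of lines, and consecutive lines
// within a paragraph are separated by hard line breaks.
type Paragraph = string[];

// Two trailing spaces before the newline, per the CommonMark rule quoted above.
const HARD_BREAK = "  \n";

// Hard breaks become "  \n"; paragraphs are separated by one blank line.
function serialize(doc: Paragraph[]): string {
  return doc.map((lines) => lines.join(HARD_BREAK)).join("\n\n");
}

// Inverse: blank lines separate paragraphs (and never create empty ones),
// while two or more spaces followed by a newline split lines in a paragraph.
function deserialize(markdown: string): Paragraph[] {
  return markdown
    .split(/\n(?:[ \t]*\n)+/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => chunk.split(/ {2,}\n/));
}
```

In this sketch, `deserialize(serialize(doc))` gives back the same structure for documents whose lines contain no newlines or trailing spaces, so a blank line no longer has to double as an empty paragraph.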

Edit: Found this issue https://github.com/tommoor/slate-md-serializer/issues/19 that is related.