ProseMirror / prosemirror-markdown

ProseMirror Markdown integration
https://prosemirror.net
MIT License
344 stars 81 forks source link

Unnecessary escape for the period (.) character #101

Closed dragonman225 closed 1 year ago

dragonman225 commented 1 year ago

Pasting the following Markdown to the example, switch to WYSIWYG and switch back to Markdown,

123 [example.com](https://example.com)

123 [0.com](https://example.com)

123 [example.1](https://example.com)

123 [2.2](https://example.com)

4.4

* 5.5
* 6\. 6
7. 7\. 7

8\. 8

it becomes

123 [example.com](https://example.com)

123 [0\.com](https://example.com)

123 [example.1](https://example.com)

123 [2\.2](https://example.com)

4\.4

* 5\.5
* 6\. 6

7. 7\. 7

8\. 8

As far as I know, the periods in 0.com, 2.2, 4.4, and 5.5 don't need to be escaped, and escaping them makes the generated Markdown less readable.

dragonman225 commented 1 year ago

I looked into the issue and found it's related to this commit and the fix for #75

I also attached a patch and it consists of two parts.

The first part is to set this.atBlockStart = true only when the serializer is really writing to block start in renderInline(). (In current code, every text node in a block is rendered with this.atBlockStart = true.) This fixes periods in link text (0.com, 2.2).

The second part is to replace text that would actually be parsed as a list item in esc(). For example, 4.4 would not be parsed as a list item, while 4. 4 would. So I add a space character (\s) in the regex.

marijnh commented 1 year ago

Please take a look at 8f0a84fdcb16c4a544f5bbcbe41e67c203a571b6 and let me know if that also looks like a solution to the atBlockStart problem.

dragonman225 commented 1 year ago

Thank you for the suggestion. The code of 8f0a84f makes sense to me. I tried and it also worked well.

Though I'd like to suggest adding spaces after the periods in this test:

it("does not escape list markers in the middle of paragraphs", () => {
  same("123 [0.com](foo)\n\n123 [2.2](foo)",
       doc(p("123 ", a("0.com")), p("123 ", a("2.2"))))
})

e.g.

it("does not escape list markers in the middle of paragraphs", () => {
  same("123 [0. com](foo)\n\n123 [2. 2](foo)",
       doc(p("123 ", a("0. com")), p("123 ", a("2. 2"))))
})

Because after we changed the regex in esc(), without spaces the periods would not be escaped, regardless of whether atBlockStart is correct.