pragdave / earmark

Markdown parser for Elixir

em-dash cannot be escaped #464

Closed tusooa closed 11 months ago

tusooa commented 1 year ago

downstream: https://git.pleroma.social/pleroma/pleroma/-/issues/2810

When smartypants is on, \-\- is converted to –. This behaviour seems problematic, because the escaped sequence is not two consecutive dashes.

iex(3)> Earmark.as_html("hey @yyy@xn\\-\\-i2raa\\.com")
{:ok, "<p>\nhey @yyy@xn–i2raa.com</p>\n", []}

This causes problems for our workflow, because it represents a domain name and thus should not be converted.

RobertDober commented 1 year ago

Thank you for this report. Unfortunately, the smartypants option is deprecated. Which version of Earmark do you use?

tusooa commented 11 months ago

earmark 1.4.22 and earmark_parser 1.4.32

RobertDober commented 11 months ago

I see what you mean but why should we not create a dash?

In links we do not:

Earmark.as_html("hey @<yyy@xn\\-\\-i2raa\\.com>")
{:ok,
 "<p>\nhey @<a href=\"mailto:yyy@xn%5C-%5C-i2raa%5C.com\">yyy@xn\\-\\-i2raa\\.com</a></p>\n",
 []}

But outside links, switching smartypants off is what you need:

iex(1)> Earmark.as_html("hello--world")
{:ok, "<p>\nhello–world</p>\n", []}
iex(2)> Earmark.as_html("hello--world", smartypants: false)
{:ok, "<p>\nhello--world</p>\n", []}

tusooa commented 11 months ago

@yyy@xn\\-\\-i2raa\\.com is a fediverse user handle.

Our workflow in Pleroma is (1) escape special characters in the user handle, (2) pass it to earmark, and (3) convert the handle to profile links.
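
The three steps above can be sketched roughly as follows. This is a minimal illustration, not Pleroma's actual code: the module name `HandleFlowSketch`, the set of escaped characters, and the function names are assumptions made for the example. Step 2 (the call to Earmark) is left out so the snippet runs without dependencies; the HTML string is the output reported at the top of this issue.

```elixir
defmodule HandleFlowSketch do
  # Step 1: backslash-escape characters that Markdown may treat specially.
  # The character set is an assumption for illustration only.
  def escape(handle) do
    Regex.replace(~r/[\\`*_{}\[\]()#+.!-]/, handle, fn char -> "\\" <> char end)
  end

  # Step 3: wrap each literal occurrence of the handle in a profile link.
  # If the renderer changed the handle text, nothing matches and the HTML
  # comes back untouched.
  def linkify(html, handle, profile_url) do
    String.replace(html, handle, ~s(<a href="#{profile_url}">#{handle}</a>))
  end
end

handle = "@yyy@xn--i2raa.com"

# Prints the escaped form that is handed to Earmark in step 2.
IO.puts(HandleFlowSketch.escape(handle))

# Output reported in this issue with smartypants on: "--" became a dash.
reported_html = "<p>\nhey @yyy@xn–i2raa.com</p>\n"

# Checks whether the handle still appears verbatim in the rendered HTML,
# which is what step 3 relies on.
IO.inspect(String.contains?(reported_html, handle))
```

Because step 3 searches for the original handle text, any rewrite of `--` by the renderer means the handle is never turned into a profile link.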

And even if it is handled as a link it is still buggy: the text of the link should also not have the em-dash -- it should have the double ascii dashes instead.

iex(1)> Earmark.as_html("hey @<yyy@xn--i2raa.com>")      
{:ok,
 "<p>\nhey @<a href=\"mailto:yyy@xn--i2raa.com\">yyy@xn–i2raa.com</a></p>\n",
 []}

RobertDober commented 11 months ago

And even if it is handled as a link it is still buggy: the text of the link should also not have the em-dash -- it should have the double ascii dashes instead.

Matter of taste, I would say, but if your desired behavior would solve your problem, I would definitely consider accepting a PR.

That said, does using smartypants: false not fix your issue?

tusooa commented 11 months ago

Switching off smartypants fixes the issue, but since Pleroma is software that advocates customizability, it would be better if this option could remain customizable without breaking anything.
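
One way to keep the setting customizable, sketched under assumptions: take the flag from the application's own settings and pass it through to Earmark, as in the `smartypants: false` call shown earlier in this thread. The module name `MarkdownOptsSketch` and the `user_config` keyword list are hypothetical; the default of `true` mirrors the behaviour seen above when no option is given.

```elixir
defmodule MarkdownOptsSketch do
  # Build the option list for Earmark.as_html/2 from user-supplied settings.
  # Defaults to smartypants: true, matching the output shown in this thread
  # when the option is omitted.
  def earmark_opts(user_config) do
    [smartypants: Keyword.get(user_config, :smartypants, true)]
  end
end

# With Earmark available as a dependency, the result would be passed along:
#   Earmark.as_html(text, MarkdownOptsSketch.earmark_opts(user_config))
IO.inspect(MarkdownOptsSketch.earmark_opts(smartypants: false))
```

This only threads the option through. With the flag left on, the handle problem described above remains.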