quantizor / markdown-to-jsx

🏭 The most lightweight, customizable React markdown component.
https://markdown-to-jsx.quantizor.dev/
MIT License

Backslash escape fails to escape special characters. #271

Open twall opened 5 years ago

twall commented 5 years ago

With the input text _{over\_flow}_, I would expect to see

{over_flow} (in italics)

but markdown-to-jsx renders it as

{over\_ flow}

I'm doing my own interpolation between the braces, to allow variable substitution and filtering (it'd be nice if the compiler did it, but for now I'm post-processing properties and text).

quantizor commented 5 years ago

Hmm yeah that should be fixed, thanks

lowellstewart commented 4 years ago

@probablyup ... I'm working on a fix for this. Given the way parsing works, a natural fix seems to be tweaking the regexes that parse "paired" markdown tokens, so that specific escaped characters no longer interfere with proper detection of the markdown.

I tweaked TEXT_BOLD_R, TEXT_EMPHASIZED_R, and TEXT_STRIKETHROUGHED_R in this way -- very minor changes:

const TEXT_BOLD_R = /^([*_])\1((?:\[.*?\][([].*?[)\]]|<.*?>(?:.*?<.*?>)?|`.*?`|~+.*?~+|.)*?)\1\1(?!\1)/;
// to
const TEXT_BOLD_R = /^([*_])\1((?:\\\1|\[.*?\][([].*?[)\]]|<.*?>(?:.*?<.*?>)?|`.*?`|~+.*?~+|.)*?)\1\1(?!\1)/;

I'm reviewing the other regexes to see if similar changes can avoid escaping problems. Does this seem like a reasonable approach? If so I'll prepare a merge request.
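
To make the effect of the tweak concrete, here is a minimal sketch that runs both versions of TEXT_BOLD_R (copied verbatim from the comment above) against one input. The names TEXT_BOLD_R_BEFORE, TEXT_BOLD_R_AFTER, boldInner, and the sample input are hypothetical and exist only for this illustration; this checks the regexes in isolation and does not show how markdown-to-jsx renders the result.

```typescript
// Both regexes are copied verbatim from the comment above; only the names differ.
const TEXT_BOLD_R_BEFORE = /^([*_])\1((?:\[.*?\][([].*?[)\]]|<.*?>(?:.*?<.*?>)?|`.*?`|~+.*?~+|.)*?)\1\1(?!\1)/;
const TEXT_BOLD_R_AFTER = /^([*_])\1((?:\\\1|\[.*?\][([].*?[)\]]|<.*?>(?:.*?<.*?>)?|`.*?`|~+.*?~+|.)*?)\1\1(?!\1)/;

// Hypothetical helper (not part of markdown-to-jsx): returns the text
// captured between the bold delimiters, or null if the regex does not match.
function boldInner(re: RegExp, source: string): string | null {
  const match = re.exec(source);
  return match ? match[2] ?? null : null;
}

// Hypothetical input, i.e. the characters __a\__b__ :
// an escaped underscore directly followed by a literal underscore.
const input = "__a\\__b__";

// Before: the "__" formed by the escaped underscore and its neighbour is
// taken as the closing delimiter, so the bold span ends early.
console.log(boldInner(TEXT_BOLD_R_BEFORE, input)); // a\

// After: the added \\\1 alternative consumes "\_" as one unit, so the
// span runs to the real closing "__".
console.log(boldInner(TEXT_BOLD_R_AFTER, input)); // a\__b
```

The captured text still contains the backslash, so the escape itself would presumably need to be resolved by a later step; the regex change only keeps the escaped delimiter from closing the span.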

davidnduffy commented 11 months ago

I'm not sure if this is related but I have a problem with a backslash simply disappearing from the rendered text in a link. e.g. [Some\text](https://some.link/?param=some\text) comes out as <a href='https://some.link?param=sometext'>Some\text</a> with the backslash missing in the URL param portion.

mrelarsen commented 8 months ago

> I'm not sure if this is related but I have a problem with a backslash simply disappearing from the rendered text in a link. e.g. [Some\text](https://some.link/?param=some\text) comes out as <a href='https://some.link?param=sometext'>Some\text</a> with the backslash missing in the URL param portion.

I have not looked at the codebase, but a raw backslash is not a valid character in a URL, so I would guess that a sanitizer removes it to produce a valid URL.
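
If a raw backslash is indeed the problem, one possible workaround is to percent-encode the parameter before building the link, so the URL never contains a backslash. The sketch below uses only the standard encodeURIComponent function; the variable names are hypothetical, and whether markdown-to-jsx passes the encoded form through unchanged has not been verified here.

```typescript
// Hypothetical parameter value containing a backslash (the characters: some\text).
const rawParam = "some\\text";

// encodeURIComponent percent-encodes the backslash as %5C.
const encodedParam = encodeURIComponent(rawParam);
console.log(encodedParam); // some%5Ctext

// Build the markdown link with the encoded parameter; the link text keeps
// the backslash as in the comment above (written "\\" in a template literal).
const markdownLink = `[Some\\text](https://some.link/?param=${encodedParam})`;
console.log(markdownLink); // [Some\text](https://some.link/?param=some%5Ctext)

// The receiving side can recover the original value.
console.log(decodeURIComponent(encodedParam) === rawParam); // true
```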