Open ExpHP opened 4 years ago
Did you find a solution?
That really depends on how you define "solution."
And that's only made somewhat simple by relying on a few key facts about the language I'm highlighting. The workaround I used to use was a bit more general, but also completely unmaintainable.
(also, both of these were only run on trusted input)
Oh, apparently I mis-remembered this issue and mistakenly thought it was an issue in highlightjs rather than showdown.
I think those snippets I posted are still related, but IIRC they are working around an even greater issue (which arises from the interaction between this bug and highlightjs), so my apologies if they seemed confusing. (Then again, the point was mainly just to show that I don't have a good solution)
After reading this comment https://github.com/showdownjs/showdown/issues/400#issuecomment-307668667 I see that it escapes angle brackets by design. I added a plugin to unescape them like so:
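const showdown = require('showdown');
// Output extensions run on the generated HTML. These two turn the escaped
// entities back into literal angle brackets (trusted input only).
const unescapeAngleBrackets = [
{
type: 'output',
regex: new RegExp(`&lt;`, 'g'),
replace: `<`
},
{
type: 'output',
regex: new RegExp(`&gt;`, 'g'),
replace: `>`
}
];
const converter = new showdown.Converter({
extensions: [
...unescapeAngleBrackets,
]
});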
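To see what those two output rules do without pulling in showdown, here is a minimal stand-alone sketch that applies the same two replacements to an HTML string by hand. The function name `unescapeAngleBracketsInHtml` and the sample string are made up for illustration. Note that the replacement is global: it cannot tell an HTML `<code>` element from a code span that came from Markdown backticks, so it unescapes both.

```javascript
// Stand-alone sketch of the two replacements the output extension performs.
// Hypothetical helper; not part of showdown.
function unescapeAngleBracketsInHtml(html) {
  return html.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
}

// Assumed converter output in which the contents of <code> were escaped.
const escaped = '<p><code>&lt;a&gt;&lt;/a&gt;</code></p>';
const unescaped = unescapeAngleBracketsInHtml(escaped);

console.log(unescaped); // <p><code><a></a></code></p>
```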
Well, there's a trick here. Markdown codespans SHOULD be escaped, but HTML codespans should NOT. This means
`<a></a>`
<code><a></a></code>
should become
<code>&lt;a&gt;&lt;/a&gt;</code>
<code><a></a></code>
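The rule above can be sketched in a few lines. This is not showdown's code, only a toy converter under simplifying assumptions (single-backtick spans on one line, no other Markdown syntax); `escapeHtml` and `convertCodeSpans` are hypothetical names. Escaping happens as part of converting a backtick span, and raw HTML passes through untouched.

```javascript
// Escape the characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Convert only backtick code spans, escaping their contents as part of the
// conversion. Inline HTML, including a literal <code> element, is left alone.
function convertCodeSpans(markdown) {
  return markdown.replace(
    /`([^`\n]+)`/g,
    (_, body) => `<code>${escapeHtml(body)}</code>`
  );
}

const input = '`<a></a>`\n<code><a></a></code>';
const output = convertCodeSpans(input);

console.log(output);
// <code>&lt;a&gt;&lt;/a&gt;</code>
// <code><a></a></code>
```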
I thought maybe this is because they implement the escaping after the conversion of ` ` to <code>, when they should rather do it as part of that conversion. Looking at the code, however, it does seem that it is deliberate (just a misunderstanding of the spec), and this line, which appears to perform the escaping on <code>, should probably be eliminated:
Actually, looking at this reveals even more bugs. hashCodeTags is also "hashing" the contents of <code>, which AFAICT stops showdown from recursing into it. This results in an even wider class of bugs:
Input:
<code>`a`</code>
Correct output: (babelmark2). <code> should be treated as any other span element, and therefore Markdown links and codespans inside of it should be converted.
<p><code><code>a</code></code>
</p>
Showdown output:
<p><code>`a`</code>
</p>
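To illustrate why "hashing" the contents of <code> produces the second output, here is a rough sketch of the placeholder technique. It is an assumption about the general approach, not showdown's actual hashCodeTags implementation; the function names and the `@@CODE…@@` placeholder format are invented for the example, and the input is assumed not to contain that placeholder.

```javascript
// Turn single-backtick spans into <code> elements (simplified, no escaping).
function convertCodeSpans(text) {
  return text.replace(/`([^`\n]+)`/g, '<code>$1</code>');
}

// Variant A: hide every <code>...</code> element behind a placeholder before
// converting, then restore it. Nothing inside the element is ever converted.
function withHashedCodeTags(text) {
  const stash = [];
  const hashed = text.replace(/<code>[\s\S]*?<\/code>/g, (match) => {
    stash.push(match);
    return `@@CODE${stash.length - 1}@@`;
  });
  const converted = convertCodeSpans(hashed);
  return converted.replace(/@@CODE(\d+)@@/g, (_, i) => stash[Number(i)]);
}

// Variant B: treat <code> like any other span element, so its contents are
// still scanned for Markdown.
function withoutHashing(text) {
  return convertCodeSpans(text);
}

const input = '<code>`a`</code>';
const hashedResult = withHashedCodeTags(input);
const plainResult = withoutHashing(input);

console.log(hashedResult); // <code>`a`</code>
console.log(plainResult);  // <code><code>a</code></code>
```

Variant A reproduces the showdown behaviour described above, while Variant B matches the output labelled correct.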
Input Markdown:
Expected HTML output: (from 28 out of 31 converters tested on babelmark)
Actual output: (from showdown and one other parser)
Quoting Daringfireball: (emphasis added)
I included the last paragraph to emphasize that it says "Markdown code spans." My interpretation of this, backed by the babelmark link posted above, is that this phrase refers specifically to markdown backtick syntax, i.e. `<span>a</span>`, and not to <code>, which is an inline HTML element.