Closed drwpow closed 3 years ago
I think I understand. Let me repeat your concern, phrased in a different way to see if I get it.
So, given this markdown source code:
```
a => b
```
You expect the following HTML output:
<pre><code>a => b</code></pre>
If this is correct, then that is not going to work, since `>` is a special character in HTML (as I'm sure you know). For example, consider this markdown instead:
```
<script>alert(document.cookie)</script>
```
You would get the HTML output:
<pre><code><script>alert(document.cookie)</script></code></pre>
This would be bad: a browser would execute the script instead of displaying it.
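To make the point concrete, here is a minimal sketch of the kind of escaping a renderer applies to code block content. `escapeHtml` is a hypothetical helper written for illustration; it is not part of markdown-wasm's API, and the real implementation may differ.

```js
// Hypothetical helper for illustration only; not markdown-wasm's actual code.
// Replaces the five characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// The script tag is rendered as text instead of being executed.
const source = "<script>alert(document.cookie)</script>";
const html = `<pre><code>${escapeHtml(source)}</code></pre>`;
console.log(html);
```

With this in place, `a => b` becomes `a =&gt; b` in the HTML source, which a browser still displays as `a => b`.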
It seems to me that your end goal here is to process the code through a syntax highlighter. There may be better ways to go about that.
Option 1: you could use a syntax highlighter that works with HTML-escaped code, like highlight.js.
Option 2: you could run the syntax highlighter on the markdown text before you pass it on to markdown-wasm. However, if you do this, you won't be able to set the `NO_HTML_BLOCKS` or `NO_HTML_INLINE` flags, which can be used to strengthen the safety of markdown-wasm, i.e. to avoid XSS issues.
Option 3: we could consider adding a feature to markdown-wasm where you set a flag, for example `CDATA_CODE_BLOCKS`, that, when set, outputs code blocks with verbatim code wrapped in `<![CDATA[...]]>`.
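If a highlighter only accepts raw (unescaped) source, one possible workaround is to decode the entities in the code block text before handing it over. The sketch below assumes the code was escaped with only the five common entities; `unescapeHtml` is a hypothetical name, not an API of markdown-wasm or highlight.js.

```js
// Hypothetical helper for illustration; assumes only these five entities occur.
// A single pass over the input ensures "&amp;lt;" decodes to "&lt;", not "<".
function unescapeHtml(html) {
  const chars = {
    "&lt;": "<",
    "&gt;": ">",
    "&quot;": '"',
    "&#39;": "'",
    "&amp;": "&",
  };
  return html.replace(/&(?:lt|gt|quot|#39|amp);/g, (entity) => chars[entity]);
}

// The raw text can now be given to a highlighter that expects source code.
const raw = unescapeHtml("const f = () =&gt; {}");
console.log(raw);
```

The decoded text must never be inserted into the page directly; the highlighter (or a final escaping step) has to produce safe HTML again, otherwise the XSS problem described above comes back.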
I've enabled highlight.js on the markdown-wasm website so you can try it out: https://rsms.me/markdown-wasm/#code-poetry
Try something like this and look at the HTML using your browser's web inspector:
```js
const f = () => {}
```
That’s a fair point about injection. You’re right that things between `<code></code>` do need to be escaped sometimes; I was more or less wondering if `=>` specifically needed to be escaped. I was comparing the output to `remark-html`, which leaves it as-is, but you have a good point in that `=>` should essentially be treated the same.
I agree that the responsibility probably lies with the highlighting library and not this parser. Maybe it’s just a fluke/bug that `remark-html` doesn’t escape `=>` in certain scenarios.
Thanks for responding!
First of all: this is an amazing library! I love the anchor links auto-generated in headings. Fantastic 🎉
That said, there are some unexpected results when parsing code blocks. Here’s an example from Redux’s README:
This results in (caused mostly by the syntax highlighting library):
Expected behavior would be to not HTML escape anything within those code blocks.
I don’t know of a README with an actual less-than comparison (e.g. `x < y` -> `x &lt; y`), but I’d suspect similar behavior with that, too.
If desired, I could make an attempt at a PR (but be warned: I’m a C n00b).