Closed elrido closed 2 years ago
More information about CodeQL: https://codeql.github.com/ (You can set it up on GitHub and it is really nice.) More information on that warning: https://codeql.github.com/codeql-query-help/javascript/js-incomplete-sanitization/
This block of code is not for sanitization or security purposes. This is a step in converting HTML to markdown, to prevent creation of markdown by mistake.
For instance, take this input:
<p>nickname: _randomNickname_</p>
The md equivalent would be:
nickname: \_randomNickname\_
backslashes are added since a word wrapped in underscores means emphasis in markdown. This block of code does this
Thank you for your time and assessment. Wouldn't markdown need to escape backslashes (legal in HTML), tough? i.e.:
<p>file path: C:\random\windows\path</p> -> file path: C:\\random\\windows\\path
No. Backslashes are literal in markdown, except when preceding the following characters:
\ backslash
` backtick
* asterisk
_ underscore
{} curly braces
[] square brackets
() parentheses
# hash mark
+ plus sign
- minus sign (hyphen)
. dot
! exclamation mark
So, in your example, writing C:\\random\\windows\\path
and C:\random\windows\path
is functionally the same thing, regarding output.
We could escape them, but weird stuff might arise from it. (although I will look into it).
Regarding your question, it seems to me that this is indeed a false positive.
Yeah, the docs are a bit vague on that point, backslash is listed as something that can be escaped, but it is probably a bit of undefined behavior how different rendering engines might handle encountering a backslash not followed by one of the characters listed - most would probably just ignore it.
I could only imagine this to be a risk in edge cases, if several HTML snippets get converted to markdown and finally assembled and it so happens to change the output when rendered as a whole.
Given all this and that the project I work on only converts whole markdown documents to HTML and not the other way around or backwards and forwards I think it is not a concern for our use case.
Thank you again for the feedback!
When updating to showdown 2.0.0 in another project, the CodeQL static code analyser used there reported an issue with the following line: https://github.com/showdownjs/showdown/blob/32a1aaa39b39cff6d9a85b070d64db6829b79b26/src/subParsers/makeMarkdown/txt.js#L15-L19 The report states that:
It goes on to recommend to "Use a (well-tested) sanitization library if at all possible.", which showdown is, IMHO.
I'm not sure if this really is an issue or a false positive by an overzealous tool and wanted to get a second, human opinion.