magiclen / html-escape

This library is for encoding/escaping special characters in HTML and decoding/unescaping HTML entities as well.
MIT License
40 stars 5 forks source link

Script escaping is incorrect #1

Open lambda-fairy opened 2 years ago

lambda-fairy commented 2 years ago

Thanks for maintaining this library @magiclen!

I found an issue with the script escaping code. It appears to escape </script> only, and leave <!-- and <script> unchanged.

This is incorrect: if the HTML parser encounters the string <!-- <script>, then it enters a double-escaped state, where it will take two </script> tags to close the element.

https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements

lambda-fairy commented 2 years ago

BTW, I recommend submitting a RustSec advisory for this, so that users are aware of the issue :smile:

magiclen commented 1 year ago

Hi, I would like to know in which case that you need to use encode_script* functions with HTML comments?

lambda-fairy commented 1 year ago

One (discouraged but common) use case is to embed JSON in a page using a <script> element.

If that JSON includes user data, then a malicious user could include <!-- <script> in their input, and then corrupt the resulting page. (That might lead to XSS as well, though I can't think of a proof of concept right now.)

magiclen commented 1 year ago

After some experiments, I found <!-- and --> need to appear in pairs even in a single <script> element. Or the text after <!-- will seem to be considered as a comment.

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    Hello
    <script>
        alert("<!--");
        alert("123");
        alert("<script>");
        alert("<\/script>");
        alert("456");
        alert("-->");
    </script>
    world!
</body>
</html>

The HTML above is correct and shows the alert dialog six times. However, if we remove -->, no alert dialog will be shown.

The following HTML code is also incorrect

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    Hello
    <script>
        alert("<!--");
        alert("123");
        alert("<script>");
        alert("<\/script>");
        alert("456");
    </script>
    world!
    <script>
        alert("-->");
    </script>
</body>
</html>
magiclen commented 1 year ago

<!-- can be replaced with <\!--.

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    Hello
    <script>
        alert("<\!--");
        alert("123");
        alert("<script>");
        alert("<\/script>");
        alert("456");
    </script>
    world!
    <script>
        alert("-->");
    </script>
</body>
</html>

The HTML above is correct and shows the alert dialog six times.

magiclen commented 1 year ago

Internet Explorer has the same behavior.

lambda-fairy commented 1 year ago

One (discouraged but common) use case is to embed JSON in a page using a Githubissues.

  • Githubissues is a development platform for aggregating issues.