Open KamilaBorowska opened 5 years ago
Can we have one more formatter which also escapes forward slashes (and possibly also <
, >
, &
)?
I would argue that escaping <
, >
and &
is a responsibility of an HTML escape library, and should be a rare use-case, I'm thinking about an use of JSON in <script>
, where those characters are non-issue and shouldn't be escaped.
Why would you want to escape forward slashes? They can only occur in JSON strings where they are perfectly valid JSON (and JavaScript, right?):
{ "foo/bar": "spam //////////////////"}
But there is a different danger when doing this kind of embedding: there are unicode characters that are valid in JSON strings but not in JS strings: https://twitter.com/simon_warta/status/1255505650257362944
@webmaster128: https://stackoverflow.com/a/1580682/6086311
To use serde_json in such scenarios, I had created alternative formatter https://github.com/xd009642/tarpaulin/blob/master/src/report/safe_json.rs
It would be great if such formatter is available out of the box.
@webmaster128 U+2028 and U+2029 are now allowed in JavaScript strings, however fair point in that it will only work on relatively modern web browsers (Chrome 66+, Edge 79+, Firefox 62+, Opera 53+, Safari 12+). This may or may not be okay.
Escaping forward slashes is necessary if a JSON string is embedded in <script>
tag and happens to contain </script>
. Not escaping a forward slash will cause an XSS vulnerability.
And researching more about that, <script>
and <!--
can also cause issues, so maybe the documentation should specifically discourage storing JSON in <script>
. The HTML specification has a rather complicated recommendation on avoiding the issue: https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements.
I'm not saying serde_json
should escape those characters, but I think it should document that it doesn't do that so that users would know about the issue.
There is nothing bad in embedding JSON into <script>
. JSON is a subset of JS (since ES2019).
Thanks for the answers. I continued the research a little bit to understand why this is a problem. This is described in this StackOverflow answer and the linked documents. It is not at all about JavaScript. It is about how HTML parsers detect the end of a <script>
. The following code is also valid JavaScript but is documented to break the <script>
block:
<SCRIPT type="text/javascript">
document.write ("<EM>This won't work</EM>")
</SCRIPT>
So no matter how valid your JS is, if the browser does not read it completely, you might get doomed. I don't know how HTML parsers have evolved over time and how relevant the issue still is.
@webmaster128 this StackOverflow answer is more up-to-date: https://stackoverflow.com/a/14608300/617159
@webmaster128 HTML parser doesn't care about JavaScript (And it should not).
it sees tokens differently.
<SCRIPT type="text/javascript">
document.write ("
<EM>
This won't work
</EM>
")
</SCRIPT>
I was about to crate an issue but found this one. Placing JSON in <script>
tag is valid case, e.g. when you want to rehydrate client app with server state, you have to place data into DOM so client could pick it up. Since it's perf sensitive part, it makes sense to serialize JSON + escape HTML chars and JS line terminators in single pass.
An user should manually escape the slash (for instance using https://github.com/serde-rs/json/issues/435#issuecomment-382108021) when putting JSON inside of
<script>
(yes, people do that), otherwise an XSS will occur. This ideally would be documented, as it is a potential security vulnerability.