Open zcorpan opened 1 year ago
Alternatively, drop the scripting flag completely and always parse as if scripting is enabled, but this might regress the UX for users who disable script.
That would be a significant UX regression, given it would mean noscript
becomes meaningless, no?
The even more radical suggestion would be to parse as if scripting is disabled (so noscript
still gets parsed correctly!) but with script execution. I doubt we can manage to do this (given it'll cause a performance regression for every noscript
in the common case, and it has difficult-to-judge web compatibility implications), but it would be pretty great if the only thing that scripting being enabled changed was script execution itself. (And of course you can still have differential parser attacks in that case, thanks to document.write
, but from a sanitizer point-of-view that's "just" a matter of removing all scripts.)
As a mostly-absentee maintainer of a library containing a sanitizer (html5lib), the noscript
parsing has often scared me (because it is practically inevitable it will lead to attacks like this), and it feels like it would be a decent security win to the overall ecosystem.
Right it would be mostly useless. Maybe the display
could still be flipped depending on scripting disabled (by the user?), but showing markup as text seems broken.
Always parsing noscript
as if scripting is disabled seems like it would break web content e.g. using a meta refresh in noscript, or show duplicate content (scripted and non-scripted widgets), etc.
It would be nice if <noscript>
had different display
based on a flag (instead of different parsing). Not sure this can be made compatible though..
@mozfreddyb it does that today as well as different parsing. https://html.spec.whatwg.org/multipage/rendering.html#hidden-elements
It's not clear to me if @media (scripting)
is 1:1 with HTML's "scripting is enabled"
I could see making HTML fragment parsing better account for the state of the context node's node document. If we want to completely revisit the way this works this needs some more historical context. E.g., I'm pretty sure it was fairly recently that we did the work around sandboxing and that making that work the same as scripting being disabled was an intentional design choice.
Note that changing the way <noscript>
parses has other side effects beyond rendering. E.g., it can impact whether a subresource gets fetched.
https://html.spec.whatwg.org/multipage/parsing.html#other-parsing-state-flags says
This flag affects parsing of
noscript
, and "scripting was enabled" is disabled for an<iframe sandbox>
document (withoutallow-scripts
) as well as these:It's also disabled for
<template>
's template contents owner document, at least as implemented in WebKit and Chromium (but Gecko parses with scripting flag enabled), see http://software.hixie.ch/utilities/js/live-dom-viewer/saved/11796This has caused mutation XSS issues in the past, see https://www.acunetix.com/blog/web-security-zone/mutation-xss-in-google-search/
I think it's not great that
noscript
is parsed differently for different documents that can be accessed from script.noscript
was intended to show content to users when they had scripting disabled in the browser. Introducing parsing differences in different places invites XSS bugs.Would it be web compatible to make the HTML parser's scripting flag only depend on whether the user has disabled scripting?
Alternatively, drop the scripting flag completely and always parse as if scripting is enabled, but this might regress the UX for users who disable script.
@whatwg/html-parser @whatwg/security