Consider allowing `innerHTML = ''`

koto commented 3 years ago

Pointed out by @mozfreddyb in https://twitter.com/freddyb/status/1412734759906783236.

el.innerHTML = '' is a very common sink in web applications migrated to Trusted Types (https://docs.google.com/document/d/1m91JZWKAGOR3jQoicMVE9Ydcq79gM2BetcRIBemrex8/view#heading=h.9yr1d07740qg). Though it is possible to rewrite such code, either to use trustedTypes.emptyHTML or replaceChildren, all those ways require code changes, sometimes not in a 1st party code, but in one of the dependencies.

Since empty string assignment to innerHTML can never cause XSS (that's why trustedTypes.emptyHTML exists), it might be in "the spirit" of Trusted Types to allow that - and thus facilitating the TT adoption.

Spec-wise, it requires just a small addition to https://w3c.github.io/webappsec-trusted-types/dist/spec/#get-trusted-type-compliant-string-algorithm, before the default policy is called - at that point the value is already stringified, so no toString tricks should be able to trigger the bypass.

mozfreddyb commented 3 years ago

When tackling DOM-based XSS in Firefox (albeit with various different approaches, see paper), the top-most occurrence of innerHTML was an assignment of the empty string.

I strongly believe that you'll solve a lot of adoption pain by allowing this code.

mikewest commented 3 years ago

On Twitter, I said:

I'm not sure I like this, though I agree that it's zero additional risk. I'm a bit worried about its precedent. Empty strings are ok, sure. But what about "1"? Surely that can't cause XSS! And what about "This is a plain string without any HTML at all!"? Where's the brightline?

I do think allowing empty string assignment is probably fine, and can be reasonably justified by the developer annoyance we'd avoid. That said, I do worry that this sets up a slope, and that that slope isn't terribly frictionful. Can we carve this out without leading inexorably to some complicated heuristics to determine if a given string is likely-enough to cause XSS to block?

koto commented 3 years ago

Can we carve this out [...] ?

I think we can. Literally, the change could be:

If input has type expectedType, return stringified input and abort these steps.

If expectedType is TrustedHTML and input is an empty string, return an empty string and abort these steps.

Let convertedInput be the result of executing Process value with a default policy with the same arguments as this algorithm.

which doesn't give any impression or promise that this would be extended to allow other payloads.

arturjanc commented 3 years ago

I vaguely remember us having a discussion about this where the takeaway (or at least my takeaway) was that this case is fairly easy to solve application-wide by using the default policy. That is, the default policy could exempt empty string assignments and/or any other assignments of known-safe static values to innerHTML or other DOM sinks. Couldn't that be an alternative here?

mikewest commented 3 years ago

Can we carve this out [...] ?

I think we can. Literally, the change could be:

I asked the question poorly. :) Of course we can literally carve it out both in prose and practice. I was more concerned with the justification for doing this but not doing all the other things. As a not-terribly-crazy example, consider your suggestion that we allow el.innerHTML = 'noanglebrackets';. What's the principle that says "Do this, but don't do that."?

shhnjk commented 3 years ago

Can we carve this out [...] ?

I think we can. Literally, the change could be:

I asked the question poorly. :) Of course we can literally carve it out both in prose and practice. I was more concerned with the justification for doing this but not doing all the other things. As a not-terribly-crazy example, consider your suggestion that we allow el.innerHTML = 'noanglebrackets';. What's the principle that says "Do this, but don't do that."?

IMO, empty strings are fine as it is. Other static HTML assignments requires new primitive in HTML, such as discussion in https://github.com/WICG/sanitizer-api/issues/102.

koto commented 3 years ago

@mikewest:

As a not-terribly-crazy example, consider your suggestion that we allow el.innerHTML = 'noanglebrackets';. What's the principle that says "Do this, but don't do that."?

Fundamentally, nothing blocks us from doing that as we know this can't cause DOM XSS. I think TT are not the greatest fit against the general class of script gadgets, and with that scoping we only focus on preventing DOM XSS.

We know all the sinks can cause script execution; those are already covered by TT. For some of those sinks we know which payloads will definitely NOT cause script execution - e.g. innerHTML without angle brackets. Note that modifying script content is already guarded on top of that, so script.innerHTML = 'evil() would not be a problem.

The litmus test for this is "can one do an equivalent operation already without a Trusted Types policy?". DOM's textContent or replaceChildren(textNode), ~~or innerHTML = trustedTypes.emptyHTML~~ can do the equivalent operation, so it is safe to allow el.innerHTML = 'noanglebrackets', just like it's safe for the Sanitizer to produce DOM trees.

However, despite that I think it makes sense to limit the API and not end up on a slippery slope. For one, it would be impossible to rescope the API further on to tackle script gadgets. Secondly, discouraging innerHTML and other DOM XSS sinks is a good thing, and I would be very glad if TT contributed to that. Thirdly, we don't have a huge consensus on how broad should the TT restrictions be, and without that, we should not make sweeping changes - like allowing noanglebrackets.

With all that, I think it's fine to special case empty HTML string, given only the prevalence of the pattern, and the absolute surety that it can't result in DOM XSS, unless the application is specifically written to contain a script gadget that demonstrates that.

@arturjanc

Couldn't [the default policy] be an alternative here?

It could (or, in an even more hacky way, just overriding the innerHTML setter). I'm trying to find a way without a default policy. It has certain problems (runtime cost or its singleton nature) that I'm trying to find a way to solve for this common case without pushing the applications down the default policy route.

securityMB commented 3 years ago

Correct me if I'm wrong but I think we cannot distinguish whether this is a static assignment of an empty string vs. a dynamic assignment of a variable that just happens to be an empty string?

So assuming that's the case, then both assignments would work:

elem.innerHTML = '';

// but evil just happens to be an empty string
elem.innerHTML = evil;

If that's the case then I think that the mental model around trusted types is a little bit more complicated than it should be. I guess it also might be surprising for developers than the code might or might not work depending on the value of evil.

koto commented 3 years ago

Yes, that's correct. That's the tradeoff; we either force the developers to rewrite all of the sink assignments, even if they cannot possibly cause XSS, or introduce this surprise element. I'm leaning towards the latter, as indeed this violation is very common, and there's plenty of evidence that elegance loses to convenience when it comes to adopting security features.

Note that this is already the case in multiple places in the platform. element.src = value mail fail, depending on both element and value. Various CSP settings (script-src, child-src, img-src) can cause failures here. To fully avoid surprise, the default policy would have to go (and it can introduce even more strange behavior by changing the value and introducing side-effects).

koto commented 3 years ago

FWIW, we're seeing .innerHTML = '' across hundreds of separate products. This is really a prevalent pattern, and will likely result in friction for most applications adopting TT. We can and are coping with that friction, but it's much more difficult for smaller web shops to influence their dependencies. I am of the opinion that - especially for security features - the friction is only absolutely needed where there's a sufficient security benefit. I don't think there is one here.

There are cases where this change would end up causing surprise behavior (e.g. @securityMB's assignment of a dynamic value), but there are ways of making this less painful. For example, static checks (or even your language DOM typings) can still be more strict than the browser API. Similarly, there are linters that disallow dynamic innerHTML assignment, but are OK with constant assignments, despite HTML allowing both. Finally, there's report-only mode, and CSP reporting in general to make sure you're not breaking production.

Given this, does anyone strongly oppose to letting innerHTML = emptyString (and only this) as outlined in TT?

koto commented 3 years ago

This could also be configurable by an opt-in flag, if needed. Since require-trusted-types-for controls the sinks, require -trusted-types-for 'script' 'allow-empty-html'? I don't think that's absolutely necessary, as it's too verbose for what it offers, but this option is on the table as well.

mikewest commented 3 years ago

Given this, does anyone strongly oppose to letting innerHTML = emptyString (and only this) as outlined in TT?

For clarity: I don't strongly oppose it. I do think it's a more difficult boundary to explain, and I'd prefer for the API to be less rather than more magical in its behavior. I don't see as much advantage to adding it as you're claiming, but I grant that you've been doing migration work and I haven't, so I can accept your assertion that the benefits outweigh.

That said, it does seem like what we really want here is the distinction between literal strings and variables containing strings that we can't have without changes to the underlying language. If we had that, I think it would be pretty reasonable to strongly oppose this change. Since we don't...

shhnjk commented 3 years ago

I think I agree with the change.

I'd usually create the following utility function when I do migration to Trusted Types, which has almost zero dev/security benefit.

function emptyHTML() {
  if (window.trustedTypes) {
    return window.trustedTypes.emptyHTML;
  }
  return '';
}

Ultimately, I think Trusted Types shouldn't throw violation for something that's known to be safe under Trusted Types' threat model.

arturjanc commented 3 years ago

I think it would be a bit silly to object to this proposal; it's a minor ergonomic change which will make TT adoption easier, it seems simple spec- and implementation-wise, and it doesn't reduce the protections offered by TT.

I do, however, share some of @mikewest's skepticism about the impact here. Specifically, my guess is that given the simplicity of refactoring this to be TT-compatible (s/innerHTML=''/textContent=''/) and the existence of the default policy, developers have tools at their disposal to fix such code without a lot of work. I can see how it would make life easier for applications/libraries that have TT violations only from empty string assignments, but in cases where a codebase has other TT violations and its author wants to make it compatible, empty string assignments seem like low-hanging fruit that would be the easiest to fix.

The cost we'd be paying for this is complexity, as @mikewest and @securityMB noted above: there's some new magic behavior and innerHTML = foo becomes harder to reason about. We'll also have to consider the backcompat of this change, i.e. the fact that innerHTML = '' will throw for users on browsers without this change, so developers will have to be careful about relying on browser support for this. But maybe the last problem is something that could be solved by the polyfill?

But overall I certainly won't be unhappy if we go this route :)

koto commented 3 years ago

It's a valid point. Let's find out if there are code bases that only have these types of violations. If they are common, adding this feature in TT would significantly simplify adopter's work. If however, these violations are intertwined with the others, some amount of rewriting or default policy needs to happen anyway.

lukewarlow commented 9 months ago

I'm not personally convinced (on first look) that this change would have a meaningful impact on migration? How common is it for a site to be setting an empty innerHtml but not using a single other DOM sink?

Like Mike west said this makes reasoning about code harder because you can't tell on first look if it will error or not.

The default policy would also seem to make this a trivial issue to fix?

lukewarlow commented 9 months ago

It's also not entirely obvious to me that allowing empty strings would be expected to be considered safe by default either. As this could lead to page breakages with 0 reporting or default policy handling being invoked?

annevk commented 9 months ago

Yeah. If Chromium shipped without making this change I don't think we should be making it now. Use the textContent setter instead.

w3c / trusted-types

Consider allowing `innerHTML = ''` #343