tc39 / proposal-array-is-template-object

TC39 proposal to identify tagged template string array objects
https://tc39.es/proposal-array-is-template-object/
MIT License

Practical example needed #12

Status: Closed (gibson042 closed 3 years ago)

gibson042 commented 4 years ago

The README case demonstrates that code can use Array.isTemplateObject to differentiate an array extracted from the static strings of a tagged template from other values, but does not demonstrate any use case in which a potential attacker has the ability to provide arguments to a sensitiveOperation function but does not have the ability to invoke it as a tagged template (or more generally, with an array that was produced from the static strings of an arbitrary tagged template).

What practical scenarios would be addressed by this method?

bathos commented 4 years ago

It allows differentiating between source text and dynamic text. With a CSP that blocks eval, template tag spans can only be actual source text, which may be trusted, while dynamically generated code is not trusted.

It’s true that an attacker could pass in an array from another tagged template, but even so, that array would have to have been source text in this scenario also.

ljharb commented 4 years ago

Source text as in, a dynamically constructed and imported data URI?

bathos commented 4 years ago

If the CSP permits data: sources for script-src or the importing script otherwise has authority that extends to that (strict-dynamic, not sure if there are other such cases now), I would think yes, and otherwise no ... but I’m not really sure what the relationship between CSP/script-src and dynamic import is now.

gibson042 commented 4 years ago

I know what it allows, but I'm still looking for a demonstration of where that matters.

bathos commented 4 years ago

It’s closely related to the goals of the proposed Trusted Types API but also the general concept of trusted/untrusted source, which is a strategy for reducing XSS surface. That pattern has been implemented in various libraries including Angular.js (the $sce service) and Angular (the SafeValue class and its subclasses).

These strategies involve explicit whitelisting of string values to signal that they should be considered valid examples of specific media types or URLs. So an API might accept an object trusted as HTML which wraps some string value, but not strings on their own.
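
A minimal sketch of that wrapper pattern, for illustration only. The names trustAsHtml and render are hypothetical and are not the API of AngularJS, Angular, or Trusted Types:

```javascript
// Hypothetical registry of wrappers that were explicitly marked as trusted.
const trustedHtmlValues = new WeakMap();

// Explicitly vouch for a string as HTML and return an opaque wrapper for it.
function trustAsHtml(value) {
  const wrapper = Object.freeze({ kind: 'html' });
  trustedHtmlValues.set(wrapper, String(value));
  return wrapper;
}

// Hypothetical sink: accepts only wrappers created by trustAsHtml, never bare strings.
function render(input) {
  if (!trustedHtmlValues.has(input)) {
    throw new TypeError('render expects a value produced by trustAsHtml');
  }
  return trustedHtmlValues.get(input);
}
```

Note that trustAsHtml will wrap any string handed to it, including one that was derived from user input.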

A limitation of these approaches is that they don’t allow establishing constraints related to the provenance of the source to be trusted — the API may only be available to privileged code, but it can’t be sure that the input it receives wasn’t dynamically generated (e.g. taken from user input or input to publicly exposed API). A significant amount of XSS is enabled by mistakenly trusting program input as arbitrary media, especially as HTML or ES source.

Because ‘authentic’ template string arrays can only be created through syntax, the ability to detect whether they are or aren’t ‘authentic’ (isTemplateObject) could provide this missing means for establishing that the input was a static facet of source text (when paired with a CSP that blocks eval). There is no way to achieve perfection here since a workaround is always one ‘level’ away (e.g. compromised dynamic resources, JSONP, etc) but shrinking the door to chaos is still an effective strategy for reducing the odds of successful XSS.

I suspect the most likely scenario where an attacker would be able to provide strings but not be able to invoke the template tag is an indirect one: a bug in the trusted code that passes values which should not be trusted. More broadly though it’s more practical to audit static source than dynamic sources. Requiring all media of some type to be statically sourced is a deliberate reduction in power for the sake of narrowing the scope of things that can go wrong.
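
A rough sketch of that indirect scenario, under loud assumptions: Array.isTemplateObject is only proposed, so the helper below feature-detects it and otherwise falls back to a structural check that is forgeable and not equivalent to the proposal. The functions sensitiveOperation, trustedCaller, and buggyTrustedCaller are hypothetical.

```javascript
// Assumption: uses the proposed Array.isTemplateObject when the engine has it.
// The fallback is a forgeable structural heuristic, present only so this runs.
function isTemplateObject(value) {
  if (typeof Array.isTemplateObject === 'function') {
    return Array.isTemplateObject(value);
  }
  return Array.isArray(value) && Array.isArray(value.raw) &&
    value.raw.length === value.length &&
    Object.isFrozen(value) && Object.isFrozen(value.raw);
}

// Hypothetical sensitive sink: accepts only a static template with no substitutions.
function sensitiveOperation(strings) {
  if (!isTemplateObject(strings) || strings.length !== 1) {
    throw new TypeError('sensitiveOperation requires a static tagged template');
  }
  return strings[0];
}

// Trusted code using the sink as intended: the text is part of the source.
function trustedCaller() {
  return sensitiveOperation`<p>static</p>`;
}

// The same trusted code with a bug: it forwards a value it did not author.
function buggyTrustedCaller(requestParam) {
  return sensitiveOperation([requestParam]);
}
```

Here trustedCaller() returns the static text, while buggyTrustedCaller(userInput) throws instead of passing the input through.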

This is my understanding at least; I’m not an expert or anything. Does that clarify stuff at all or am I missing what you’re asking?

jridgewell commented 4 years ago

See also https://github.com/mikewest/tc39-proposal-literals#motivation, which lists goog.SafeHtml, goog.SafeUrl and goog.string.Const (closure compiler values).

gibson042 commented 4 years ago

I've read that page, and trusted types as a concept are fairly well explained. But here, there's a lot of grand language and a dearth of demonstration. I'm not saying this proposal can't deliver on its promises, I'm just asking for something real that backs them up. Is that difficult to provide?

bathos commented 4 years ago

It might be tough to provide in the form of a code snippet. I think claims that static source constraints provide auditability and a reduction in the odds of evaluating untrusted source are sound, but I’m embarrassed to admit I don’t know how to properly illustrate either, especially in a README friendly form, since it’s (as I understand it, anyway) not about specific attack vectors but rather about establishing known/knowable limits. Hopefully one of the Google folks will be able to furnish a better answer, given they have more background here. (Apologies for misunderstanding what you were looking for — didn’t intend to fall back on grand language if you meant in my responses, but if I did perhaps I should take that as a signal that I’m speaking more out of my depth than I was aware.)

gibson042 commented 4 years ago

That may not have been the right term anyway, "handwaving" is probably more accurate. By way of contrast, there's a straightforward demonstration of how trusted types prevent careless use of URL fragment data (arbitrary user input) as HTML from becoming a vulnerability, and how safe use of that input remains possible. So what's the equivalent here—e.g., what kind of vulnerability exists without isTemplateObject that can be protected against with it?

bathos commented 4 years ago

> the use of gestures and insubstantial language meant to impress or convince

ah, ouch. well, fwiw, the intention was to be helpful — seems I missed the mark pretty badly!

> what kind of vulnerability exists without isTemplateObject that can be protected against with it?

Although I don’t think I can provide the kind of example you’re looking for, here’s at least a very direct (if contrived) one. Assume html is a template tag provided by a library which wants to automate handling trusted types, like in lit-html, and the following is in our own application code:

html`source`;          // successful, evals source as HTML via an HTMLTemplateElement
html([ prompt('?') ]); // throws

The combo of Trusted Types, isTemplateObject, and no-eval (either via CSP or running in a SES realm or something) lets us transform ‘no eval of js’ into ‘no eval of js or html’. In the snippet above, lit would throw on line 2 rather than trust the input, because it knows that only template arrays are guaranteed not to contain dynamic source (assuming those env parameters are satisfied). In this example, the array does contain dynamic source, and it’s arbitrary user input specifically.
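
One way such a tag could be sketched, purely as an illustration and not lit-html's actual implementation. It returns a string instead of touching the DOM so it runs anywhere. The escaping of substitutions is an added assumption, and the fallback check used when Array.isTemplateObject is missing is forgeable:

```javascript
// Assumption: prefer the proposed Array.isTemplateObject; otherwise fall back
// to a forgeable structural check so the sketch can run in current engines.
const isTemplateObject = typeof Array.isTemplateObject === 'function'
  ? (value) => Array.isTemplateObject(value)
  : (value) => Array.isArray(value) && Array.isArray(value.raw) &&
      Object.isFrozen(value) && Object.isFrozen(value.raw);

const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
const escapeHtml = (text) => String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);

// Hypothetical stand-in for a library-provided `html` tag.
function html(strings, ...values) {
  if (!isTemplateObject(strings)) {
    throw new TypeError('html must be called as a tagged template');
  }
  // Static spans are emitted as markup; substitutions are escaped as text.
  return strings.reduce((out, span, i) => out + escapeHtml(values[i - 1]) + span);
}
```

Called as a tagged template, the static spans pass through unchanged; called with a hand-built array such as html([userInput]), it throws before any markup is produced.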

(Figured I’d have a last try in the hopes that maybe this is at least an improvement.)

mikesamuel commented 4 years ago

Will add examples to the explainer.

@gibson042 I'll try to avoid using grand language, but I'd like to ask a few questions to make sure examples address your core concerns.

> The README case demonstrates that code can use Array.isTemplateObject to differentiate an array extracted from the static strings of a tagged template from other values, but does not demonstrate any use case in which a potential attacker has the ability to provide arguments to a sensitiveOperation function but does not have the ability to invoke it as a tagged template (or more generally, with an array that was produced from the static strings of an arbitrary tagged template).

Re the bolded text, I think there is value in tagged templates that sit in front of sensitive operations to provide safe abstractions.

Safe abstractions are often built using unsafe APIs. Google's internal toolchains try to limit the code that can use known-unsafe APIs.

Would you be happy with examples that assume a way to limit access to sensitiveOperation or do you want to see those moving parts as well?

Re "potential attacker", I haven't explicitly stated the threat model in my TC39 presos. Do you have questions around who is an attacker? @waldemarhorwat talked about "confused deputy", and he's right that a lot of this work in limiting access to error-prone APIs involves guiding developers away from error-prone patterns and towards safe abstractions, so that the resulting code is less likely to be confusable. This means that the actors are not just (attacker, defender). Would you like to see that addressed as well, or is that secondary to you?