Closed littledan closed 4 years ago
Provenance of literal strings will I think continue to be an essential building block.
While it is true that for many types, such as TrustedURL, it's possible to apply run-time sanitization or escaping in many common scenarios, there are other situations where that's not practical. In particular, for TrustedScriptURL it's often not practical to state a runtime predicate that determines if a given string is in fact a URL to a trustworthy script resource. In those cases, using the property that a value is constructed from trustworthy, programmer-controlled components is very useful.
For reference, the Closure library's corresponding TrustedResourceUrl type has factory functions to create values from literals and from literal format strings (with certain constraints). Closure currently relies on a compiler check to ensure the "compile-time-constness" primitive.
The TemplateLiteral mechanism would be an elegant way to achieve this without having to rely on a compiler check.
Note that it might not be necessary to directly reference the TemplateLiteral mechanism in the TrustedTypes standard. Rather, TemplateLiteral can be used as a building block when implementing a TrustedTypes policy; in particular the part of a policy that produces TrustedScriptURL. This would be very useful when implementing a policy that provides TrustedTypes builders and factory methods analogous to the ones in Closure.
Thanks for the fast feedback. If you think this will be useful, I'll add it to the next TC39 meeting's agenda to see if we can get quick consensus on it.
That's an interesting idea to use this as part of a TrustedTypes policy. This may need some tweaks to make it usable, both on the details of the TrustedType constructors, as well as requiring that we have a JavaScript API to check literalness, as I previously proposed, but omitted in the more recent PR (to side-step the discussion of where to put that function).
What do you see as the MVP here? From a TC39 perspective, just exposing the internal slot is a more minimal change, but I can also see how delegating the choice to policies is more minimal from a TrustedTypes perspective (as well as more flexible long-term).
@koto and @mikewest might have a better sense what an MVP would look like.
A self-contained JS API that can be used to assert literalness would seem desirable to me: Rather than tying the notion of literalness directly into web-platform-defined semantics of TrustedTypes, it'd provide a nicely orthogonal and independent building block.
Also adding @mikesamuel who has been looking at these considerations in the context of literal HTML template snippets.
I think the idea of a "real" template object is a great one.
I think most uses of transpilers now output a language version with tagged template literals, so the language in
12.2.9.4 Runtime Semantics: GetTemplateObject
The abstract operation GetTemplateObject is called with a Parse Node,
shouldn't present too much of an obstacle to transpilers preserving the literalness of a tagged template.
Anything based on the identity of the template object should survive Proxy tricks like
const url = sneaky`https://example.com`;
function sneaky(strings) {
  // Wrap the template object in a Proxy that lies about its first element.
  return TrustedURL(new Proxy(
    strings,
    {
      get: function (target, key) {
        if (key === '0') { return 'javascript:alert(document.domain)'; }
        return Reflect.get(target, key);
      }
    }));
}
Would there be a concern if I were able to save a "real" template object and then pass it into another tag function without using the syntax?
@ljharb, I don't think so.
I think it'd be fine if TrustedHTMLWrapper`...` called through to TrustedHTML and then wrapped the result with more metadata derived from strings.
IIUC, the TT project goals do not include preventing malicious third party libraries from abusing their clients' trust, so it's ok that a dev could write TrustedURL4Realz`...` and might believe that TrustedURL4Realz creates a TrustedURL when it actually uses the more privileged TrustedResourceURL.
To the Proxy trick, the internal slot would not be present on any Proxy, so the TrustedURL constructor could throw an error.
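The point about Proxies can be illustrated in userland by modelling the internal slot as membership in a WeakSet. This is only a sketch under that assumption: `brand`, `capture`, and `checkedTag` are hypothetical names, and the proposal in tc39/ecma262#1350 would have the engine record the slot rather than the program populating a set itself.

```javascript
// Hypothetical stand-in for the proposed internal slot: a WeakSet of
// branded objects. A Proxy is a distinct object, so it is never a member,
// no matter what its traps return.
const brand = new WeakSet();

// A tag that brands the strings array it receives. Here the program does
// the branding; under the proposal the engine would.
function capture(strings) {
  brand.add(strings);
  return strings;
}

// A consumer that refuses anything lacking the brand.
function checkedTag(strings) {
  if (!brand.has(strings)) {
    throw new TypeError('expected a real template object');
  }
  return strings[0];
}

const real = capture`https://example.com`;

// The forgery wraps a mutable copy: template objects are frozen, and a
// `get` trap may not misreport a non-writable, non-configurable property
// of its target.
const forged = new Proxy([...real], {
  get(target, key, receiver) {
    if (key === '0') { return 'javascript:alert(document.domain)'; }
    return Reflect.get(target, key, receiver);
  },
});

console.log(checkedTag(real)); // the original URL string
try {
  checkedTag(forged);
} catch (e) {
  console.log(e.message); // the forged object is rejected
}
```

The check depends only on object identity, so nothing the Proxy's traps report can make it pass.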
What if we do this in two phases, with Phase 1 being expose the literal-ness to web builtins, and Phase 2 being expose it to JS APIs as a building block? I think both will be useful, but I don't want to block this capability of trusted types on the details of the API for JS developers.
Re the proxy trick, @koto can comment on how important it is to prevent that, but if it requires no extra work, great.
This might affect membrane transparency, but most of the use cases I've seen for membranes are for security reasons so requiring the membrane author to explicitly open a hole for TT might be a feature.
The membrane author would have to open a hole by giving the original, underlying template object. Not sure how well this would suit membrane systems.
@littledan, it only matters in the case where the template object is on one side of the membrane and the tag's referent is on the other side. The membrane author can either open a hole for the tag's referent one way, or open a hole for the template object the other way.
But I think typically they'd do as you say since the template object is deeply frozen.
@littledan, unless isRealThingamabobber is on the same side of the membrane as the template tag's referent, in which case they don't need to open any holes. Membranes hurt my brain.
There is still an interest in tracking literal strings - assuring that values have been created from syntax is good for TT. At the very least it allows us to create TT policies that are easier to reason about (and this is crucial for the actual security benefit of the policies-based API).
We might be able to leverage that mechanism as well in next iterations of the API - either to define platform-built-in policies ('builtin-trustedurl-fromliteral'), or type constructors directly (TrustedURL.fromSameOriginURL`/path/only?foo=${bar}`), but for the time being I think it's best to use it as a building block for actual userland policy implementations.
All of that barring the actual JS check API being available, but the Phase 1 is a great direction, thanks for following up on that!
Like @mikesamuel said, reusing the template literal with another tag is a minor concern for TT, and we could work around that, but still - if it's simple to avoid, and it's just a spec change with negligible implementation differences, why not? I lack the expertise to know if that's the case though.
Thanks for the feedback; I will think more about how we could avoid this.
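We discussed the PR https://github.com/tc39/ecma262/pull/1350 in the November 2018 TC39 meeting. Delegates expressed many concerns about this direction for the trusted types proposal, including:
- Many pointed out that the literal-ness guarantee means nothing if eval is possible. I was assuming it would be possible to disable eval through CSP, but it was pointed out that correct adoption of CSP has been slow. I agree that it's not useful to think about what's literal if CSP is not enabled.
- @natashenka wondered how great the uptake of trusted types would be, given that they require a new header. Should it be required for some new features to generate higher adoption?
- @jridgewell suggested that we make a stronger guarantee, that the right tag is used. We had a breakout session to consider this idea more deeply, but got stuck on difficulty inspecting the stack when tail call optimization may occur.
- @erights expressed concern that keying off of an internal slot violates Proxy transparency. I don't quite understand this concern, as many many things in JS use internal slots, or similar mechanisms (e.g., @erights' proposed mechanism for private fields, now at Stage 3, does not respect Proxy transparency).
However, if you want to go in this direction anyway, it's possible to do without a change from TC39. Layering changes like the one I proposed in https://github.com/tc39/ecma262/pull/1350 are intended to clean things up, but HTML and the Web Platform make wide use of things which are "associated with" JavaScript objects, without "properly" layering the changes into ECMAScript. Multiple TC39 delegates pointed out that the Web Platform doesn't need permission from TC39 to make a change like this; the layering change would just make things cleaner.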
The trusted types specification could use wording like, "If obj is a template object which was produced from the GetTemplateObject operation, ..." to distinguish template objects from other objects. The trusted types specification could package this up into a function which is passed to the trusted types policies.
proposed mechanism for private fields, now at Stage 3, does not respect Proxy transparency
Ah, now I understand the surprise. I meant membrane transparency.
Reasoning about program equivalence should not be reasoning about equivalence of semantic state. I was wrong on this. It should be reasoning about possible observation and possible differences in observation.
Thus it is a bug that the tc39 spec is silent on what JavaScript states hosts can distinguish. Hosts should not be free to make distinctions that we have, by design, made indistinguishable. Otherwise no reasoning about, for example, correctness-preserving transformations is possible.
@erights Maybe the first step would be documenting the motivation for making them indistinguishable, to persuade hosts not to distinguish them? I don't understand this motivation, personally.
Ah, now I understand the surprise. I meant membrane transparency.
What do you think we'd need for membrane transparency, that differs between the internal slot here and private fields?
Not speaking about the specific template here, but rather about the general issue: @domenic made a good proposal at a recent tc39 meeting (link?) that would have new internal slots not be visible cross-realm. This would make them equivalent to the weakmap-like or weakset-like model of branding and private state, and so be transparent across membranes.
For this specific case, this would make it equivalent to branding via a per-realm weakset. This is likely fine. At least it would not have any fatal problems that come to mind.
(Even in your CSP usage, since CSP controls are per realm, you probably need this branding to be per realm anyway for the security properties you seek, if you're going to get them via CSP. But as noted, we should not use CSP for security.)
(Offtopic, but: How are cross-realm brands not like a WeakSet? To me, it just seems like a WeakSet that's shared between multiple realms. I don't understand how this differs with respect to membranes.)
(Good question. There are no cross-realm weaksets. One of the important invariants of realms is that a fresh realm is isolated from everything else --- no shared mutable state --- except that the creator has access to the created. That's one of the reasons a membrane can transparently emulate a realm boundary and provide strong separation: there's nothing that's already on both sides of the membrane. Don't cross the streams!)
We discussed the PR tc39/ecma262#1350 in the November 2018 TC39 meeting.
Thanks for featuring this, Daniel!
Delegates expressed many concerns about this direction for the trusted types proposal, including:
- Many pointed out that the literal-ness guarantee means nothing if eval is possible. I was assuming it would be possible to disable eval through CSP, but it was pointed out that correct adoption of CSP has been slow. I agree that it's not useful to think about what's literal if CSP is not enabled.
It's definitely less useful in the presence of eval (though I understand it might be a fundamental problem from the language spec perspective). That said, host environments can already control that via HostEnsureCanCompileStrings, which the web platform already uses, now in the form of CSP. Additionally, there's still value in adding an Array.isTemplateLiteral-like feature for practical reasons. Many applications already have a way of guarding or limiting eval when building the code (e.g. via linting, or statically asserting that only developer-controlled values reach the function), even if it's enabled. The guarantees provided in the language would of course always be 'modulo eval', so to speak, but that can be made clear in the spec itself.
In practice for Trusted Types, they would operate in the environment where disabling eval is feasible and recommended, but even outside of TT I think Array.isTemplateLiteral (and similar checks) would add value despite eval.
- @natashenka wondered how great the uptake of trusted types would be, given that they require a new header. Should it be required for some new features to generate higher adoption?
For the time being, it's unlikely that the web platform would guard new features on Trusted Types enforcement for a realm. As for the uptake, we're on it ;) Types need to be an opt-in feature, or otherwise we would 'break the web', but at the same time the plan is to gradually add support in JS libraries and applications (the API is present and is backwards compatible with the DOM sinks) before enabling the enforcement - this presentation has some details on the approach.
- @jridgewell suggested that we make a stronger guarantee, that the right tag is used. We had a breakout session to consider this idea more deeply, but got stuck on difficulty inspecting the stack when tail call optimization may occur.
ACK, thanks for exploring that.
- @erights expressed concern that keying off of an internal slot violates Proxy transparency. I don't quite understand this concern, as many many things in JS use internal slots, or similar mechanisms (e.g., @erights' proposed mechanism for private fields, now at Stage 3, does not respect Proxy transparency).
However, if you want to go in this direction anyway, it's possible to do without a change from TC39. Layering changes like the one I proposed in tc39/ecma262#1350 are intended to clean things up, but HTML and the Web Platform make wide use of things which are "associated with" JavaScript objects, without "properly" layering the changes into ECMAScript. Multiple TC39 delegates pointed out that the Web Platform doesn't need permission from TC39 to make a change like this; the layering change would just make things cleaner.
I still have the preference for cleaner changes, especially given that I think literalness guarantees would be valuable outside of Trusted Types, and, in general, web platform concerns. + @mikesamuel who might be able to use them for the module keys? But it's valuable to know that there might be a Plan B and have the details here, thanks!
The trusted types specification could use wording like, "If obj is a template object which was produced from the GetTemplateObject operation, ..." to distinguish template objects from other objects. The trusted types specification could package this up into a function which is passed to the trusted types policies.
Please let me know if there's any other way I can help!
- Many pointed out that the literal-ness guarantee means nothing if eval is possible. I was assuming it would be possible to disable eval through CSP, but it was pointed out that correct adoption of CSP has been slow. I agree that it's not useful to think about what's literal if CSP is not enabled.
From a security standpoint, eval(s) is roughly equivalent to assignment to HTMLScriptElement.text. TT already constrains the latter to values of TrustedScript. See https://github.com/WICG/trusted-types/blob/1f6f30111240a055ea843e4d95de6fa3d5134e65/src/enforcer.js#L118
Ideally, eval would be treated the same -- with TT enabled, eval(s) would require s to be of type TrustedScript.
I.e. we don't need to require that eval is disabled completely via CSP. Instead we retain the flexibility to call eval, but only on scripts that are safe per the application's TT policy.
With that in place, reasoning about security essentially becomes an inductive argument: As long as eval (and all the other script execution sinks) are only ever called with instances of TrustedTypes that satisfy their security type contracts, the application is secure. And as long as the application is secure, instances of TrustedTypes can be assumed to conform to their security contracts.
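The idea of an eval sink that only accepts typed values can be sketched in plain JavaScript. Everything here is an assumption for illustration: `TrustedScriptSketch`, `createScript`, and `guardedEval` are hypothetical names, not the Trusted Types API, and the toy policy (arithmetic only) stands in for whatever an application's reviewed policy would allow.

```javascript
// Hypothetical stand-in for TrustedScript; not the real Trusted Types API.
const policyToken = Symbol('policy-only');

class TrustedScriptSketch {
  #source;
  constructor(token, source) {
    // Only code holding the token (the policy below) can mint instances.
    if (token !== policyToken) {
      throw new TypeError('TrustedScriptSketch can only be created by a policy');
    }
    this.#source = String(source);
  }
  toString() {
    return this.#source;
  }
}

// The one reviewed place where strings become trusted script.
// This toy policy admits only arithmetic over digits.
function createScript(source) {
  if (!/^[\d\s+\-*\/().]+$/.test(source)) {
    throw new TypeError('policy rejected script');
  }
  return new TrustedScriptSketch(policyToken, source);
}

// An eval sink that accepts only values minted by the policy.
function guardedEval(script) {
  if (!(script instanceof TrustedScriptSketch)) {
    throw new TypeError('eval requires a TrustedScriptSketch');
  }
  return (0, eval)(script.toString()); // indirect eval: runs in global scope
}

console.log(guardedEval(createScript('1 + 2'))); // 3
```

The inductive argument then reduces to reviewing `createScript`: if it only ever returns safe script, every call to `guardedEval` is safe.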
@jridgewell suggested that we make a stronger guarantee, that the right tag is used. We had a breakout session to consider this idea more deeply, but got stuck on difficulty inspecting the stack when tail call optimization may occur.
My suggestion is that we allow the invoked function to determine how it was invoked, e.g. (pretend this is all literal source text in some .js file somewhere):
- fn(), fn should be able to tell that it was a CallExpression that invoked it
- fn`template`, TaggedTemplateLiteral
- @fn() class {}, Decorator
The natural way to do this would be to inspect the current stack frame, provided the frame could be associated with a source location. But the summary is correct, it's difficult when there's TCO.
Fyi, it is possible to do this in user code if you're willing to do terrible things.
The eval test at https://gist.github.com/mikesamuel/24de218b7ba8d7c3d962165061c9c8f3#file-who-is-calling-me-js-L25 will fail in modern browsers (I think).
const getRaw = (strings) => strings.raw;
assert(getRaw`hello` !== getRaw`hello`);
assert(getRaw`hello` !== eval('getRaw`hello`'));
@jridgewell Thanks. I'd forgotten whether strings was hoisted per callsite or per Realm.
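On the per-call-site question: in engines that follow ES2018 or later, the template object is cached per call site, not per realm by content. A small check, with `getStrings` and `sameSite` as illustrative names:

```javascript
const getStrings = (strings) => strings;

// One call site, evaluated twice.
function sameSite() {
  return getStrings`hello`;
}
const first = sameSite();
const second = sameSite();

// Two distinct call sites with identical text.
const siteA = getStrings`hello`;
const siteB = getStrings`hello`;

console.log(first === second); // true: same call site, same template object
console.log(siteA === siteB);  // false: different call sites
console.log(Object.isFrozen(first) && Object.isFrozen(first.raw)); // true
```

This is why an identity-based check can tie a strings array to a specific place in the source text.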
@jridgewell I can see how this interacts with TCO, but we don't need full stack introspection. Just an enum available within the scope of the function.
If TCO converts
function add(m, n) {
  if (m === 0) { return n; }
  return add(m - 1, n + 1);
}
to the structural equivalent of
function add(m, n) {
  for (;;) {
    if (m === 0) { return n; }
    // Overwrite parameters from last call with new values.
    let oldm = m, oldn = n;
    m = oldm - 1, n = oldn + 1;
  }
}
is it not sufficient to set the call type just before jumping back to the top?
function add(m, n) {
  for (;;) {
    if (m === 0) { return n; }
    let oldm = m, oldn = n;
    m = oldm - 1, n = oldn + 1;
    // Not seriously proposing this syntax
    arguments.callType = 'CallExpression';
  }
}
I suggested something very similar. I think someone said that it would cause overhead for all function calls, because there's now a new property that has to be added to all arguments (even if it's not directly exposed to the developer).
@jridgewell Can you recall who might have said that? I'd like to follow up.
I can see how it might affect all calls to functions that mention arguments, but not all calls unless the call type is not readily apparent via introspection to privileged code.
I was imagining something like:
arguments
which happens before the instruction addr to which any TCO loops back.
arguments.callType
Likely @littledan?
Yeah, it was me. Maybe we should chat about this at TC39; I don't understand how @jridgewell's idea is viable at all.
I wonder if static decorators could help us here. I also wonder if tagged templates are good enough to make a real improvement already.
But I understand that the trusted types proposal has moved on from this initial literal-ness goal to ensuring that user-defined sanitation functions are called, so maybe all the discussion we are having here would be better for a different proposal.
Maybe we should chat about this at TC39
@littledan, That'd be great.
I chatted with @littledan about the threat model here.
Do you (@koto, @xtofian) agree with these assertions:
(1) eval('TrustedURL`' + str + '`') is not a concern, since TT should block non-TrustedScript reaching eval, and we already assume that attackers don't control TrustedScript values. IIUC, xtof said as much.
(2) Our threat model does not include protecting one trusted developer from another trusted developer.
// A trusted author provides a confusing tag API.
/**
* Returns an ASCII-art picture of a cat that says something in a speech bubble.
*/
function CatPicture(strings) {
  use(TrustedURL(strings));
  return `🐈<(${ strings[0] })`;
}
Elsewhere
// I, a trusted author, misconstrue what CatPicture does.
alert(CatPicture`javascript:meow()`);
// IIUC, Tricking me doesn't let the author of CatPicture do anything they couldn't.
I think this is what Dan was referring to earlier that we don't need to know which tag the strings are used with.
I think we have a resolution on this thread for the proxy trick.
If those two are DCs, then I agree with Dan that a boolean IsTagStringsArray check would be sufficient. Basically, TrustedType when called would check whether its argument is a strings array.
@mikesamuel Yes, I agree with (1) (eval'ing expressions that are not constructed through a reviewed TT policy is inherently unsafe).
(2) is a bit more subtle (and TBH I'm afraid I don't follow the example): the correctness/safety of an implementation of a template tag function that involves TT might or might not rely on the actual literalness of the template.
For example, one might define a tagged template to make TrustedURLs to be used like so,
const myURL = trustedURL`https://example.com/${path}?p=${p}`;
where the implementation of trustedURL simply passes the result of interpolating the string through a sanitizing policy for TrustedURL. It would be perfectly safe to call trustedURL(strings, arg1, arg2) with strings whose elements come from an untrusted input, because the result of the interpolation is passed to a sanitizing policy that is safe for any input. In other words, this is really a shorthand for,
const myURL = TrustedTypes.getExposedPolicy('sanitizer').createURL(`https://example.com/${path}?p=${p}`);
and createURL of an exposed policy must be safe for arbitrary (even completely untrustworthy) inputs.
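A minimal sketch of such a tag, assuming a hypothetical `sanitizeUrl` function in place of an exposed policy's createURL. It returns a plain string rather than a TrustedURL, and the scheme allow-list is only one possible sanitizing rule:

```javascript
// Hypothetical sanitizer standing in for an exposed policy's createURL.
// It must be safe for arbitrary input, so it allow-lists URL schemes.
function sanitizeUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return 'about:invalid'; // not an absolute URL at all
  }
  return parsed.protocol === 'https:' || parsed.protocol === 'http:'
    ? parsed.href
    : 'about:invalid';
}

// Tag function: interpolate first, then sanitize the whole result.
// Literalness of `strings` is irrelevant here; the sanitizer carries
// the entire safety argument.
function trustedURL(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    out += String(value) + strings[i + 1];
  });
  return sanitizeUrl(out);
}

const path = 'docs';
const p = '42';
const myURL = trustedURL`https://example.com/${path}?p=${p}`;
console.log(myURL);

// Calling the tag as an ordinary function with attacker-controlled
// "strings" is still safe, because the result is sanitized.
console.log(trustedURL(['javascript:alert(1)'])); // about:invalid
```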
On the other hand, if we wanted to implement a TrustedTypes-aware version of https://github.com/Polymer/lit-html, to be used like,
el.innerHTML = trustedHTML`<a href="${url}">${linkText}</a>`;
then the implementation of trustedHTML critically relies on the fact that the template snippets really are literals. In other words, the policy that underlies the implementation of trustedHTML should be able to ensure that the template snippets were indeed literals.
The policy that underlies the implementation of trustedHTML should be able to ensure that the template snippets were indeed literals.
The plan is to provide a boolean function that reliably answers "Is this an array of template literal strings?"
I don't know how you're planning to get the strings to a policy, but is that sufficient?
Yes, that would be sufficient. The implementation of trustedHTML would likely use a not-exposed "unchecked-conversion"-style policy, but that's fine.
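To make the dependency on literalness concrete, here is a sketch of a trustedHTML-style tag. No engine-provided literalness check is assumed to exist, so `looksLikeTemplateObject` is a hypothetical, spoofable heuristic that only marks where the real boolean check would go. The escaping is also deliberately simple and is not context-aware (it would not, for instance, vet a URL placed in an href).

```javascript
// Heuristic placeholder for the proposed literalness check. Any frozen
// array with a frozen `raw` array passes, so this is NOT a security
// boundary; the real check would be provided by the engine.
function looksLikeTemplateObject(value) {
  return Array.isArray(value) &&
    Object.isFrozen(value) &&
    Array.isArray(value.raw) &&
    Object.isFrozen(value.raw) &&
    value.raw.length === value.length;
}

const ESCAPES = {
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;',
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

// Static chunks pass through unescaped, which is only sound if they
// really came from source text; interpolated values are escaped.
function trustedHTML(strings, ...values) {
  if (!looksLikeTemplateObject(strings)) {
    throw new TypeError('trustedHTML must be used as a template tag');
  }
  let out = strings[0];
  values.forEach((value, i) => {
    out += escapeHtml(value) + strings[i + 1];
  });
  return out;
}

const linkText = '<script>alert(1)</script>';
console.log(trustedHTML`<b>${linkText}</b>`);
```

A direct call such as trustedHTML(['<script>...']) is rejected by the guard, which is the property the thread wants the platform to guarantee reliably.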
If by "array of template literal strings" you mean the original Array.isTemplateLiteral, that would be used inside the tag function to verify the arguments, that is sufficient. I'm afraid I didn't follow the example with the cats (what does the use function do?).
We could also benefit from String.isLiteral, to be able to check literalness of standard strings, but that's a completely different issue.
Cool, so it sounds like Array.isTemplateLiteral would be great to have and a realm bounded String.isLiteral would also be useful.
Talking with Dan, it sounds like Array.isTemplateLiteral would be easy to spec.
I imagine it would require a builtin that bottoms out on something like
The abstract operation IsTemplateObject is called with a value, strings, as an argument. It performs the following steps:
I'm not sure how to spec String.isLiteral since we'd want to make sure that other Realms can't affect the behavior of String.isLiteral in the current realm.
If by "array of template literal strings" you mean the original Array.isTemplateLiteral, that would be used inside the tag function to verify the arguments, that is sufficient.
Yes, exactly.
We could also benefit from String.isLiteral, to be able to check literalness of standard strings, but that's a completely different issue.
If we have literalness for template strings, then String.isLiteral is more of a nice-to-have, since one can then define a template literal that does nothing but attest to the literalness of the template string. Something like,
functionThatRequiresTrustworthyLiteralString(lit`A literal string`)
lit would return a type StringLiteral to represent a string that was constructed from a (template) literal.
(This is basically the same idea as the goog.string.Const type in Closure, except that construction is constrained to literals using the template literal mechanism instead of a Closure-compiler check on the usage of Const.from.)
This approach might actually be preferable, since it expresses the requirement for a string literal parameter in a function signature by using a straightforward type (rather than some special property of String that can only be asserted at runtime). This would work nicely and out of the box with static type systems for JS (TS, Closure).
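A sketch of what such a lit tag and StringLiteral type could look like. The class shape, the `mintToken` guard, and the frozen-array heuristic are all assumptions for illustration; the heuristic is spoofable and marks the spot where an engine-provided literalness check would be called.

```javascript
// Only `lit` holds this token, so only `lit` can mint StringLiteral values.
const mintToken = Symbol('lit-only');

class StringLiteral {
  #value;
  constructor(token, value) {
    if (token !== mintToken) {
      throw new TypeError('StringLiteral can only be created via the lit tag');
    }
    this.#value = value;
  }
  toString() {
    return this.#value;
  }
}

// Tag that attests to literalness and does nothing else.
function lit(strings, ...values) {
  // Placeholder heuristic; the proposed builtin check would replace this.
  const plausible = Array.isArray(strings) && Object.isFrozen(strings) &&
    Array.isArray(strings.raw) && Object.isFrozen(strings.raw);
  // Interpolations would smuggle in non-literal data, so reject them.
  if (!plausible || strings.length !== 1 || values.length !== 0) {
    throw new TypeError('lit accepts only a template literal without interpolations');
  }
  return new StringLiteral(mintToken, strings[0]);
}

// The requirement is expressed as an ordinary parameter type.
function functionThatRequiresTrustworthyLiteralString(text) {
  if (!(text instanceof StringLiteral)) {
    throw new TypeError('expected a StringLiteral');
  }
  return text.toString();
}

console.log(functionThatRequiresTrustworthyLiteralString(lit`A literal string`));
```

Because the requirement is a nominal type, a static checker for TS or Closure could enforce it at the call site without any runtime predicate on String.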
I'm afraid I didn't follow the example with the cats (what does the use function do?).
It was a poorly thought out example.
The CatPicture tag does what its documentation promises, but unknown to its caller, it also creates a trusted value that it uses for non-cat-picture related purposes.
then define a template literal that does nothing but attest to the literalness of the template string
Good point.
Perhaps, when TrustedHTML is called as a tag, it could be such a function.
If we have literalness for template strings, then String.isLiteral is more of a nice-to-have, since one can then define a template literal that does nothing but attest to the literalness of the template string.
Yes, this is exactly the design I was aiming towards! And the lit tag can be defined within this proposal, if you'd like.
We could also benefit from String.isLiteral, to be able to check literalness of standard strings, but that's a completely different issue.
If we have literalness for template strings, then String.isLiteral is more of a nice-to-have, since one can then define a template literal that does nothing but attest to the literalness of the template string.
What I meant was that it would be possible to use a function in pre-ES6 code (I believe the majority of the code in web pages nowadays is still transpiled down to a version that does not support template literals). But that's probably moot anyway - the transpilation would have to polyfill any .isLiteral function anyway, and the polyfill will have to regress on security.
Yes, this is exactly the design I was aiming towards! And the lit tag can be defined within this proposal, if you'd like.
@littledan - defining the lit tag as part of the proposal would be nice, since then libraries can rely on its presence without having to depend on some other library to define it, or each define their own version of it, which could be confusing.
But if it for some reason turns out to be contentious, it could be dropped since it's not essential.
I expect that defining a tag in this proposal is less likely to be contentious than defining a predicate in TC39.
If an initial goal of this proposal was to restrict usages of trusted types to literal strings, unless an explicit escape hatch were used, I believe this would be possible using a slight template literals.
The syntax at the usage site would be something like,
Using https://github.com/tc39/ecma262/pull/1350, the implementation of TrustedURL would check whether the template object passed into it was a "real" template object present in the program or not. Coupled with CSP, this would prove whether the string came from a tagged template in the author's program (but, it could be that a different tag was originally used).
Now that this proposal has been developed further, is there still interest in checking for literal strings? The new sanitizer policy direction seems great to me, but it seems like proving literal-ness would be a complementary benefit.