Open littledan opened 6 years ago
Hey Daniel! This seems like a reasonable narrowing of the scope in general, and a better explanation of how the thing might work in practice than I had in https://github.com/mikewest/tc39-proposal-literals#strict-tag-functions. :)
That said, I'd very much like to expose the "literal"
-ness of the string in some way to the platform so that DOM could implement some built-in template tags that required literal strings. That is, one goal is to define some sort of fromLiteral
factor for TrustedUrl
objects, such that TrustedUrl.fromLiteral`This is a literal string`;
could be implemented as part of DOM, with the checks enforced in C++. I assume that if we could access the "literal"
property from JS, we'd be able to do so from C++ as well, though I admit I'm not familiar enough with that bit of Blink's bindings to know off the top of my head how we'd do so.
/cc @koto, @slekies
I believe the literal-ness should be just as visible to the platform as it would be to JavaScript, but it might take some more work in WebIDL and the Blink bindings layer to do the plumbing.
I pulled this suggestion into the document wholesale at https://github.com/mikewest/tc39-proposal-literals#strict-tag-functions. Thank you!
I like this variant a lot, FWIW.
So:
'xy' === 'x'+'y'; //'true
'xy'.literal; //true
('x'+'y').literal ; //false
That's highly disturbing, given the usual semantics of ===
Also consider:
let str=eval(`'${"a"+"b"}'`); //equivalent to eval("'ab'");
str.literal; //presumably true, because the parser sees the litera 'ab'
So, a string value is primitively tagged as "a literal" even though it may have been arbitrarily constructed at runtime. I appreciate that @mikewest is thinking of use cases where eval
has probably been disabled. But a semantics of a fundamental ES type (ie string) should be meaningful in all contexts.
@allenwb In what I'm proposing, those would always be undefined
--the literal
property is just on the first argument for tagged templates.
@littledan OK, but the eval observation still stands.
Also, I guess I don't see how the literalString tag (as shown above) actually helps. First of all, by definition a lexical grammar Template is a literal. So, you literalString function will never throw when used as a tag function of a TemplateLiteral, even if substitutions are provided. At the very least you should throw if any substitution values are provided.
If I understand your suggestion, you want libraries (eg, the web platform) to "ask" that calls to certain use literalString to condition string values passed to the API. EG:
someAPICall(literalSring`first arg`, literalString`second arg`);
A library could ask, about how would it enforce that a client actually does it? I guess, the library would have to provide a tag that both internally did the literal test and then returned a privately branded wrapper object and the API call would then reject any non-wrapped values. EG:
someAPICall(someAPILiteralStringBrander`first arg`, someAPILiteralStringBrander`second arg`);
If that's you intent you probably need to clarify it.
If I understand your suggestion, you want libraries (eg, the web platform) to "ask" that calls to certain use literalString to condition string values passed to the API.
If we ran with @littledan's suggestion, then we'd forget all about the first sketchy proposal in the doc. In that case, I would like libraries (e.g. the web platform) to be able to define template tags that required the first argument to be a literal. Coming back to the TrustedUrl
example in the proposal, we'd have some new WebIDL construct that defined a tag function (let's call it tag
, for grins):
interface TrustedUrl {
// Stuff goes here.
static tag fromLiteral(DOMString url);
};
Developers would poke at it as described:
// This returns a `TrustedUrl` object:
var url = TrustedUrl.fromLiteral`https://an.example/`;
// This throws a `TypeError`:
var url = TrustedUrl.fromLiteral(inputElement.value);
This might be what you meant by "I guess, the library would have to provide a tag that both internally did the literal test and then returned a privately branded wrapper object and the API call would then reject any non-wrapped values."? If so, then yes! Exactly that! :)
(Forgive me for focusing on the library side of things: it's all I know.)
@allenwb The idea is that you could use this literal
getter to check whether it was a real template literal or whether it was an object that a user made to look like one (and simply called the template tag as a function). Maybe we should make literal
return false if you provided any substitution. For the eval concern, I imagine that CSP (i.e., HostEnsureCanCompileStrings returning false) may be able to block that eval.
@mikewest
What I meant is the the value return as url
would have to have been given a privately access brand that says it was generated using literalString.
Also while
// This throws a `TypeError`:
var url = TrustedUrl.fromLiteral(inputElement.value);```
would indeed throw, ~I could write a function (using techniques similar to what Dan showed for literalString
) that could be used to by-pass that check:~
~// This won't throws a TypeError:~ ~var url = TrustedUrl.fromLiteral(masqueradeAsLiteral( inputElement.value));~
~If it can be bypassed that simply it's not really security but more on the order of training wheels. Is that good enough? ~
see https://github.com/mikewest/tc39-proposal-literals/issues/2#issuecomment-345755257
I might not be fully understanding what's going on here, but I do think there's at least some confusion where people are not properly separating the internal slots level and the JS API level. Web platform APIs like TrustedURL.fromLiteral
would consult unforgeable internal slots. There's no need for them to be written as Dan has done.
As usual, we can have a getter exposing that, but it'll be forgeable without safe-meta-programming. Conversely, you should not be able to write a masqueradeAsLiteral
function that somehow fools the internal slot check.
@domenic Yes, that's what I meant. I don't see how @allenwb 's examples would evade things.
@allenwb I don't think strings need any more complexity. The technique I was showing wasn't a bypass technique, but rather a technique to have a reliable internal slot checker to be used against the template object.
Ok, I agree that it would be possible in GetTemplateObject to brand all template objects by giving them an internal branding slot. We'd also have to expose a brand check function (Array.isTemplate since template objects are arrays??).
That should be sufficient for library code to unforgably test whether an object is a "real" template object created by evaluating a TemplateLiteral.
Why would you need the second function? The own getter which checks for a literal brand would be insufficient?
It's mostly a difference in ugliness. Compare (all other things being equivalent):
let getLiteral = Object.getOwnPropertyDescriptor((_ => _)``, "literal").get;
function literalString(strings, ...keys) {
if (!getLiteral.call(strings)) throw new Error();
return String.raw(strings, ...keys);
}
to
function literalString(strings, ...keys) {
if (!Array.isTemplateLiteral(strings)) throw new Error();
return String.raw(strings, ...keys);
}
Why define something in a way that demands use of such an ugly, error prone idiom?
I agree that Array.isTemplateLiteral
is prettier. As long as it's also unforgeable, that sounds like a fine approach to me.
Oh, I see. I wasn't really designing for prettiest, just something simple that would work reliably and be easy to implement. However, it comes with the major pifall that you could accidentally use the literal getter directly, even as that is forgable.
Unfortunately, Array.isTemplateLiteral is forgable by either replacing that function with another one or by replacing Array on the global object.
Safe metaprogramming is never going to be pretty. Everything in this thread has so far been susceptible to forgery, sometimes by replacing Function.prototype.call. We shouldn't worry about the prettiness when trying to be unforgeable. The platform will just brand-check, and people working in a hostile environment that need safe metaprogramming will write ugly code. People working in a non-hostile environment should get a reasonable syntax, so I prefer the getter for the surface syntax.
Array.isTemplateLiteral is no more or less forgabe then any other built-in. Early capture is your friend.
It's also easy to validate if you are using a good Array.isTemplateLiteral:
const isLit = Array.isTemplateLiteral;
const nonlit = [""]; nonlit.raw=Object.freeze([""]);
Object.freeze(nonlit);
if (!isLit`` || isLit(nonLit)) throw Error("bogus Array.isTemplateLiteral");
or you could provide a unforgable platform specific version of Array.isTemplateLiteral
or the isTemplateLiteral function could be made available using an built in module
The problem with the original suggestion of a literal
getter on template objects is that it is an attractive nuisance for usage that is susceptible to forged templates:
const maybeLit= {get literal() {return true};
if (maybeLit.literal) {
//ok we "know it is a template literal (or we...)
@domenic Oh, good point. Well, that could be addressed by instead having the literal function be an own data property, and have the function take a template as an argument rather than receiver. Then you avoid Object.getOwnPropertyDescriptor and Function.prototype.call. Maybe this would also help with the attractive nuisance problem which @allenwb and I mentioned above a little bit.
As previous mentioned, this whole style of testing is only valid if eval
is not available. Any platform that is able to unilaterally decide that eval
is unavailable should be able to also make Array.isTemplateLateral unforgable.
As I tried to explain above, I'm not worries about attractive nuisances. If you're looking for good enough, use .literal (getter is fine, no need for data property). If you're looking for safe meta-programming, save functions and write ugly code. (You can save an "uncurry-this"ed version of the getter.) If you're the platform, use internal slots.
As such I think we should use a surface syntax which reflects the programming model. It's a property of the template object whether or not it's from a literal. Thus a getter on the template object seems best, instead of some Array-wide predicate.
Putting the forgeability and eval
issues aside, how big of a problem would it be to be able to check for "literalness" of the standard, raw strings, not just string literals?
For example, assuming Array.isTemplateLiteral
is not forged, can we have:
let lit = 'string'
Array.isTemplateLiteral([lit]) // true
let notlit = 'string' + Math.random()
Array.isTemplateLiteral([notlit]) // false
The reason I'm asking is that one important use case fro literalness is to be able to distinguish the provenance of a String
in web applications. If the string was created as a literal, it's a strong signal that the string is wholly author-controlled - if it's not, it might have been constructed of user-controlled source, which may cause a DOM XSS vulnerability, when used with some DOM XSS sink functions/properties like Element.innerHTML
.
Beign able to "trust" raw string literals (not template literals only) allows us to create a more secure DOM API, while at the same time significantly reduce its adoption costs, as most of the existing calls already use (presumed safe) literals. So, for example, we could have:
anEl.innerHTML = 'https://an.example/'; // does not throw
enEl.innerHTML = inputElement.value; // throws
which does not require rewriting all of the codebase.
Would it be a problem to ask library users to use template string syntax, rather than literal strings? Could WebIDL expose template tags? If so, maybe this would be easier to specify and implement as a property of templates, rather than of strings.
Here's the API surface I'm imagining: the first argument to a template tag would have an additional property,
literal
, which indicates whether the string provided as an argument to the template was a literal. We could make this unforgeable by representing this as an internal slot, withliteral
as an own getter, so you could do something like this to prevent an attacker from calling your template with any old object that has aliteral: true
property:This
literalString
template tag is likeString.raw
but would throw an exception if you don't pass a literal. It outputs a normal string, with no trace any more that it was literal. This two-liner could be the recommended way to set off a call to something that requires a literal string. It's not possible to compromise the string literal contents because thestrings
object (and its innerraw
object) are frozen.