Closed lukewarlow closed 10 months ago
There's also currently spec issues with this functionality see https://github.com/w3c/trusted-types/issues/374
At salesforce, we are ok deferring fromLiteral
.
Wondering if deferring fromLiteral
would prevent web-developers from enforcing to forbid trusted type policies.
That's because Content-Security-Policy: trusted-types 'none'; require-trusted-types-for 'script'
allows the usage of fromLiteral
. Hence simplifying to transition away from trusted types policies to non-XSS APIs instead. The latter is mentioned as a goal in https://github.com/mozilla/standards-positions/issues/20#issuecomment-1783279722.
Here, experience from people having actually applied TT could be beneficial.
I think it's definitely a valuable addition to the API in terms of simplifying certain code paths. But I think given Google and other's success without it (it's not shipped in Chrome) + the implementation difficulties + lack of polyfillability (which I think is quite important) it's probably worth defering.
I agree the end goal is to just do trusted-types;
(or trusted-types 'none'
) and just not using XSS APIs. I think the reality of that will depend on the santizer API and whether it effectively counts as a trusted sink. I'm unaware how else you'd use web components for example without having at least some XSS sink usage.
I think it's definitely a valuable addition to the API in terms of simplifying certain code paths. But I think given Google and other's success without it (it's not shipped in Chrome) + the implementation difficulties + lack of polyfillability (which I think is quite important) it's probably worth defering.
Makes sense.
I agree the end goal is to just do
trusted-types;
(ortrusted-types 'none'
) and just not using XSS APIs. I think the reality of that will depend on the santizer API and whether it effectively counts as a trusted sink.
The API of a "trusted sink" is unclear to me. Presumably it's meant that the Sanitizer API produces trusted types. Anyway, don't need to discuss it in this ticket.
I'm unaware how else you'd use web components for example without having at least some XSS sink usage.
Here, experience from people having actually applied TT could be beneficial.
In our experience we quite often encountered a need to generate a constant TT value (most often it was a short HTML snippet, or a constant script URL). We also encountered some JS build tool that outputted code wrapped in a series of eval(a constant string)
calls.
To make sure these are TT compliant, we have used the quite verbose TT policy API (createPolicy
+ createXYZ
calls).
Some of these cases can be refactored not use DOM XSS sinks (e.g. for HTML we could use a series of createElement
, appendChild
), but in some cases it proved non-trivial (e.g. changing the JS compilation steps was very involved) or impossible (e.g. the application absolutely needs loading a script dynamically, from one of the passed constant string values). fromLiteral
is a result of that - had it been available, our migrations would have been much, much simpler. I suspect this will be the case even more when dealing with 3rd party libraries -- getting a patch adding fromLiteral
is much easier than one that creates intermediary policy objects only to use their output values once.
Paraphrasing, TT with fromLiteral
is much more useful to the authors than without - we indeed have completed many migrations without fromLiteral
but would have much preferred if it was available at that time.
What is materially important here is the implementation difficulty (as the spec part is trivial) - are there some estimates of it for non-v8 engines? @otherdaniel, I believe we actually are on the verge of shipping it, is that correct?
are there some estimates of it for non-v8 engines?
I can't speak for Gecko but in WebKit the Ecmascript feature this spec relies on ([TemplateMap] slot in realms) is unimplemented. So it would require changes to JavaScriptCore I'm unable to give an estimate on how long that would take.
As for Chrome I can see that the main issue https://bugs.chromium.org/p/chromium/issues/detail?id=1271149 hasn't been materially changed for over a year now. It's marked as blocked on https://bugs.chromium.org/p/v8/issues/detail?id=13324 which similarly hasn't been touched in over a year.
I can't speak for Gecko but in WebKit the Ecmascript feature this spec relies on ([TemplateMap] slot in realms) is unimplemented.
Wouldn't that be observable? I'm not very familiar with the ES algorithms, but the early return in https://tc39.es/ecma262/#sec-gettemplateobject suggests that a template map is necessary to prevent observable differences between implementations.
I believe this is the implementation of that algorithm in JavaScriptCore https://searchfox.org/wubkat/source/Source/JavaScriptCore/runtime/JSTemplateObjectDescriptor.cpp#58
Edit: It's actually here https://searchfox.org/wubkat/source/Source/JavaScriptCore/runtime/ScriptExecutable.cpp#430
From the ES Spec point of view, I always thought this was going to be a problem if we use the same mechanism that we use TemplateMap
, and even if we use some other mechanism, I'm not very optimistic about it. Few notes:
[[TemplateMap]]
also include dynamic values (the array associate to it), which cannot be supported in TT.[[TemplateMap]]
is used during parsing (via GetTemplateObject
) when a template literal is encountered, but fromLiteral
should support regular strings as well.fromLiteral
with indirect eval? if so, then how do you know when to map those values into the internal slot? that is unclear to me.Okay so after some further investigation JSC does have a template map so that could probably be used, though the spec as is is o(n) which isn't ideal.
Alternatively if we used https://tc39.es/proposal-array-is-template-object/ it would be o(1).
I have an implementation of that (the internal slots bit not the exposed static function) and using that makes fromLiteral
trivial to implement.
@caridy
1 [[TemplateMap]] also include dynamic values (the array associate to it), which cannot be supported in TT.
I don't fully understand what you mean here? The check to make sure there's no variables is separate to the check for a valid template.
2 [[TemplateMap]] is used during parsing (via GetTemplateObject) when a template literal is encountered, but fromLiteral should support regular strings as well.
It can't support normal strings as there's no way to guartunee they're static. fromLiteral can only work with template literals.
3 what about indirect eval? can you use fromLiteral with indirect eval? if so, then how do you know when to map those values into the internal slot? that is unclear to me. 4 To support indirect eval, you will have to map every string during parsing, and every template literal without interpolation, which will probably have some performance implication.
I dont fully understand what this bit is talking about enough to comment.
From the ES Spec point of view, I always thought this was going to be a problem if we use the same mechanism that we use
TemplateMap
, and even if we use some other mechanism, I'm not very optimistic about it. Few notes:
[[TemplateMap]]
also include dynamic values (the array associate to it), which cannot be supported in TT.
https://w3c.github.io/trusted-types/dist/spec/#create-a-trusted-type-from-literal-algorithm step 3 should reject the templates with dynamic values (actually, with any interpolation).
[[TemplateMap]]
is used during parsing (viaGetTemplateObject
) when a template literal is encountered, butfromLiteral
should support regular strings as well.
I'm not sure I understand. The intention is:
TrustedScript.fromLiteral`foo` // works
TrustedScript.fromLieral`foo${''}` // fails
TrustedScript.fromLiteral(..ANYTHING...) // fails
Is the current TT spec not written to that effect?
For reference here's my prototype implementation of this based on https://tc39.es/proposal-array-is-template-object/ for the check templatedness aspect.
https://github.com/lukewarlow/WebKit/commit/f24c4837e66fd6cd205e77848ca65a2c152d0a9f
I was assuming that regular strings were going to be supported based on previous examples, e.g. : eval(TrustedTypes.fromLiteral("1"))
.
I don't know how I feel about this. I will socialize the idea with some folks, but we put a lot of efforts to make template literals to match the semantics of strings, and can be used interchangeable. As far as I can tell, so far we have no API that distinguishes between them, e.g. : foo
=== "foo". To be more specific, this seems very very odd:
eval(TrustedTypes.fromLiteral("1")); // throws or no-op
eval(TrustedTypes.fromLiteral(`1`)); // works
/cc @erights
I was assuming that regular strings were going to be supported based on previous examples
The trouble as I understand it is that the code can't tell the difference between: eval(TrustedTypes.fromLiteral("1"))
and eval(TrustedTypes.fromLiteral("1" + someRandomXssVector))
, while it can tell the difference between
eval(TrustedTypes.fromLiteral(`1`))
and
eval(TrustedTypes.fromLiteral(`1` + someRandomXssVector))
(as the second example is no longer a template string, but just a plain string).
(If I've understood correctly).
@caridy, I'll overcommunicate in case there are misunderstandings. There are 3 static functions (TrustedHTML.fromLiteral
, TrustedScript.fromLiteral
, TrustedScriptURL.fromLiteral
), but for simplicity we can just say fromLiteral
.
// your example
eval(fromLiteral("1"));
eval(fromLiteral(`1`));
Both are the same, since you're calling fromLiteral
as a regular function and its arguments are stringified before the function executes. fromLiteral would throw in both cases. The only way for fromLiteral
not to throw is if it would be called using the fromLiteral`a_string`
syntax, or:
function otherTag(ts) {
return ts
}
fromLiteral(otherTag`foo`)
The property we're after is really that a TemplateStringsArray
was created from syntax, and not by manually constructing an array.
As I understand it the code example below is the thing that got hung up on in tc39 preventing it (the proposal I mentioned above) moving to stage 3? This working is absolutely fine for trusted types and I imagine any other use cases for Array.isTemplateObject do we think presenting this as such would alleviate concerns in tc39?
The only way to spoof this would be using eval which csp and trusted types deals with?
function otherTag(ts) {
return ts
}
fromLiteral(otherTag`foo`)
https://github.com/tc39/proposal-array-is-template-object/issues/10 has the most context, I think, but I believe they were a bit deeper than what we discuss here. The concerns might be alleviated if there is no Array.isTemplateObject
function exposed, which is what we in the end did in fromLiteral
, with the new approach using only the [[TemplateMap]]
slot from ECMAscript in host algos without needing Array
function.
cc @syg
To continue over-communicating. Supporting
function otherTag(ts) {
return ts
}
fromLiteral(otherTag`foo`)
seems unnecessary for web-developers and there's no benefit of using that instead of the shorter:
fromLiteral`foo`
However, supporting the former has no security impact.
Supporting [tag smuggling] seems unneccessary [..] but has no security impact
Correct, that's simply the only way we could identify to support this using the existing spec wiring, so it's "good enough" for our case.
I can see tag smuggling being useful if you've got your own tagged templates such as say lit htmls html tag.
You could just directly forward to fromLiteral and then fallback to sanitising etc and using policies if it's not a literal.
Postponing this issue to v2. In case of concerns, please raise them.
Again, at this point for me it is not really about whether or not it will work, it is about whether or not this is a good model. At TC39, we often look at features from the lens of developers, I like to use the ZPD (Zone of Proximal Development), and I can't see how this will be learned by developers, or how this new knowledge can be transferred to any feature in the future. It feels that we are bending backward or doing gymnastics to try to reuse existing features of the language for the propose of signing strings. What are the alternatives? What about new syntax? What about defer evaluation or module blocks? If the content to be evaluated must be static in order to be considered safe, how is that different from other features that we are exploring to do defer evaluation (proposed by Mozilla)? Just food for thoughts.
That's very valuable context, thanks @caridy.
This all has to be weighted against the actual use case though. The use case is specific to the web platform, as DOM XSS, especially in existing websites is its unique problem. Fixing XSS by requiring to rewrite an application from scratch fails (case in point: unsafe-eval
didn't eliminate eval's usage). Guarding dynamic code execution that is more precise than unsafe-eval
and that doesn't require to throw away eval is a necessity today, and TT have a track record showing that this approach works for the existing websites - more than any other approach tried in the past.
Worth noting, fromLiteral
seems to be both easy in implementation and solves a real problem for web platform, facilitating migrating to safe APIs. From what I understand, the argument isn't even about fromLiteral
per se, but dynamic JS code evaluation hooks, no? Even without fromLiteral
, TT support eval(TrustedScript)
when that value is created through TT policies. To JS it shouldn't matter how a value to be executed was created, because it's the hosts concern to block or allow eval payloads. In that sense, fromLiteral
is just an optimization over web platform's TT policy-based API.
Here is my opinion as a person who deployed Trusted Types outside of Google.
If anyone tries to deploy Trusted Types, they would need a library equivalent to fromLiteral
. This is because there are so many constant string assignments to XSS sinks, that it would be much easier to create a library for it than to add those constant strings one-by-one to a custom TT policy.
However, that creates a problem because developer can bypass the library if they figure out how to bypass the check. This is not a problem for Google, because they have compile-time checks for those unsafe practices, but I'm sure most other companies don't.
This is why I belieave we should have fromLiteral
in the Web Platform, because I don't think all sites that what to deploy Trusted Types has as many resources as Google to build compile-time checks for unsafe practices. Because if they had that much resources, they don't need Trusted Types (i.e. Google had Trusted Types equivalent internally before Trusted Types was available in the Web Platform).
@koto
This all has to be weighted against the actual use case though. The use case is specific to the web platform, as DOM XSS, especially in existing websites is its unique problem. Fixing XSS by requiring to rewrite an application from scratch fails (case in point:
unsafe-eval
didn't eliminate eval's usage).
I hope we can generalize this. I'm still trying to understand the different use-cases, but from what I know so far, a generalization of this seems possible that can be somehow lifted to the language with syntax, and useful for all runtimes.
Worth noting,
fromLiteral
seems to be both easy in implementation and solves a real problem for web platform, facilitating migrating to safe APIs. From what I understand, the argument isn't even aboutfromLiteral
per se, but dynamic JS code evaluation hooks, no?
No doubt that in its current form, it is easy to implement, and it is going to be useful right away, that's not the concern. Having the right JS code evaluation hooks is the goal, yes, but my argument is that using template literals or tag functions, or anything that exists today in our syntax is the wrong approach, and will bite us back fairly quick. The fact that we are using the api + syntax as the signaling mechanism to provide the right hooks is concerning, specially because this is, as far as I can tell, the first time we are doing that, and you're not allowing all equivalent syntax, but a very particular syntax with a particular api.
Worth noting, fromLiteral seems to be both easy in implementation and solves a real problem for web platform, facilitating migrating to safe APIs.
@koto btw to provide some clarity my "issue" with fromLiteral
is purely around the implementation "difficulties", I personally believe it to be a valuable aspect of the spec and if not included in v1 should be a fast follow up.
I have a prototype for WebKit (https://github.com/webkit/webkit/compare/lukewarlow:from-literal) which at first glance appears to work but it's using a different implementation not based on TemplateMap. This is for 2 reasons efficiency (as specced this isn't an efficient check) and ease of implementation (the TemplateMap slot within JSC isn't necessarily easy to access within WebCore).
My implementation is largely based on https://tc39.es/proposal-array-is-template-object/ but without exposing the IsTemplateObject method itself.
Provided this implementation is identical to Chromium's from an end user code perspective (stuff such as cross realm checks) AND it's implementable in Gecko, I'd be comfortable keeping fromLiteral
in the spec.
I've done an initial check (passing the tagged template parameter from an iframe through postMessage and it fails to work in both Chromium and WebKit, I'm unsure if there's other edge cases where this would be an observably different implementation.
@mbrodesser-Igalia can you speak to whether this approach will work in Gecko?
To provide an update I've done some testing of cross-realm behaviour and it is indeed different to chromiums.
Specifically console.error('TrustedHTML.fromLiteral doesnt throw an error as expected');
logs an error in my WebKit build but NOT in chromium.
This is an observable difference, I'm not sure how bad the difference is for our use case.
One alternative idea was rather than doing array.[[IsTemplateObject]] do array.[[TemplateObjectOriginalRealm]] and do a realm check too. Array.isArray provides a same realm check but I'm not sure how that works in C++ land.
const iframe = document.createElement('iframe');
iframe.srcdoc = `
<script>
window.templateArray = (x => x)\`\`;
</script>
`;
iframe.onload = () => {
const { templateArray, TrustedHTML: TrustedHTMLFromRealm } = iframe.contentWindow;
if (Array.isArray(templateArray) && !(templateArray instanceof Array)) {
console.log('templateArray is an array from a different realm');
} else {
throw new Error('templateArray is not an array from a different realm');
}
try
{
TrustedHTML.fromLiteral(templateArray);
console.error('TrustedHTML.fromLiteral doesnt throw an error as expected');
} catch (e) {
console.log('TrustedHTML.fromLiteral throws an error as expected');
}
try
{
TrustedHTMLFromRealm.fromLiteral(templateArray);
console.log('TrustedHTMLFromRealm.fromLiteral doesnt throw an error as expected');
} catch (e) {
throw new Error('TrustedHTMLFromRealm.fromLiteral throws an error');
}
}
document.body.appendChild(iframe);
Worth noting, fromLiteral seems to be both easy in implementation and solves a real problem for web platform, facilitating migrating to safe APIs.
@koto btw to provide some clarity my "issue" with
fromLiteral
is purely around the implementation "difficulties", I personally [...] @mbrodesser-Igalia can you speak to whether this approach will work in Gecko?
Don't know yet.
Should we be working on Array.isTemplateLiteral in TC39? It isn’t stalled due to any particular reason, just no one is pushing it forward. It was kinda blocking on interest from trusted types folks, which there seems to be.
If they're interested, they should push it forward?
@littledan I for one think it would be a good thing to try and progress.
Array.isTemplateLiteral did not move forward in TC39 due to concerns raised by Mozilla, as expressed by @dminor. I'm honestly neutral on whether we include the JavaScript-level API, and mostly wanted to unblock fromLiteral. I'd be interested to hear more about Mozilla's views here.
Array.isTemplateLiteral did not move forward in TC39 due to concerns raised by Mozilla, as expressed by @dminor. I'm honestly neutral on whether we include the JavaScript-level API, and mostly wanted to unblock fromLiteral. I'd be interested to hear more about Mozilla's views here.
Although we agree it may be useful to have an abstraction like Array.isTemplateLiteral
in JavaScript, it's not clear to us if it is the correct abstraction to use or not, and it's not currently a priority for us to investigate further.
I think it would be best to not consider Array.isTemplateLiteral
a blocker or prerequisite for fromLiteral
, as there seem to be ways ahead without it.
We'd need the underlying mechanics that isTemplateLiteral used. But as you say we don't need the actual exposed JS function for our use case. Would it be possible to add those without any exposed use for Trusted Types to build off of? Or would we need to do some degree of patching in this spec?
While the hypothetical use case for
fromLiteral
is explained it's unclear if that actually translates into real world usage of this API.It also suffers from implementability issues,
the underlying TemplateMap functionality isn't implemented inside of JavaScriptCoreedit: it is implemented in JSC but SpiderMonkey apparently has an alternative mechanism. It never shipped inside of Chromium despite having an intent to ship and this seems to be as a result of technical difficulties in the implementation. I imagine similar issues may exist in Gecko.There's also no way to properly polyfill this without exposing a potential bypass (no userland way to detect a true template literal).
With all this in mind it would make sense to defer this to v2 of the spec, that way there's more time for something like https://tc39.es/proposal-array-is-template-object/ to progress to make both polyfilling and implementation easier.
cc @mbrodesser-Igalia @bkardell