w3c / trusted-types

A browser API to prevent DOM-Based Cross Site Scripting in modern web applications.
https://w3c.github.io/trusted-types/dist/spec/
Other
606 stars 74 forks source link

Figure out Function constructor + TrustedScript #174

Closed koto closed 5 years ago

koto commented 5 years ago

First n-1 parameters to Function constructor form the argument lists, but still have the capability of executing code due to a weird way how a function is constructed (string concatenation, essentially):

https://www.ecma-international.org/ecma-262/9.0/index.html#sec-createdynamicfunction

It would be best if all of the function parameters were passed to a single (default) policy createScript function, instead of separate invocations for each parameter, but that means the createScript function signature changes (argument list and the meaning of the return value).

cc @otherdaniel @mikesamuel

otherdaniel commented 5 years ago

After reading the spec...

I think my original proposal (checking only the function body) is simply not sane.

Likewise, I think checking parameters separately is effectively useless, since it'd be super easy to 'hide' things by splitting functionality between different bits.

My current thinking is to pass the entire array of arguments to the callback, which could work.

However, after having re-read the spec on this... maybe all of this is too complicated. The spec specifies how to assemble all of the args into a single source string, which means just not special casing this at all would seem to work just fine:

koto commented 5 years ago

I.e., do the trusted type check at or around step 18 of CreateDynamicFunction, with possibly some custom logic that if it's a single argument which is a 'safe' type, to then use that instead.

So, have a single parameter type check early in the algorithm, and if it passes, the stringified function is interpreted as bodyText, with the assembly steps skipped?

They can just assemble the source string themselves and pass that through a policy.

That would require changing the Function constructor to support single-parameter function body with arguments calls (but since it's typed, that should work).

There is some JS spec machinery here to build, so deferring to Mike on how easy it is to add to the spec. If we have @@evalable, perhaps it's an easy task in the end.

mikesamuel commented 5 years ago

We can call the default policy with the fully assembled string. (I had previously assumed that the source string assembly was an implementation detail one wouldn't expose to the developer, since every browser might to this differently, but in reality this is all fully spec-ed.)

How does this work when body is a TrustedScript and the arguments are strings?

Yeah, I think there are two options:

  1. pass the concatenated argument list (P) as one string and the body as a pre-stringified value.
  2. pass the parsed form of P as a |FormalArgumentList| AST and the body as a pre-stringified value. and then define a "safe" argument list.

Neither of these need to affect the order of stringification in CreateDynamicFunction, but the second might affect whether something like Function('a a', '') results in a TypeError or a SyntaxError which might raise web-compat issues.

I don't think this affects generality though. Just how easy it is to express complicated list arguments. Obviously, any call that passes arguments like

new Function(x, policy.createScript(body))

can be rewritten to one that does not

TrustedTypes.isScript(body) && new Function(
    policy.createScript(`return function (${ x }) { ${ body } }`)

which means that different tolerances for argument safety can be specified in user code.

@otherdaniel Should we go through the dynamic semantics of FormalArgumentList and figure out what's a good tradeoff?

otherdaniel commented 5 years ago

@koto @mikesamuel Honestly, I feel like I'm out of my depth, here, since every time I re-read the spec I discover some reason why my previous argument doesn't quite hold...

@koto: "have a single parameter type check early in the algorithm" Yeah. I'm no longer sure that's best, though.

@mikesamuel: "Should we go through the dynamic semantics [...]" Yes, we should. I'd prefer to have a list of options, though, so we can present that to V8 team for fact-checking. Or maybe just get Toon in there.


1 My idea so far has been to basically not support Trusted Types for any case other than the single argument case, and to then tell the developer(s) that if they must use the Function constructor, then they must build a single string. I (now) see that they can only do that by indirectly returning a function, e.g.:

Function('a', 'b', 'return a+b;') => Function("return function(a,b) { return a+b; }");

That is kind of awkward. Also, if that was our choice, it's interesting what the 'default' policy should receive as an argument.


2 I'd be slightly less awkward if we could reduce everything into the two-argument case, because the spec distribguishes between the combined parameter list P and the bodyText. Also, P is trivially constructable from the arguments. I don't see any good way to retrofit that into our current interfaces. (But: See #4.)


3 Maybe one could go back to only checking the body, and demand as a pre-condition for trusted types that IsSimpleParameterList is true (as demanded for 'use strict;'). However, I fear this would be too restrictive for authors trying to move to TT from a legacy codebase. (It's really the same issue as #1 above, since it'd force people to use the indirect function definition thing.


4: A better (bug "bigger") solution might be have a separate 'trusted type' just for this case. And in this case, one that doesn't wrap a string, but would instead wrap an array of any. (Or: Two separate strings, to pick up case #2 again.)

The Function constructor could then (via the embedder) check for that type and - if found - unwrap the argument list from the trusted type and proceed as normal. This might be the cleanest to specify, but arguably more work to implement.

E.g., in addition to createHTML, createURL, createScript, createScriptURL, there's a createFunction(...args), which returns a TrustedFunction. TrustedFunction isn't string-ifiable, and the Function constructor is the only thing reading it.

mikesamuel commented 5 years ago

Re #2 do the below fit the two argument case?

new Function('a, b, c', 'return a+b+c')
new Function(myPolicy.createScript('a, b, c'), myPolicy.createScript('return a+b+c'))

I like the idea of using IsSimpleParameterList since that'll be more palatable for the committee.

Does the following flow fit #2?

  1. If the count of args is 0 or 1, let P = the empty string.
  2. Else if the count of args is 2, let P = the zeroth argument. Do not stringify.
  3. Else, let P = the result of stringifying args[:-1] and joining on ","
  4. Let body = args[args.length - 1].
  5. Pass P and body to the host callout.

The host callout does:

  1. if P is not a TrustedScript
    1. Let P be ToString(P)
    2. Let fpl = the result of parsing P as a FormalParameterList.
    3. If there as a SyntaxError, throw
    4. If ! IsSimpleParameterList(fpl)
      1. TODO. throw or apply default policy as createScript
  2. body checks
mikesamuel commented 5 years ago

E.g., in addition to createHTML, createURL, createScript, createScriptURL, there's a createFunction(...args), which returns a TrustedFunction. TrustedFunction isn't string-ifiable, and the Function constructor is the only thing reading it.

Would a single suffice work for all the Function constructors: Function, AsyncFunction, etc. ?

mikesamuel commented 5 years ago

@koto As discussed, here's a user-library class that can stand in for all arguments:

class C {
  *[Symbol.iterator]() {
    for (let a of [...this.args, this.body]) { yield a }
  }

  constructor(...args) {
    this.args = args.slice(0, args.length - 1); this.body = args[args.length - 1] || '';
  }

  toString() {
    return `function (${ this.args.join(', ') }\n) { ${ this.body } }`;
  }
}
//undefined
const c = new C('a', 'b, c', 'return [a, b, c]')
//undefined
Array.from(c)
// (3) ["a", "b, c", "return [a, b, c]"]
new Function(...c)
// ƒ anonymous(a,b, c
// ) {
// return [a, b, c]
// }
koto commented 5 years ago

That class could serve as a basis for the new type (TrustedFunctionConstructorArguments :) ). It would wrap over multiple strings essentially, last of which is a function body.

With that iterable type instance in place, one would call a Function constructor, spreading the arguments.

The default policy is also accounted for - at the right place, its createFunctionConstructorArguments is being called with the original string (or mixed string and typed) arguments, returning the iterator (or an array of TrustedScripts, whichever is easier to spec.

otherdaniel commented 5 years ago

"Does the following flow fit #2?" Yes, that's the idea. I'm not sure if specifying works that way. I guess we'd need an algorithm that does something like:

And in CreateDynamicFunction, do someting like:

On second thought, I do think a new 'trusted function' type that wraps all the arguments is a more understandable way. In that case, there could be one check for the trusted type early on, and then the entire procedure could continue undisturbed.

mikesamuel commented 5 years ago

Is this a fair statement of requirements:

  1. We want new Function(...simpleParameterStrings, trustedBody) to work
  2. We want new Function(...simpleParameterStrings, stringBody) to work out of the box with a default policy callout for stringBody.
  3. We want a way for clients willing to change code to migrate new Function(...complexParameters, trustedBody)
  4. It'd be nice if new Function(...mixOfStringAndTrustedParameters, anyBody) worked but not required.

and similarly for {Async,}{Generator,}Function?

mikesamuel commented 5 years ago

Is the idea that

  1. CreateDynamicFunction passes arguments to the host callout before any stringification, concatenation, or FormalArgumentList validity checks.
  2. The host callout returns an argument list
  3. The host callout has 2 new normative notes:
    1. If Host... stringifies argument[i] it MUST stringify all of arguments[0:i+1] in left-to-right order
    2. If Host... receives one argument which is to be parsed as a ScriptBody, it MUST return an argument list of length 1. (To not require eval to do complicated error recovery after the host callout).
  4. After the host callout, CreateDynamicFunction stringifies the returned arguments and proceeds.
  5. Browser's implementation of the host callout can use the default policy to convert args to a TrustedFunctionBitsAndBobs if arguments is not [...simpleParameters, TrustedScript] or [TrustedFunctionBitsAndBobs].
koto commented 5 years ago

I think that algorithm augmentation would work. Thanks Daniel and Mike for noticing and hashing that out.

On Thu, May 30, 2019, 10:54 Mike Samuel notifications@github.com wrote:

Is the idea that

  1. CreateDynamicFunction passes arguments to the host callout before any stringification, concatenation, or FormalArgumentList validity checks.
  2. The host callout returns an argument list
  3. The host callout has 2 new normative notes:
    1. If Host... stringifies argument[i] it MUST stringify all of arguments[0:i+1] in left-to-right order
    2. If Host... receives one argument which is to be parsed as a ScriptBody, it MUST return an argument list of length 1. (To not require eval to do complicated error recovery after the host callout).
  4. After the host callout, CreateDynamicFunction stringifies the returned arguments and proceeds.
  5. Browser's implementation of the host callout can use the default policy to convert args to a TrustedFunctionBitsAndBobs if arguments is not [...simpleParameters, TrustedScript] or [ TrustedFunctionBitsAndBobs].

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/WICG/trusted-types/issues/174?email_source=notifications&email_token=AAA7JK6UKSFBO6YDETE76NDPX6B2NA5CNFSM4HPIQPUKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWRUWWY#issuecomment-497240923, or mute the thread https://github.com/notifications/unsubscribe-auth/AAA7JK2UAOGKGCGMTTHMAEDPX6B2NANCNFSM4HPIQPUA .

mikesamuel commented 5 years ago

@koto, IIUC, Toon noticed this problem. I missed it entirely.

New proposal text at https://mikesamuel.github.io/proposal-hostensurecancompilestrings-passthru/ and I adjusted the explainer.