endojs / endo

Endo is a distributed secure JavaScript sandbox, based on SES
Apache License 2.0
831 stars 72 forks source link

Evasive transform for apparent HTML comments in quoted strings #1217

Open kriskowal opened 2 years ago

kriskowal commented 2 years ago

bundleSource contains “evasive transforms”, that is, it transforms JavaScript modules into JavaScript modules that can get past the SES censorship regular expressions. The soundness of this design is that the SES shim runtime can’t afford to carry a full JavaScript parser, and assumes significantly less risk by not including a JS parser in its trusted compute base (TCB). However, the build step for an archive or bundle can afford to perform transformations that allow code to evade the censor with full awareness of the syntax tree.

One additional transform we could include would allow HTML comments to appear inside JavaScript strings, as described by @naugtur in https://github.com/endojs/endo/issues/1207.

"<!--" is safe, but censored.

"<!" + "--" is equivalent, safe, and not censored.

The same can be achieved in template literals, as in:

const before = `<!--`;
const after = `${'<'}!--`;
erights commented 2 years ago

All looks good. But beware tagged template literals. The same transformation for them would not preserve meaning.

naugtur commented 2 years ago

Is there a mitigation that'd work with tagged template literals? My previous fix for avoiding string censorship was to replace import( with import\( which is also fine in all strings except tagged template literals. That is - if someone uses the raw string.

gibson042 commented 2 years ago

I don't think so, because escape sequences are observable by the function being called:

(strings => {
  const [raw] = strings.raw, [cooked] = strings;
  console.log(JSON.stringify({raw, cooked}));
  return cooked === raw;
})`<!-- \x3C!--`;
// returns: false
// logs: {"raw":"<!-- \\x3C!--","cooked":"<!-- <!--"}
gibson042 commented 2 years ago

But it is possible to translate a tagged template into a call with arguments by constructing an array of the raw strings, an analogous array of cooked strings, assign the former to the "raw" property of the latter, freeze both, and using it as the first argument and each placeholder expression as a subsequent argument (fn`str\0${expr0}…`fn($ObjectFreeze($ObjectAssign(["str\0", …], {raw: ["str\\0", …]}), expr0, …)).

EDIT: Subject to the caveat that each such first-argument object must be constructed exactly once per tagged template site rather than inline as I have above (e.g., a tagged template in a loop body observably receives exactly the same first argument every time).

erights commented 2 years ago
foo`<!-- ${x}` // original
foo(...RESTORE_DASH`<!${DASH}- ${x}`) // rewritten
gibson042 commented 2 years ago

Hmm, like this (modulo safety around reliable iteration)?

MAPPED_TEMPLATES = new WeakMap();
function RESTORE_DASH(template, ...expressions) {
  let mappedTemplate = $WeakMapPrototypeGet(MAPPED_TEMPLATES, template);
  if (!mappedTemplate) {
    // Map `template` to a replacement array.
    mappedTemplate = [...template];
    const mappedRaw = [...template.raw];

    // Replace each [left, RESTORE_DASH, right] sequence with [left + '-' + right].
    for (let i = expressions.length - 1; i >= 0; i -= 1) {
      if (expressions[i] !== RESTORE_DASH) {
        continue;
      }
      mappedTemplate[i] = mappedTemplate[i] + '-' + mappedTemplate[i + 1];
      mappedRaw[i] = mappedRaw[i] + '-' + mappedRaw[i + 1];
      $ArrayPrototypeSplice(mappedTemplate, i + 1, 1);
      $ArrayPrototypeSplice(mappedRaw, i + 1, 1);
    }

    $ObjectDefineProperties(mappedTemplate, { raw: { value: mappedRaw } });
    harden(mappedTemplate);
    $WeakMapPrototypeSet(MAPPED_TEMPLATES, template, mappedTemplate);
  }

  return [
    mappedTemplate,
    ...$ArrayPrototypeFilter(expressions, expr => expr !== DASH)
  ];
}
erights commented 2 years ago

Yes. Except you should delay putting it into the memo until after the harden, so it doesn’t memo if it errors before then. Also protects against reentrancy in the important direction.

gibson042 commented 2 years ago

I actually considered that, and came to the conclusion that identity-preserving store-once behavior is better than masking failures by overwriting a template rather than exposing an incomplete one. ¯\_(ツ)_/¯

kriskowal commented 2 years ago

Another factor to take into account is that this is an evasive transform that ideally only shifts columns on the line affected and mustn’t change line numbers. If a utility function is necessary, it might be okay to rely on function hoisting and place it at the end of the file.

erights commented 2 years ago

It leaves behind an unhardened and (depending on where the failure is) arbitrarily badly formed template that a second use will retrieve and then use without error. That's worse.

erights commented 2 years ago

I actually considered that, and came to the conclusion that identity-preserving store-once behavior is better than masking failures by overwriting a template rather than exposing an incomplete one. ¯_(ツ)_/¯

What's a scenario where the original attempt terminates with a throw but the identity divergence is observable?

gibson042 commented 2 years ago

What's a scenario where the original attempt terminates with a throw but the identity divergence is observable?

Excellent point. Updated.

If a utility function is necessary, it might be okay to rely on function hoisting and place it at the end of the file.

That can work, but only if the function has a) access to unmodified built-ins, and b) access to its shared-across-invocations map or privilege to create one and retrieve it in a later invocation.

naugtur commented 2 years ago

As for the evasive transform's code itself, it'd be nice to get them out of bundleSource where it's coupled with the surrounding code and offer a bunch of string to string and/or Babel AST evasive transforms to mix and match.

Having trouble reaching those and selecting only the ones I want to use I ended up writing a separate transform here packages/experiment-run-cli/src/sesEvasionTransform.js

kriskowal commented 1 year ago

@naugtur Can I interest you in investigating a solution for this issue? I’m also interested in moving the evasive transforms somewhere more general so they can be used without taking on bundleSource. Maybe another package.

naugtur commented 1 year ago

I'll eventually need those for webpack plugin probably. This seems like routine work. Is there any challenge involved in this other than naming the package? Unless we want to benchmark and optimize performance.

kriskowal commented 1 year ago

Fiddling around with the AST to do a new evasive transform around strings that contain censored patterns is beyond my ken, but probably just routine for you! Otherwise, yes, there’s no challenge factoring out the transforms. No: naming is still hard.

naugtur commented 1 year ago

I may have underestimated, but it should be asking the visitor to go over string nodes and running a split & replace for each. I'm happy to take that work if someone else can set up the package in the monorepo - that's the part I'm less confident I'd do correctly :sweat_smile:

Name proposal: @endo/transforms - not using the word evasive because who knows what else we cram in there.

kriskowal commented 1 year ago

Here is an @endo/transform scaffold https://github.com/endojs/endo/pull/1701

kriskowal commented 1 year ago

@naugtur we now have an empty and unpublished @endo/transforms package on main. Thank you for taking on string splitting. That’s the part I’m not confident I’d do correctly! 😅

naugtur commented 1 year ago

Is there any pressure to get it quickly? I wanted to use it as an opportunity to introduce someone to ast.

kriskowal commented 1 year ago

Is there any pressure to get it quickly? I wanted to use it as an opportunity to introduce someone to ast.

This is not urgent!

erights commented 9 months ago

Note https://npmfs.com/package/readable-stream/4.5.2/lib/internal/streams/operators.js#L430