endojs / Jessie

Tiny subset of JavaScript for ocap-safe universal mobile code
Apache License 2.0

`immunize` as a Jessie endowment #27

Closed michaelfig closed 5 years ago

michaelfig commented 5 years ago

Hi, especially to @erights,

This issue is to track discussion around the use of an immunize endowment in Jessie. Please refer to my implementation of it in a Jessica branch, which begins with:

// Recursively freeze the root, a la harden.  If it is a function
// or contains a reachable property that is a function, that
// function will be replaced by a memoized hardened wrapper that
// immunizes its return value.
//
// A baroque Proxy or frozen object cannot be immunized, but will still be
// hardened.  These are objects that cannot possibly contain mutable Jessie
// objects (since all Jessie objects have been immunized before export), so
// this incompleteness does not compromise Jessie.
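
For illustration only, here is a minimal sketch of the behaviour that comment describes: deep-freeze the root, and swap every reachable function for a memoized, frozen wrapper that immunizes its return value. The names `wrap`, `wrapperFor`, and `isWrapper` are hypothetical, accessors and Proxies are ignored, and this is not the code from the Jessica branch.

```javascript
// Hypothetical sketch of the behaviour described in the comment above.
// It is not the code from the Jessica branch.
const wrapperFor = new WeakMap(); // original function -> its hardened wrapper
const isWrapper = new WeakSet();  // wrappers created by this sketch

function wrap(fn) {
  if (isWrapper.has(fn)) return fn; // already immunized
  let wrapper = wrapperFor.get(fn);
  if (wrapper === undefined) {
    // The wrapper immunizes whatever the original function returns.
    wrapper = function (...args) {
      return immunize(Reflect.apply(fn, this, args));
    };
    Object.freeze(wrapper.prototype);
    Object.freeze(wrapper);
    wrapperFor.set(fn, wrapper);
    isWrapper.add(wrapper);
  }
  return wrapper;
}

function immunize(root, seen = new WeakSet()) {
  if (typeof root === 'function') return wrap(root);
  if (typeof root !== 'object' || root === null) return root; // primitives
  if (seen.has(root)) return root; // cycle or shared node
  seen.add(root);
  for (const key of Reflect.ownKeys(root)) {
    const desc = Object.getOwnPropertyDescriptor(root, key);
    if (!('value' in desc)) continue; // accessors are skipped in this sketch
    const safe = immunize(desc.value, seen);
    // Replace reachable functions in place when the property allows it.
    if (safe !== desc.value && desc.writable) root[key] = safe;
  }
  return Object.freeze(root);
}

const makeCounter = immunize(() => {
  let count = 0;
  return { next() { count += 1; return count; } };
});
const counter = makeCounter();
console.log(Object.isFrozen(counter)); // true
console.log(counter.next()); // 1
```

In this sketch an already-frozen input keeps its original function properties (they cannot be replaced), but everything reachable from it is still frozen, which mirrors the incompleteness the comment mentions.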

I am proposing that all module-level bindings in Jessie must be const bindings to pure (i.e. no side-effect) expressions that are wrapped in immunize(expr) if they cannot be syntactically shown to be a plain identifier, number, string, boolean, or null.

const MY_NUMBER = 123; // Valid: is a number.
const MY_OTHER_NUMBER = MY_NUMBER; // Valid: plain identifier.
const MY_STRINGIFY = immunize({toString() { return 'stringy'; }}); // Pure object, needs immunize.
const makeFoo = immunize(() => { // Function, still a pure expression, but needs immunize.
  let counter = 0;
  return function foo() {
    counter += 1;
    return counter;
  }
});
export default makeFoo; // Valid: plain identifier.

As part of this proposal, Object.freeze and harden would be removed from the Jessie whitelist, as they would provide a way of bypassing the above caveat (this incompleteness does not compromise Jessie).

This proposal changes the flavour of Jessie. Rather than more subtle static checking rules, immunize provides blanket coverage for all the cases where a mutable object can escape the module that defined it. I strongly recommend you have a look at the actual immunize implementation, and the flavour of a complex module written to use it, which you can compare side-by-side with the harden-using version.

I do have some future directions if this proposal is accepted, namely reintroducing named exports to Jessie (which are immutable and side-effect free by the above rules, and so just produce some added sugar for implementing modules rather than destructuring their default export). Also, there's nothing preventing a future revision of Jessie from relaxing this constraint on Object.freeze and classic harden.

Questions and comments are appreciated, Michael.

michaelfig commented 5 years ago

Also, I encourage you to have a look at the Jessie grammar changes needed to support these rules. They are simple and concise, leveraging the grammar to impose the static checks without further processing.

[I do, however, need to clean up allowing simple identifiers as a valid immunizedExpr, since they all necessarily have been defined previously in the module.]

michaelfig commented 5 years ago

@erights, in make-hardener #27, you posed an example:

function foo() {}
const dag = { foo, foo };
// dag.foo === dag.foo

immunize(dag);
// dag.foo !== dag.foo

I don't know what you meant by the syntax { foo, foo }. In Jessie, that evaluates to a single object literal {foo: foo}. Can you elaborate on what you mean by that example?

At any rate, the closest I can come to being like your example in my proposed restricted Jessie syntax is:

const foo = immunize(() => {});
const dag = immunize([ foo, foo ]);
// dag[0] === dag[1] === foo

(with the current implementation).

Thanks, Michael.

dckc commented 5 years ago

An example you gave earlier reminded me of the errors I repeatedly get when trying to make sure all the exports of my Monte modules are DeepFrozen:

function bar() {
  return bar[0]; // mutable state stored on the function object itself
}
const foo = harden(() => {
  bar[0]++;
  return bar();
});
export default harden(foo);

Monte's DeepFrozen is specified as:

The specific property proven by DeepFrozen: For any DeepFrozen object, all bindings referenced by the object are also DeepFrozen.

I'm not sure how / whether bindings translate to JS, but function bar would need a DeepFrozen guard, at which point any use of mutable bindings wouldn't pass the static check.

We'd get this diagnostic, which, as I say, is all too familiar to me:

        if not deepFrozenSupersetOf(guard):
            errors.append(u'"%s" in the lexical scope of %s does not have a '
                          u'guard implying DeepFrozen, but %s' %
                          (name, audition.fqn, toString(guard)))

https://github.com/monte-language/typhon/blob/master/typhon/objects/auditors.py#L339-L342

Are you familiar with http://wiki.erights.org/wiki/DeepFrozen ? I'm not sure how much of the history of E you have followed.

Corbin gave a talk about Monte modules at OCAP 2017. The recording is about 30min.

michaelfig commented 5 years ago

Thanks for the feedback @dckc!

As for your example, Jessie is a chance to be even more restrictive. IMO, all module-level bindings need to be immunized, and nothing else should be. The Jessie grammar would likely enforce this, since it is a mistake to use immunize elsewhere or to redefine it within Jessie.

DeepFrozen doesn't seem to have the same fundamental requirement as immunize: it seems to me that it says nothing about the return values of DeepFrozen functions.

dckc commented 5 years ago

That's right: a DeepFrozen function can create objects with mutable state etc. Is that a problem? I suppose I missed something.

michaelfig commented 5 years ago

One idea behind Jessie is that directly mutable state (i.e. unhardened objects) is not permitted to escape the module in which it is defined. You can still have hardened methods that mutate closed-over state (as in foo() in the top comment).

dckc commented 5 years ago

Mutable state isn't exportable, sure. But what's wrong with exporting functions that create objects-as-closures?

e.g. the hello-world example from many of MarkM's talks...

function counter() {
  let current = 0;
  return harden({
    increment() { current += 1; return current; },
    decrement() { current -= 1; return current; }
  });
}

michaelfig commented 5 years ago

Objects-as-closures are great! As in MarkM's example, the object must be hardened before it can be returned from the counter() function, or else it is open to tampering from the module's importer (i.e. the less trusted Jessie host environment, such as SES or plain old JavaScript).

immunize at the module level provides a systematic way of doing this hardening regardless of how deeply nested the returned object is.

erights commented 5 years ago

At https://github.com/Agoric/make-hardener/pull/27#discussion_r267156553 I wrote

function foo() {}
const dag = { foo, foo };
// dag.foo === dag.foo

immunize(dag);
// dag.foo !== dag.foo

At https://github.com/Agoric/Jessie/issues/27#issuecomment-474678060 @michaelfig wrote

I don't know what you meant by the syntax { foo, foo }.

Quite right. My code is confused. What I meant was something more like:

function foo() {}
const dag = { foo: foo, bar: foo };
// dag.foo === dag.bar

immunize(dag);
// dag.foo !== dag.bar

michaelfig commented 5 years ago

The wrap() function implementation in the immune code I referred to keeps a memo of the values we wrap so that it returns the same wrapper each time. In that case, dag.foo === dag.bar even after immunizing.
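
As a rough illustration of that memo (a hypothetical sketch, not the wrap() from the branch), a WeakMap keyed on the original function is enough to make two properties that shared one function share one wrapper. Immunizing the return value is left out here to keep the focus on identity.

```javascript
// Hypothetical sketch of the memo; not the code from the Jessica branch.
const memo = new WeakMap(); // original function -> its one wrapper

function wrap(fn) {
  let wrapper = memo.get(fn);
  if (wrapper === undefined) {
    // Return-value immunizing is omitted; only wrapper identity matters here.
    wrapper = Object.freeze((...args) => fn(...args));
    memo.set(fn, wrapper);
  }
  return wrapper;
}

function foo() {}
const dag = { foo: foo, bar: foo };

// Replace each function property in place, then freeze the container.
for (const key of Object.keys(dag)) {
  dag[key] = wrap(dag[key]);
}
Object.freeze(dag);

console.log(dag.foo === dag.bar); // true: both point at the same wrapper
console.log(dag.foo === foo);     // false: the original was replaced
```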

dckc commented 5 years ago

So... this is a proposal to relax static constraints around harden(...) and replace them with runtime memoization? (almost like a membrane)

That doesn't seem like a good trade-off, to me.

erights commented 5 years ago

To clarify a bit more: Jessie disallows mutable properties on Jessie-created objects that may have escaped reliable static tracking. Jessie certainly does allow mutable state: let variables are assignable, and can be captured by closures. Jessie includes Map, Set, WeakMap, WeakSet, Promise, all of whose instances have hidden mutable state. Jessie code also must defensively assume that objects it gets from outside may have come from SES, and therefore may have mutable properties.

The harden function does ensure that the object it is applied to, and all objects reachable from it only by own property walk have no mutable properties. But this does not include, for example, objects returned from hardened functions. immunize is intended to address this, making more of the API surface hardened, such as the values returned by hardened functions.
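
To make the gap concrete, here is a small sketch that uses a plain deep freeze as a stand-in for harden (an assumption; it is not the make-hardener implementation). The hardened function is frozen, yet the object it returns is a fresh, mutable one unless the author hardens it explicitly.

```javascript
// Stand-in for harden: freeze everything reachable by own-property walk.
// This is an assumption for illustration, not make-hardener itself.
function harden(root, seen = new WeakSet()) {
  const isObject =
    (typeof root === 'object' && root !== null) || typeof root === 'function';
  if (!isObject || seen.has(root)) return root;
  seen.add(root);
  for (const key of Reflect.ownKeys(root)) {
    const desc = Object.getOwnPropertyDescriptor(root, key);
    if ('value' in desc) harden(desc.value, seen);
  }
  return Object.freeze(root);
}

// The function is hardened, but its return value is not.
const makePoint = harden(() => ({ x: 1, y: 2 }));
console.log(Object.isFrozen(makePoint));   // true
console.log(Object.isFrozen(makePoint())); // false

// With plain harden, the author has to harden the result by hand.
const makeHardPoint = harden(() => harden({ x: 1, y: 2 }));
console.log(Object.isFrozen(makeHardPoint())); // true
```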

dckc commented 5 years ago

So the approach of just using harden(...) in the right places is infeasible? Jessie has to have a runtime memoization overhead for every call to every exported function?

michaelfig commented 5 years ago

The only memo overhead is when a return value itself returns a new closure somewhere in its graph. There is no additional overhead when the return value has no new (unwrapped) closures reachable from it.

erights commented 5 years ago

So the approach of just using harden(...) in the right places is infeasible?

My own position at this point is unchanged from what it was: Essentially harden but not immunize. The programmer needs to manually insert harden calls at the right places. The static verifier needs to enforce that the programmer has done so. I definitely think this approach is feasible.

At this point, none of my worries about immunize are about runtime overhead. As you @dckc observe, @michaelfig's approach is membrane-like. However, membranes intermediate, which I find much easier to think about than immunize, which does surgery in place. I find immunize disturbing and remain skeptical. But that doesn't mean it is wrong. I remain open to arguments in both directions.

dckc commented 5 years ago

Isn't there some cost to determine whether the returned value has any such reachable closures?

Isn't there overhead to checking whether the function has already been called with these inputs? Or is this some usage of "memoization" that I'm not familiar with?

erights commented 5 years ago

Question: Does immunize prevent the leakage of the non-hardened empty object in the following code?

const foo = immunize((bar) => bar({}));

It leaks the empty object by passing it as an argument in a function call, not by returning it. A full membrane would catch this. To be clear, I am not advocating a membrane either, and I don't think either of you are. But it is a clarifying contrast.
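
The leak can be reproduced with a hypothetical return-only variant (`immunizeReturnOnly` and `deepFreeze` are illustrative names, not code from the thread): only what the function returns is hardened, so an object handed to a caller-supplied callback arrives unfrozen.

```javascript
// deepFreeze is a stand-in for harden (an assumption for this sketch).
function deepFreeze(value, seen = new WeakSet()) {
  if (typeof value !== 'object' || value === null || seen.has(value)) {
    return value;
  }
  seen.add(value);
  for (const key of Reflect.ownKeys(value)) {
    const desc = Object.getOwnPropertyDescriptor(value, key);
    if ('value' in desc) deepFreeze(desc.value, seen);
  }
  return Object.freeze(value);
}

// Hypothetical return-only immunize: it hardens the return value only.
const immunizeReturnOnly = (fn) =>
  Object.freeze((...args) => deepFreeze(fn(...args)));

const foo = immunizeReturnOnly((bar) => bar({}));

let leaked;
const result = foo((obj) => {
  leaked = obj; // the caller-supplied callback receives the raw object
  return { done: true };
});

console.log(Object.isFrozen(result)); // true: the return path is covered
console.log(Object.isFrozen(leaked)); // false: the argument path is not
```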

erights commented 5 years ago

I will toss into the mix another approach which I have been consciously avoiding: Rewrite Jessie-without-explicit-harden (need a name for this) to Jessie, by inserting the harden calls that Jessie requires.

Given that mutable properties are effectively absent from standalone Jessie, this rewrite would be a noop to standalone Jessie semantics. A program written in the language, if run without first rewriting in a SES environment, would no longer be defensive. In order to attribute to the pre-rewrite program a defensive semantics, the language it is written in (needs a name) is not a subset of Jessie or SES. Rather, Jessie itself is (what I have elsewhere called) a fail-stop subset of this language. Any Jessie execution that does not attempt to mutate a non-mutable property would run the same way in this other language.

michaelfig commented 5 years ago

bond() is the other function I've proposed to mesh with Jessie static checking. It already generates wrappers for functions of unknown origin, so why not extend it to immunize the function's arguments before calling the original function? To extend the metaphor, a bond()ed function is one which cannot abuse a captured this, nor abuse any of its arguments.

So, your example becomes:

const foo = immunize((bar) => bond(bar)({}));

which does have the behaviour you'd hope for.

Static analysis for situations that require bond cannot be avoided or made syntactic without an extremely unpleasant programming experience. And I would argue it's the untrusted function you want to mark, not its arguments. I've implemented the argument-immunizing bond in the Jessica immunize branch.

I would challenge you to write the following without immunize() and an argument-immunizing bond() (i.e. just with harden() calls):

const makeJessie = immunize((peg) => {
   const { SKIP } = peg;
   return bond(peg)`... ${a => ['module', a]} ... hundreds of other holes ... ${SKIP} ...`;
});

export default makeJessie;

Without immunize(), we rely on peg not creating objects with mutable state, which might not be the case if it originates in SES and not Jessie. Without bond() immunizing its wrapped function's arguments, we have to wrap every single expression in a harden() call.

[I'm only suggesting immunize() and immunizing bond() because in Jessica, I have a growing body of Jessie code that has revealed pain points with explicit harden().]

michaelfig commented 5 years ago

Now I have something that seems workable. I'm using the TypeScript Compiler API to compile and rewrite *.mjs.ts sources (TypeScript-compatible) into valid Jessie code in *.mjs. I think I'll call it Tessie, and the implementation is beginning here.

Tessie can insert the necessary bond() and immunize() annotations, which will still be verified by Jessica's own static analysis. The pairing of bond() and immunize() in vanilla Jessie is still a good one, in my mind, so I'll leave Tessie at that until we get a clearer direction.

erights commented 5 years ago

I like Tessie!!

(Your "is beginning here" links to https://github.com/michaelfig/jessica/blob/tessie/lang/nodejs/tessc.ts which seems to be a broken link.)

Since TypeScript to JavaScript is a rewrite anyway, having Tessie be approximately the Jessie subset of TypeScript, we can add new failure conditions to the rewrite, and have Tessie be a fail-stop subset of TypeScript.

By "adding new failure conditions", I mean that code in the fail-stop-subset language which does not provoke these failure conditions will run the same way in the superset language. This is only fail-free functionality preserving. It is not semantics preserving or enforcement preserving.

Rewriting to insert freezes, whether through harden or immunize, does not affect the execution of programs that don't try to mutate those properties. Rewriting to insert bond isn't a fail-stop transformation, but perhaps a safe variant of it would be, or would be close enough.

Dean had an interesting suggestion: Perhaps Tessie could be soundly typed, by having the rewriter insert dynamic type checks. Fits. A program that does not fail these dynamic checks would execute the same way under TypeScript.

michaelfig commented 5 years ago

Sorry for the broken link, I edited it above to be: tessc.ts.

michaelfig commented 5 years ago

Here is a video with @erights speaking about immunize() and Tessie. There is just one point Mark raises that I didn't fully understand yet: the way he describes it, immunize(fn) also immunizes fn's incoming arguments.

At first I thought that would be undesirable (and was proposing an addition to bond() for this purpose), but now I agree. Since immunize() is only being called on exported values, it shouldn't cause problems with the use of mutable objects that are internal to the module.

I went a few steps further:

  1. As before, immunize(root) always results in harden(root).
  2. As before, immunize(obj) for a value with properties surgically replaces obj's properties in place with their immunized versions.
  3. As before, immunize(fn) for a function fn creates a wrapper that:
     a. As before, immunizes fn's return value before returning it.
     b. Now immunizes fn's arguments before passing them to fn.
     c. Now immunizes the captured this value before passing it to fn.
     d. Now immunizes fn's thrown exception before rethrowing it.
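
The steps above can be sketched roughly as follows. This is a hypothetical illustration, not the code on Jessica's master branch: `wrap`, `wrapperFor`, and `isWrapper` are made-up names, and accessors and Proxies are ignored.

```javascript
// Hypothetical sketch of steps 1-3; not the code on Jessica's master branch.
const wrapperFor = new WeakMap(); // original function -> its wrapper
const isWrapper = new WeakSet();  // wrappers created by this sketch

function immunize(root, seen = new WeakSet()) {
  if (typeof root === 'function') return wrap(root);
  if (typeof root !== 'object' || root === null) return root;
  if (seen.has(root)) return root;
  seen.add(root);
  for (const key of Reflect.ownKeys(root)) {
    const desc = Object.getOwnPropertyDescriptor(root, key);
    if (!('value' in desc)) continue; // accessors are skipped in this sketch
    const safe = immunize(desc.value, seen);
    if (safe !== desc.value && desc.writable) root[key] = safe; // step 2
  }
  return Object.freeze(root); // step 1
}

function wrap(fn) {
  'use strict'; // a plain call then sees `this` as undefined, not the global
  if (isWrapper.has(fn)) return fn;
  let wrapper = wrapperFor.get(fn);
  if (wrapper === undefined) {
    wrapper = function (...args) {
      const safeThis = immunize(this);                   // 3c
      const safeArgs = args.map((arg) => immunize(arg)); // 3b
      let result;
      try {
        result = Reflect.apply(fn, safeThis, safeArgs);
      } catch (err) {
        throw immunize(err);                             // 3d
      }
      return immunize(result);                           // 3a
    };
    Object.freeze(wrapper.prototype);
    Object.freeze(wrapper);
    wrapperFor.set(fn, wrapper);
    isWrapper.add(wrapper);
  }
  return wrapper;
}

const api = immunize({
  record(entry) {
    return { frozenOnArrival: Object.isFrozen(entry) };
  },
  fail() {
    const err = new Error('nope');
    err.details = { retry: true };
    throw err;
  },
});

console.log(api.record({ id: 1 }).frozenOnArrival); // true
try {
  api.fail();
} catch (err) {
  console.log(Object.isFrozen(err.details)); // true
}
```

Note that immunizing arguments freezes the caller's objects in place, which is the "surgery in place" behaviour discussed earlier in the thread.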

With 3c, I would then argue that bond() becomes unnecessary, as the only possible captured this-values are guaranteed to be immunized!

I have implemented this in Jessica's master branch.

michaelfig commented 5 years ago

With the above implementation of immunize(), there is no need for static analysis of Jessie anymore, since the requirement for immunize()ing all module-level declarations can be syntactically enforced (as it is in Jessica's master branch).

michaelfig commented 5 years ago

Relative to this, I believe the only remaining gap in the immunize() proposal is the importing of non-Jessie modules (i.e. whose exports may not have been immunize()d). @erights, is this a concern for you?

If this is necessary, I'd like to enforce the requirement to immunize imports in Jessie's syntax. (I would also build a simple rewrite to do so in Tessie, so that Tessie could just use import as usual, without any restrictions.)

We could enforce immunizing imported identifiers with something like:

// Identifiers beginning with `$i_` are the only ones allowed to be imported.
import $i_someModule from './some-module.mjs';
import {foo as $i_foo} from './foo.mjs'; // Legal, provided #28 is accepted
import someModule from './some-module.mjs'; // INVALID: Imported name `someModule` must be `$i_someModule`.
import {foo} from './foo.mjs'; // INVALID: Imported name `foo` must be `foo as $i_foo`.

// `$i_` identifiers are allowed to appear in arguments to `immunize()`.
const [someModule, foo, other] = immunize([$i_someModule, $i_foo, {a: 123, b: $i_foo}]);

// `$i_` identifiers are not allowed in other expressions.
const myFunc = immunize(() => {
  $i_someModule('hello'); // INVALID: "$i_someModule" may only appear as an argument to `immunize($i_someModule)`.
  someModule('hello'); // Acceptable
  immunize($i_someModule)('hello'); // Also acceptable.
});

michaelfig commented 5 years ago

More broken links... I've switched to Babel for the Tessie->Jessie rewrites, which can be found at https://github.com/michaelfig/jessica/blob/master/lang/nodejs/tessc.js

michaelfig commented 5 years ago

I'm withdrawing this proposal, as the insulate() solution discussed in https://github.com/Agoric/SES/issues/103 is vastly superior.

I will create a proposal for insulate() when jessica has bootstrapped.

michaelfig commented 5 years ago

#29 is the focal point for discussions of the Jessie standard library.