Open turadg opened 2 years ago
For initialization, one option would be a breaking change to the API, in which the init function changes from (...args) => initialState to (...args) => ({ state, ephemera }). That's probably the most ergonomic to use when ephemera are in play, slightly worse when they are not (e.g. x => ({ x }) becomes x => ({ state: { x } })), but we'd have to change all the callers. I count 17 files (in zoe, ERTP, and run-protocol) which are likely clients.
Another is adding options.initEphemera = (...args) => ephemera, which gets the same arguments as init. I can imagine folks wanting access to the facets and/or state from that function, and we'd need to decide if it's called before or after the main init, both of which make it somewhat awkward.
If we're going for API breaking changes, I'd much rather we go for what I proposed in https://github.com/Agoric/agoric-sdk/issues/5170 in which case we'd simply have something like init: ({state, ephemera}, x, y) => { state.x = x; state.foo = makeStuff(y); ephemera.computedX = compute(x); };
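A minimal sketch of how that init shape might be driven, in plain JavaScript. The driver makeInstance and the stand-ins makeStuff and compute are hypothetical names used only for illustration; none of this is the VOM API.

```javascript
// Hypothetical stand-ins for the helpers named in the proposal.
const makeStuff = y => ({ kind: 'stuff', y });
const compute = x => x * 2;

// Proposed shape: init fills in the records it is handed and returns nothing.
const init = ({ state, ephemera }, x, y) => {
  state.x = x;
  state.foo = makeStuff(y);
  ephemera.computedX = compute(x);
};

// Hypothetical driver: allocates both records, then calls init exactly once.
const makeInstance = (...args) => {
  const state = {};
  const ephemera = {};
  init({ state, ephemera }, ...args);
  return { state, ephemera };
};

const instance = makeInstance(3, 'y-arg');
console.log(instance.state.x, instance.ephemera.computedX);
```

Note that this driver only runs init at creation time, so it says nothing about how ephemera would be rebuilt after an upgrade.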
There is an argument to be made about having a separate initEphemera that is lazily called exactly once before first usage in each version so that the ephemeral data can be reconstructed from durable state when necessary.
IBIS for the design options. Please edit this comment to add your points. Let's not bikeshed on the names yet and stick with "ephemera" for any object that holds ephemera. The string can change once the semantics are figured out. args means the args to the kind constructor and state means what's produced now by initState (@warner notes that initState() returns the initial value for state, but is not the same JS object that will be received as context.state when behavior methods are called).
Requirements:
make() arguments (e.g. to hold something ephemerally)
ephemera in their context (if you challenge this, please start another comment with another IBIS)
vaultDirector passes dependencies in new makeVaultManager call
Cases to handle:
? How should the ephemera object be created?
: initEphemera provided in options as (state, ...args) => E.
1.1 + can be provided as needed when lost
1.1.1 - not without the constructor arguments
: existing initState changed to return ({state, ephemera})
2.1 - breaking change
2.1.1 . mechanical fix and if it's a better API now's the time
: primary init changed to ({state, ephemera}, ...args) => void
3.1 - breaking change
3.1.1 . mechanical fix and if it's a better API now's the time
: finish receives an ephemera to populate
4.1 - shouldn't receive constructor arguments
: ephemeral state dependent on parent state comes in a stub durable object that holds it
6.1 . vaultDirector creates factoryPowers as a durable object and vaultManager holds that in durable state
: factoryPowers: ({ state, self }) => factoryPowersWM.get(self)
7.1 . vaultDirector provides factoryPowersWM to vaultManager
: factoryPowers: ({ state, self }) => provide(factoryPowersWM, self, () => makeFactoryPowers(state));
@Fudco and I walked through the options this afternoon. We found problems with most of the proposals above, and came up with two new ones that we think might work.
The main constraint is that the ephemera needs to be created both in the first version of the vat (at about the same time that the durable object is created, i.e. the vref is allocated), and in the second and subsequent versions of the vat (either when the durable object is first deserialized, or when a method that needs ephemera is first invoked). The first call is associated with a call to the init() function that creates the initial state, but the second is not. So any proposal that attempts to create ephemera from init() is doomed. That takes out the IBIS proposals 2, 3, and 4 (because finish() is called the same number of times as init()).
It also takes out the portion of proposal 1 that passes ...args to an initEphemera function, because those args (the maker args) are not available in the second and subsequent versions (even if they were durable, we wouldn't want to keep them around in durable state across version upgrades: they're initialization args, not state).
Proposal 5 is kinda flipped around, I was there when we came up with it but I can't parse it well enough to consider. Proposal 6 is close to the "open coded" approach that Chip and I were using as a jumping-off point, which I'll continue here. The following pseudo-code is what a userspace author might do on their own, if the VOM didn't provide any better tooling:
function createVaultDirector(VDstuff) {
  // one WeakMap per vat version, keyed by the VaultManager Representative
  const ephemeraWM = new WeakMap();
  function provideEphemera(vm, state) {
    if (!ephemeraWM.has(vm)) {
      const ephemera = createEphemera(VDstuff, vm, state);
      ephemeraWM.set(vm, ephemera);
    }
    return ephemeraWM.get(vm);
  }
  const init = (args) => initialState;
  const behavior = {
    doFoo({ self, state }, ...fooArgs) {
      const ephemera = provideEphemera(self, state);
      doStuffWithEphemera(ephemera);
    },
    doBar({ self, state }, ...barArgs) {
      doStuffWithoutEphemera();
    },
  };
  const options = {};
  const makeVaultManager = defineDurableKind(handle, init, behavior, options);
  function createVaultManager(VMstuff) {
    dostuff();
    const vm = makeVaultManager(args);
    return vm;
  }
  return { createVaultManager };
}
In that example, the parent code (VaultDirector) must create a WeakMap and a provide pattern that is keyed by the VaultManager instance (which is a Representative of a durable object). It can create ephemera with access to anything passed into createVaultDirector, stuff you create within the VaultDirector, the VaultManager instance itself (either used to interrogate the VaultManager, or to key some other Store or WeakMap), and contents of the VaultManager's state.
Then, inside every method that wants to use this ephemeral data, it must call provideEphemera(self, state) to get it. The first time this is called within version-1 of the vat, provideEphemera() will take the createEphemera() branch. Every subsequent time within version-1, this will fetch the stored copy from the WeakMap. (If the VaultManager durable object is GC-released within version-1, the WeakMap entry will go away, taking ephemera with it, but that doesn't happen just because any particular Representative for that underlying durable object is GC'ed, so the ephemera continues to consume RAM, making this only suitable for low-cardinality Kinds).
Then, after upgrade, a new WeakMap is created, initially empty. The first time someone talks to one of the old (durable, vref still exists) VaultManagers, a new Representative is created. Following that, the first time someone calls doFoo(), the version-2 createEphemera will get called again, and create the ephemera for this VaultManager that will last for the rest of version-2 (or until the durable object is GCed, as before).
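To make the version-1 / version-2 lifecycle concrete, here is a small plain-JavaScript simulation of the open-coded pattern. It is only a sketch under loud assumptions: startVersion, durableStates, and getRepresentative are hypothetical stand-ins for a vat version, durable storage, and Representative creation, and an ordinary WeakMap stands in for the virtual-object-aware one.

```javascript
// Durable state survives across simulated versions; the WeakMap does not.
const durableStates = new Map(); // hypothetical stand-in for durable storage
let createCount = 0;

const startVersion = versionLabel => {
  const ephemeraWM = new WeakMap(); // fresh and empty in every version
  const createEphemera = state => {
    createCount += 1;
    return { label: `${versionLabel}:${state.name}` };
  };
  const provideEphemera = (self, state) => {
    if (!ephemeraWM.has(self)) {
      ephemeraWM.set(self, createEphemera(state));
    }
    return ephemeraWM.get(self);
  };
  // One stand-in Representative per durable id, valid for this version only.
  const reps = new Map();
  const getRepresentative = id => {
    if (!reps.has(id)) {
      const state = durableStates.get(id);
      const self = {
        doFoo: () => provideEphemera(self, state).label,
        doBar: () => state.name, // never touches ephemera
      };
      reps.set(id, self);
    }
    return reps.get(id);
  };
  return { getRepresentative };
};

durableStates.set('vm1', { name: 'manager-1' });

const v1 = startVersion('v1');
const vm1 = v1.getRepresentative('vm1');
vm1.doBar(); // no ephemera created yet
const first = vm1.doFoo(); // takes the createEphemera branch
const second = vm1.doFoo(); // fetches the stored copy

const v2 = startVersion('v2'); // simulated upgrade
const afterUpgrade = v2.getRepresentative('vm1').doFoo();
console.log(first, second, afterUpgrade, createCount);
```

The simulation creates ephemera once per version per instance, and only when doFoo() is first called, which is the behavior described above.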
We think this pattern would work, and could be implemented without VOM changes. But it's not particularly ergonomic. The biggest pain points are:
ephemera-desiring user method must include the ephemera = provideEphemera(self, state) boilerplate
Keeping in mind the requirement that "Kinds which don't want ephemera should not pay for it" (and high-cardinality Kinds in particular must not pay a RAM cost for it), here's a sketch of what VOM support could look like:
function createVaultDirector(VDstuff) {
  function createEphemera(state) {
    // now use VDstuff and VaultManager's state to create
    // ephemeral stuff for each vaultManager
    // this ephemera can be anything, it doesn't even have to be an
    // object, and will not be hardened
    const ephemera = anything;
    return ephemera;
  }
  const init = (args) => initialState;
  const behavior = {
    foo({ self, state, ephemera }, ...fooArgs) {
    },
    bar({ self, state }, ...barArgs) {
    },
  };
  const options = { createEphemera };
  const makeVaultManager = defineDurableKind(handle, init, behavior, options);
  function createVaultManager(VMstuff) {
    dostuff();
    const vm = makeVaultManager(args);
    return vm;
  }
  return { createVaultManager };
}
The new options.createEphemera triggers the VOM into creating one internal WeakMap per Kind, keyed by the Representative (just like the open-coded form above, using the same VirtualObjectAwareWeakMap that userspace gets, which means it's really keyed by the durable object's vref). The VOM does the provide pattern, using the user-supplied createEphemera() function. Each time a Representative is created, we fetch (and/or create) that object's ephemera, and then add it to the context argument that gets bound into the methods.
By passing it in context to each method, it is available directly (without additional provideEphemera() boilerplate) to all code that wants it. Methods that do not need it (bar()) just don't destructure it in their { self, state } argument code.
ephemera can be anything the user code wants, even a non-object, and it is not frozen or hardened. So user code could create a mutable record and use the properties as volatile state.
In version-1, the user's createEphemera() function will be called during makeVaultManager() (after init() and before finish(), if any). In version-2, it will be called the first time the durable object is deserialized into a Representative (slightly earlier than in the open-coded example, where it doesn't get called until foo() needed it, and bar() did not).
Because createEphemera() is called before the Representative can be created, we cannot provide it with the VaultManager instance (the vm argument in the open-coded example). Imagine if createEphemera() called vm.foo(): what ephemera would that method get? This might be an imposition on Kind authors: any distinction between the ephemera provided to one instance versus another must come from differences in their respective state contents.
This is probably the biggest limitation of this approach, but it is the price paid for removing the boilerplate and receiving { ephemera } through the context argument.
Within the VOM, the context record ({ self, state, ephemera } or { facets, state, ephemera }) must be carefully constructed: ephemera is not hardened, but the context record is frozen, and all three properties are non-writable and non-configurable. The context object's lifetime is linked to the cohort of Representatives (to prevent a GC sensor). We don't need to establish such a link with ephemera because ephemera is already kept alive by the durable object's vref in the internal WeakMap, so ephemera cannot go away while the durable object exists. Userspace can sense when the first Representative (within any given vat version) is created: just wait for createEphemera to be called. But it cannot sense if/when a second Representative is created: createEphemera is never called again (within that vat version), which means this does not provide a GC sensor.
(It would be slightly easier/safer to build context if we could harden ephemera. Userspace would need to use a Map or Set instead of simple mutable records or arrays, but they could still hold Promises and objects/functions that close over other mutability).
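For illustration only, here is a toy model of what the options.createEphemera support could look like from the inside. defineToyKind is a hypothetical name, not the real defineDurableKind; it uses a plain WeakMap and Object.freeze in place of the VOM's weak collection and harden, and it ignores durability and facets entirely.

```javascript
// Toy stand-in for the proposed VOM support; not the real implementation.
const defineToyKind = (init, behavior, options = {}) => {
  const { createEphemera } = options;
  const ephemeraWM = new WeakMap(); // one internal map per Kind

  return (...args) => {
    const state = init(...args);
    const self = {};
    const contextProps = { self, state };
    if (createEphemera) {
      // Called with state only: user code cannot reach self at this point.
      ephemeraWM.set(self, createEphemera(state));
      contextProps.ephemera = ephemeraWM.get(self);
    }
    // The context record is frozen; ephemera itself stays mutable.
    const context = Object.freeze(contextProps);
    for (const [name, method] of Object.entries(behavior)) {
      self[name] = (...methodArgs) => method(context, ...methodArgs);
    }
    return Object.freeze(self);
  };
};

const makeCounter = defineToyKind(
  name => ({ name }),
  {
    hit({ state, ephemera }) {
      ephemera.hits += 1; // volatile state kept in a mutable record
      return `${state.name}:${ephemera.hits}`;
    },
    name({ state }) {
      return state.name; // does not destructure ephemera at all
    },
  },
  { createEphemera: () => ({ hits: 0 }) },
);

const counter = makeCounter('c1');
const r1 = counter.hit();
const r2 = counter.hit();
console.log(r1, r2, counter.name());
```

A Kind defined without createEphemera skips the WeakMap entry entirely in this sketch, which mirrors the "Kinds which don't want ephemera should not pay for it" requirement.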
I'll describe the second proposal we came up with in a separate comment.
Our second proposal is a bit more radical. We realized that we're currently providing three-ish tools, with various values of two orthogonal properties:
| | low-cardinality | high-cardinality |
|---|---|---|
| non-durable | plain objects-as-closures | makeKind (virtual) |
| durable | | makeDurableKind |
The fourth corner wants a tool for data that is durable but of low-cardinality (so we can afford to spend RAM on each instance). A lot of the singleton Kinds we're building for contract upgrade (ZCF, the contract instance) fall into this category, but some of the friction is because our only durable tool is made for high-cardinality data.
We sketched out a fourth tool, with a strawman name of defineExpensiveDurableKind (or maybe defineUpgradableKind), that would be used like this:
function makeBehavior(state) { // called once per version, during first unserialize
  // could mutate 'state' here
  let ephemera;
  return {
    doFoo({ self }, ...fooArgs) {
      // can read/write ephemera. 'self' has an identity.
    },
    doBar({ self }, ...barArgs) {
    },
    doBarMulti({ facets }) {
    },
  };
}
const maker = defineExpensiveDurableKind(handle, init, makeBehavior, options);
This approach would call makeBehavior at the same times as createEphemera was called in the previous example: during init() in version-1, and during first deserialization in version-2. However the Representatives would be pinned in RAM, never to be released until upgrade caused the vat version to stop. This prevents userspace from sensing GC by counting calls to makeBehavior().
By calling a user-provided function once per instance, we could create ephemera as closed-over variables, available to all behavior functions, instead of passing it through context. state is also closed-over, but is implemented as the same "hardened record of getters/setters for known state properties" as before. init and options.finish behave as before.
Note that self must still be passed through context, because self is not yet defined within makeBehavior (the return value of makeBehavior is not self, instead it is a record of context-taking functions that must be copied/bound into a newly-synthesized object that curries context appropriately).
Also note that makeBehavior does not get access to the args which init() receives, because makeBehavior is called in version-2 (not just version-1), by which time those args are long gone.
The makeBehavior function has the opportunity to mutate state as the object is first created and/or unserialized. This might not be a good idea, but on the other hand it might be a great place for schema upgrade to happen.
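A toy model of the makeBehavior shape may help show how ephemera ends up as a closed-over variable while self still arrives through context. defineToyExpensiveKind is a hypothetical stand-in for the strawman defineExpensiveDurableKind; it skips durability, facets, and options, and "pins" instances by simply holding them in a Set.

```javascript
// Toy stand-in for the strawman tool; not a real API.
const defineToyExpensiveKind = (init, makeBehavior) => {
  const pinned = new Set(); // instances are deliberately never released
  return (...args) => {
    const state = init(...args);
    // Called once per instance; the maker args are NOT passed along.
    const behavior = makeBehavior(state);
    const self = {};
    const context = Object.freeze({ self });
    // Copy the context-taking functions into a new object that curries context.
    for (const [name, method] of Object.entries(behavior)) {
      self[name] = (...methodArgs) => method(context, ...methodArgs);
    }
    pinned.add(self);
    return Object.freeze(self);
  };
};

let makeBehaviorCalls = 0;
const makeBehavior = state => {
  makeBehaviorCalls += 1;
  let ephemera = 0; // closed over and shared by the methods below
  return {
    bump() {
      ephemera += 1;
      return `${state.name}:${ephemera}`;
    },
    getSelf({ self }) {
      return self; // self is only reachable through context
    },
  };
};

const makeThing = defineToyExpensiveKind(name => ({ name }), makeBehavior);
const thing = makeThing('t1');
const b1 = thing.bump();
const b2 = thing.bump();
console.log(b1, b2, makeBehaviorCalls);
```

In this sketch state is a plain object closed over by the methods; in the proposal it would remain the record of getters/setters described above.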
Argh, nope, both options.createEphemera and defineExpensiveDurableKind's makeBehavior run into a problem: we disable metering during deserialization, and both would run user code within that time, allowing userspace to cheat on metering.
We disable it because deserialization might encounter vrefs which refer to virtual/durable objects, whose Representatives may or may not already be in memory (they are tracked with a WeakRef). If we don't currently have a Representative, we must build one, which costs more meter usage than if we skipped it. That would give a GC sensor to anyone watching the meter. To avoid that, we sandwich the marshaller.deserialize() call in a disableMetering block. But that means everything during deserialization is unmetered, including the call to user-provided createEphemera().
The number of times createEphemera() is called is not a GC sensor, but the flip side is that createEphemera() could do something really expensive, and it wouldn't be captured by the meter. And all userspace activity is supposed to be captured by the meter.
The open-coded approach doesn't suffer from this because provideEphemera is called after deserialization is finished, so it's all in userspace. But that boilerplate is pretty annoying.
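The difference between the two call sites can be shown with a toy meter. Everything here is a hypothetical stand-in (meter, charge, withMeteringDisabled, expensiveCreateEphemera); the real metering mechanism is not modeled, only the fact that work done inside the disabled window goes uncounted.

```javascript
// Toy meter: counts units of "work" unless it is disabled.
const meter = { used: 0, enabled: true };
const charge = units => {
  if (meter.enabled) meter.used += units;
};
const withMeteringDisabled = thunk => {
  const was = meter.enabled;
  meter.enabled = false;
  try {
    return thunk();
  } finally {
    meter.enabled = was;
  }
};

// User-supplied function that does something expensive.
const expensiveCreateEphemera = () => {
  charge(1000);
  return {};
};

// Variant 1: user code runs inside the unmetered deserialization window.
withMeteringDisabled(() => expensiveCreateEphemera());
const usedDuringDeserialize = meter.used;

// Variant 2: the open-coded provide runs later, in metered userspace.
expensiveCreateEphemera();
const usedInUserspace = meter.used - usedDuringDeserialize;

console.log(usedDuringDeserialize, usedInUserspace);
```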
Why not make context.ephemera a getter? That would allow us to:
Defer call to initEphemera/createEphemera until the first time a facet is used in a given version (or until the ephemera is used by a method of a facet?)
initEphemera/createEphemera, it can pass in the facets (and state), but any re-entrant call into get ephemera would throw.
initEphemera/createEphemera during metered time, just before (or while) delivering a message to the object
This getter would roughly be equivalent to a provideEphemera implemented in userland. I do think that we should find a way to make it easy for userland to implement provideEphemera to get some experience before moving it into the VOM. One option would be to allow the state object to be used as a userland WeakMap key.
Regarding the makeBehavior idea, I am feeling somewhat uncomfortable with it, probably because it reverts back to a closure over state model, and enshrines using the state object outside of calls to a behavior method.
Oh, that's clever.. yeah I think making it a getter would address the problems I raised.
Defer call to initEphemera/createEphemera until the first time a facet is used in a given version (or until the ephemera is used by a method of a facet?)
Yeah, if the method is defined as foo: ({ state, ephemera }) => stuff then it'll get created as soon as that method is called, but if the method does foo: context => stuff then it waits until stuff does context.ephemera or { ephemera } = context which might only happen inside a conditional. But all of that is a deterministic function of userspace behavior, and all of it happens after any Representative-creating deserialization takes place, so it'll be metered along with the rest of userspace.
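Here is a rough sketch of the getter idea, including the re-entrancy guard mentioned earlier in the thread. makeContext is a hypothetical helper, not VOM code; it uses Object.defineProperty and Object.freeze, and only passes state to createEphemera.

```javascript
const makeContext = (state, createEphemera) => {
  let ephemera;
  let status = 'absent'; // 'absent' | 'creating' | 'ready'
  const context = { state };
  Object.defineProperty(context, 'ephemera', {
    enumerable: true,
    get() {
      if (status === 'creating') {
        throw new Error('re-entrant ephemera access');
      }
      if (status === 'absent') {
        status = 'creating';
        try {
          ephemera = createEphemera(state);
          status = 'ready';
        } catch (err) {
          status = 'absent'; // allow a later retry
          throw err;
        }
      }
      return ephemera;
    },
  });
  return Object.freeze(context);
};

let created = 0;
const ctx = makeContext({ name: 'a' }, state => {
  created += 1;
  return { tag: `eph-${state.name}` };
});

// Destructuring in the parameter list fires the getter on entry.
const eager = ({ ephemera }) => ephemera.tag;
// Taking the whole context defers creation until the property is read.
const lazy = (context, wanted) => (wanted ? context.ephemera.tag : 'skipped');

const skipped = lazy(ctx, false);
const createdAfterSkip = created;
const tag = eager(ctx);

// A createEphemera that reads context.ephemera again hits the guard.
const reentrantCtx = makeContext({}, () => reentrantCtx.ephemera);
```

Because the getter only runs when user code touches the property, creation happens in ordinary userspace execution rather than during deserialization.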
I do think that we should find a way to make it easy for userland to implement provideEphemera to get some experience before moving it into the VOM. One option would be to allow the state object to be used as a userland WeakMap key.
Hm, state can already be used as a WeakMap key (it's a regular Object, a bag of getters/setters, without value properties, without vref identity, and with some tricks to make sure it has the same lifetime as the facets it supports), so userspace could write the open-coded provideEphemera approach today. Are you thinking of something in between "just write it yourself" and options.createEphemera? Maybe options.provideEphemera = state => doStuffAroundYourOwnWeakmap? (or maybe we pass the whole context object in). I'm not sure I see how that's more educational or flexible than having the VOM manage the WeakMap.
Regarding the makeBehavior idea, I am feeling somewhat uncomfortable with it, probably because it reverts back to a closure over state model, and enshrines using the state object outside of calls to a behavior method.
Yeah, needing the state or context to index the WeakMap necessarily means they appear outside a behavior method.
And the "closure over state" model is exactly what it's trying to salvage, for the use case where we're closing over shared (ephemeral) state, and only use state.propname for per-instance (durable) state. All of this is a struggle to retain as much of the "elegant" objects-as-closures model in the face of requirements for high-cardinality and/or durability/upgradability.
The defineExpensiveDurableKind approach would allow a singleton use case that needs a durable identity, but whose durable state is pretty immutable (and easier to manage with baggage and/or reconstructed entirely at upgrade time), to close over everything, and not even have a state. I still don't know if it's a good idea, but it might be the closest we could get to the original objects-as-closures while still enabling upgrade (and preventing the GC sensor).
Yeah, if the method is defined as foo: ({ state, ephemera }) => stuff then it'll get created as soon as that method is called, but if the method does foo: context => stuff then it waits until stuff does context.ephemera or { ephemera } = context which might only happen inside a conditional.
Yes, and there is also the option to explicitly create the ephemera before calling the behavior method the first time (not during the getter), and prevent calling any facet methods during the ephemera creation. This would be a more heavy-handed "guard" to prevent potential footguns (conditional ephemera init), but might prevent some legitimate use cases (internal behavior facets used for ephemera init).
Hm, state can already be used as a WeakMap key
Oh I thought we explicitly disallowed that.
I'm not sure I see how that's more educational or flexible than having the VOM manage the WeakMap.
It's not, I just wanted to get ephemera experience before moving it into the platform, to see if that approach actually solves problems. Basically have userland implement const provideEphemera = (state, facets) => {} using the state object as the WM key, and optionally facets for its logic if necessary.
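A minimal userland sketch of that provideEphemera, assuming (as discussed above) that the state object is usable as a WeakMap key. The ephemera contents and the sample state and facets objects are made up for illustration.

```javascript
const ephemeraByState = new WeakMap();

// Keyed on the identity of the state object; facets are optional input
// for whatever logic builds the ephemera.
const provideEphemera = (state, facets) => {
  if (!ephemeraByState.has(state)) {
    ephemeraByState.set(state, {
      facetNames: Object.keys(facets),
      calls: 0,
    });
  }
  return ephemeraByState.get(state);
};

// Hypothetical stand-ins for what a behavior method would receive.
const state = { x: 1 };
const facets = { helper: {}, public: {} };

const doFoo = () => {
  const ephemera = provideEphemera(state, facets);
  ephemera.calls += 1;
  return ephemera.calls;
};

const firstCall = doFoo();
const secondCall = doFoo();
console.log(firstCall, secondCall);
```

Only the identity of state is used here; none of its properties are read outside a behavior method.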
Yeah, needing the state or context to index the WeakMap necessarily means they appear outside a behavior method.
I still very much would like to "disable" the state object props outside of behavior method invocations to prevent this pattern. At least in the WM case, using the identity would be allowed.
What is the Problem Being Solved?
Changing requirements should be efficient
When developing contracts, requirements may evolve and it helps developers to minimize the effort necessary to implement changes. One such requirement is the durability of data. It can be:
In the Virtual Object Manager (VOM) now transitioning a value from V→D or D→V is pretty declarative. Change a function name or add an option flag. But changing →H or H→ is laborious. See https://github.com/Agoric/agoric-sdk/pull/5736/ commits example.
The resulting code should also be clean.
The movements above resulted in the simple https://github.com/Agoric/agoric-sdk/blob/39fe09285a230a33966b23f3ab3a1f61127f0a64/packages/run-protocol/src/vaultFactory/vaultManager.js#L529
turning into https://github.com/Agoric/agoric-sdk/blob/e4c837a47ab31c31628ff8a1a736aa3a139f0044/packages/run-protocol/src/vaultFactory/vaultManager.js#L557-L561
Representatives cannot reveal GC
https://github.com/Agoric/agoric-sdk/pull/5758 tried to solve the above problems by letting some heap data onto the state object. The problem with this is that userspace could tell when GC happens because one Representative will have those properties, while later Representatives will not.
Description of the Design
A design that seems to satisfy the above requirements is to continue to use the WeakMap pattern as in https://github.com/Agoric/agoric-sdk/blob/39fe09285a230a33966b23f3ab3a1f61127f0a64/packages/run-protocol/src/vaultFactory/vaultManager.js#L140-L147
But instead of making the module responsible for the map and each method responsible for pulling the ephemera object out of it, have the VOM provide ephemera in the context next to state. So the ugly refactor above could be back to one line.
TBD how the ephemera object gets initialized. Some requirements:
finished, like other context consumers, needs ephemera
.state so can't be before initState (example wrapping a durable store in ephemeral behavior)
initEphemera(state, ...args)
Security Considerations
Test Plan