Agoric / agoric-sdk

monorepo for the Agoric Javascript smart contract platform

per-crank execution fees, meters/keepers #3103

Open warner opened 3 years ago

warner commented 3 years ago

What is the Problem Being Solved?

We expect to charge a fee for execution time (independently of charges for priority of execution). This provides economic backpressure on platform usage, and incentivizes more efficient code.

The basic notion is that each vat is associated with some source of execution credits ("ticks" or "computrons"), the execution of cranks deducts from this source, and platform-level RUN currency is used to replenish the source. The RUN spent to buy computrons eventually goes into a platform stability pool, which is distributed in some economically-interesting way.

Description of the Design

@dtribble and I spent the afternoon brainstorming on this. We've iterated on the topic in the past (#23, for starters, although it talks more about escalators/scheduling than how to charge something once a crank has been selected), so this is one step closer to a coherent design.

The existing codebase gives us:

We expect (#2319) to change our block-scheduling algorithm to accumulate the number of computrons used by each crank, and stop executing cranks once the total has reached some threshold. We hope to pick a threshold that gives us a comfortable amount of runtime (good utilization of the available time, low-to-moderate risk of exceeding the block time). The threshold must be part of consensus (so the set of cranks executed is part of consensus), but could change over time if we find a way to steer it correctly.
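To make the block-scheduling idea concrete, here is a minimal sketch of such a loop; the names and the threshold value are placeholders, not the actual #2319 design:

// Hypothetical sketch: run cranks until a consensus computron threshold is reached.
// `runQueue` and `executeCrank` stand in for the real kernel machinery.
const COMPUTRON_THRESHOLD = 8_000_000n; // consensus parameter; placeholder value

async function runBlock(runQueue, executeCrank) {
  let used = 0n;
  while (used < COMPUTRON_THRESHOLD && runQueue.length > 0) {
    const crank = runQueue.shift();
    const { computrons } = await executeCrank(crank); // metered by the JS engine
    used += computrons;
  }
  return used; // deterministic, so the set of executed cranks is part of consensus
}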

Given that, we start from the lowest levels:

Meters, metercaps

We'll introduce a kernel table that maps meterID to a value (in computrons). The meterID is a kref, and clists will be augmented to translate meters in the same way they currently translate objects, promises, and devices. The big difference is that all meters are owned by the kernel, so all vats are importing meters, never exporting them. Within a vat, the meterID turns into a Presence-like Meter object, which has no methods or state, just identity.

Some sort of special device will have the ability to manipulate the balance of a meter, given its meterID. This will also enable the creation of a new meterID, or (eventually) the merging/deleting of meters.
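As a rough illustration of the shape of that table (hypothetical code, not the kernel's actual DB layout), the kernel might keep something like:

// Hypothetical sketch of the kernel's meter table; values are in computrons.
// In the real kernel this state would live in the kernel DB, keyed by meterID.
function makeMeterTable() {
  const meters = new Map(); // meterID -> { remaining: bigint }
  let nextID = 1n;
  return {
    create(initialComputrons = 0n) {
      const meterID = `km${nextID}`; // meterIDs are krefs owned by the kernel
      nextID += 1n;
      meters.set(meterID, { remaining: initialComputrons });
      return meterID;
    },
    // invoked only via the metering device (or its manager vat)
    addRemaining(meterID, delta) {
      meters.get(meterID).remaining += delta;
    },
    getRemaining(meterID) {
      return meters.get(meterID).remaining;
    },
  };
}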

We should consider how meters are destroyed. They'll be reference-counted, with references coming from vats that are running on the meter, as well as vats with Meter objects in their c-lists. Once all of these references go away, we should probably conserve the value it held, so maybe each meter should have a parent, and if/when the meter is deleted, the value is reabsorbed by the parent meter. Or maybe it just gets merged into a common stability pool.

Meters + Keepers, decrementing

Initially, we can start with the kernel associating each dynamic vat with a single Meter (stored in a DB key). Each time a crank finishes, the kernel examines the reported computron consumption and decrements this Meter by the amount used. If the result goes below zero, the vat is killed.

Later, each dynamic vat will have an ordered list of (Meter, Keeper) pairs. Each Keeper is just an object kref. After the crank, the kernel deducts the consumed computron count from the first meter. If that underflows, the remainder is deducted from the next, etc. If the last meter is exhausted, the vat is killed. For each meter that gets exhausted, the kernel calls the associated Keeper, giving it a chance to replenish the meter if it wishes.

Each time the kernel decrements a meter, it will increment a "total execution" counter by the same amount. This counter will be made available through the metering device, along with a means to clear it. The goal is to conserve computron credits: they're created when a Meter amount is manipulated by the device, transferred to the kernel when a crank is executed, and then returned to the device when it reads and clears the counter.
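Putting the last two paragraphs together, the post-crank bookkeeping might look roughly like this (hypothetical names; notifyKeeper stands in for however the kernel would queue a message to the Keeper's kref):

// Hypothetical sketch: after a crank, deduct `used` computrons from the vat's
// ordered (meterID, keeper) pairs; terminate the vat only if every meter runs dry.
function deductCrankUsage(vat, used, state, notifyKeeper, terminateVat) {
  state.totalExecution += used; // conserved: read and cleared later via the device
  let remainder = used;
  for (const { meterID, keeper } of vat.meterStack) {
    const entry = state.meters.get(meterID);
    const deducted = entry.remaining < remainder ? entry.remaining : remainder;
    entry.remaining -= deducted;
    remainder -= deducted;
    if (entry.remaining === 0n) {
      notifyKeeper(keeper, meterID); // exhausted: give its Keeper a chance to refill
    }
    if (remainder === 0n) {
      return; // the crank is fully paid for
    }
  }
  terminateVat(vat); // even the last meter underflowed
}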

Open questions:

Vat Creation

The vatAdmin vat's createVat() API will be augmented to accept meters/keepers as options. Both are stored in the kernel's per-vat tables.
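As an illustration of how that might look to the caller (the option name and shape here are hypothetical, not a settled API):

// Hypothetical usage sketch: pass an ordered meter/keeper list when creating a vat.
// vatAdminService, hotMeter, etc. are obtained elsewhere; E comes from '@endo/far'.
const { root, adminNode } = await E(vatAdminService).createVat(contractBundle, {
  meters: [
    { meter: hotMeter, keeper: hotKeeper },       // drawn down first
    { meter: backupMeter, keeper: backupKeeper }, // consulted if the first underflows
  ],
});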

Meter Manager Vat

We'll associate the meter-manipulating device with a new vat, similar to the (timer device, timer vat) pair, or the vatAdmin pair. The manager vat can provide a clean ocap API for doing things with meters (splits, balance queries, merges).

The meter-manager vat is also responsible for the conversion of RUN tokens to meter units. This is a bit beyond SwingSet's reach, so we need to design this feature to be optional. The cosmic-swingset host application will configure the RUN/computron relationship in its bootstrap process.

To support this conversion, the manager vat should provide a refill facet for each meter, to which a holder can send a RUN Payment to replenish the meter. The vat will deposit the RUN tokens into a locally-held Purse, figure out how many computron credits they're worth, then increment the meter's value by that amount.
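A sketch of what such a refill facet might look like inside the manager vat (hypothetical names: `meteringDevice.addRemaining` stands in for however the device actually exposes increments, and `computronsPerRun` comes from the host's configuration):

import { E, Far } from '@endo/far';

// Hypothetical sketch of a per-meter refill facet in the meter-manager vat.
function makeRefillFacet(meterID, feePurse, computronsPerRun, meteringDevice) {
  return Far('refill', {
    async refill(runPayment) {
      const deposited = await E(feePurse).deposit(runPayment); // RUN stays in this vat
      const computrons = deposited.value * computronsPerRun;   // host-configured rate
      meteringDevice.addRemaining(meterID, computrons);
      return computrons;
    },
  });
}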

Later (perhaps periodically), the manager vat will query and zero the kernel's total-execution counter. It will figure out how many RUN this computron count is worth, and transfer that amount of RUN into some sort of stability-fee Purse.
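And a matching sketch of the periodic sweep (again hypothetical; `readAndClearTotalExecution` is a made-up device call, and rounding is ignored):

import { E } from '@endo/far';
import { AmountMath } from '@agoric/ertp';

// Hypothetical sketch: convert accumulated computrons back into RUN and move that
// much from the fee purse into a stability-fee purse.
async function sweepExecutionFees(meteringDevice, feePurse, stabilityPurse, runBrand, computronsPerRun) {
  const computrons = meteringDevice.readAndClearTotalExecution();
  const runValue = computrons / computronsPerRun; // bigint division; rounding ignored
  const payment = await E(feePurse).withdraw(AmountMath.make(runBrand, runValue));
  await E(stabilityPurse).deposit(payment);
  return runValue;
}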

Open questions:

Keepers

The role of a Keeper is to get informed when a meter is drained, and then take corrective action. Meters are accessed synchronously, at a low level (by the kernel), so anything more sophisticated must live in a Keeper or in some vat's object that interacts with one. Keepers are like creditors: they provide funds to make sure an operation doesn't fail (the vat isn't terminated), but they'll have a policy of some sort about whether to refill a meter or let it remain drained (risking vat termination).

Eventually, Keepers might have more options, including suspending a crank (to be resumed later), or perhaps interacting with the scheduling of messages. If we were checking the meter before the crank is delivered, rather than afterwards, there would be a question of what to do if the meter was insufficient: a Keeper might be consulted at that point, and it could choose to refill the meter, drop the message, drop the entire Flow the message was on (if/when we implement Flows), push the message back onto the queue, maybe even suspend the vat until someone pays to thaw it out again.

Keepers and Purses

We're thinking that, for now, we implement Keepers as objects in the manager vat, and we give them a Purse to draw from. If/when their associated meter underflows, the Keeper withdraws enough RUN to fill it back up (if this takes place entirely inside the manager vat, maybe it can all happen in a single crank, just after the meter exhaustion and before the target vat receives any further messages). Whoever supplies this Purse still has access to it, so they aren't irrevocably committing their RUN for use as gas: they can withdraw the remainder at any time.
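A minimal sketch of such a purse-backed Keeper (hypothetical names; the refill policy here is deliberately simplistic):

import { E, Far } from '@endo/far';

// Hypothetical sketch: a Keeper in the manager vat that refills its meter from a
// caller-supplied purse. The supplier keeps the purse, so unspent RUN stays theirs.
function makePurseBackedKeeper(meterID, purse, refillAmount, toComputrons, feePurse, meteringDevice) {
  return Far('keeper', {
    async meterExhausted() {
      // Invoked when the associated meter underflows: pull RUN from the supplied
      // purse, bank it in the manager's fee purse, and credit the meter.
      const payment = await E(purse).withdraw(refillAmount);
      const deposited = await E(feePurse).deposit(payment);
      meteringDevice.addRemaining(meterID, toComputrons(deposited.value));
    },
  });
}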

Perhaps we give the meter-manager vat a widely-held API object that can accept a Purse and create a Keeper around it, with some parameters to control how much it refills the meter. We might have it maintain both a "hot meter" and a "backup meter", filling both from the same Purse but with different refill or notification policies.

Initial Computrons

When we get to a proper scheduler, each message on the escalators (or maybe each escalator itself) will be associated with a Meter. When the message is delivered, we should transfer a fixed amount of units from the message/escalator's "scheduling meter" to the vat's "execution meter". This amount should be sufficient to let the vat examine the message and make a decision about whether to proceed or not, ideally after somehow switching to a different meter.

The goal here is to prevent a resource-exhaustion attack in which the attacker just sends a lot of useless messages to the victim vat. If the transferred units are enough to let the defender recognize the uselessness of the message and stop processing, then the attacker loses tokens overall, but the defender does not (in fact, they may come out ahead).
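A sketch of that transfer at delivery time (hypothetical; the grant size would be a consensus parameter):

// Hypothetical sketch: on delivery, move a fixed "enough to decide" grant from the
// message's scheduling meter to the target vat's execution meter.
const INITIAL_GRANT = 50_000n; // computrons; placeholder value

function grantInitialComputrons(schedulingMeter, executionMeter) {
  const grant = schedulingMeter.remaining < INITIAL_GRANT
    ? schedulingMeter.remaining
    : INITIAL_GRANT;
  schedulingMeter.remaining -= grant;
  executionMeter.remaining += grant;
  return grant; // the defender can examine the message and bail out within this budget
}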

Vat code may be able to reason about incoming message sends, but cranks are also triggered by incoming promise resolutions (dispatch.notify). This may be difficult for programmers to visualize (do they include decision-making code just after an await too?). And in general, our nascent theories about escalator prioritization of messages are even less developed for promise resolutions.

Switching Meters

To support that "attacker pays" defense, we would like a way to switch meters mid-crank, but we don't have a good theory on it yet. Maybe each message could come with a meter to be pushed (for one crank only) onto the front of the meter stack, and we make an API in which a vat can send a message to itself with this extra meter attached.

Open questions:

Zoe API

We might augment the Zoe "instantiate a contract" API to accept a Purse of RUN along with the other arguments. Zoe could then set up a Meter and Keeper, give the Purse to the Keeper, and call the vatAdmin createVat with the meter/keeper pair.

Since the goal is for the contract instantiator to pay the fees, but to earn enough from their own customers to cover them, one idea is to make the Keeper a party to the contract (give it a Seat), which allows it to request a fee payout each time it needs to refill a meter.

katelynsills commented 3 years ago

We might augment the Zoe "instantiate a contract" API to accept a Purse of RUN along with the other arguments. Zoe could then set up a Meter and Keeper, give the Purse to the Keeper, and call the vatAdmin createVat with the meter/keeper pair.

I think this should be a RUN payment. Our APIs should never pass purses around.

dtribble commented 3 years ago

I think this should be a RUN payment. Our APIs should never pass purses around.

That would be preferable, but

  1. we want the authority to draw on a pool of resources shared among multiple contracts
  2. we want an easy systemic way to keep the contract operating account "topped-up"

Both of those seem straightforward with shared purses.

warner commented 3 years ago

Also, we must decide how/if refunds can happen. I think we decided that, at least initially, feeding a meter is a one-way street. But that doesn't mean feeding a keeper must also be like that. If the vat you're supporting is terminated, we don't want the funds to be entirely lost. Although I suppose we could make the keeper somewhat more sophisticated and give it a refund() -> Payment method.

katelynsills commented 3 years ago
  1. we want the authority to draw on a pool of resources shared among multiple contracts
  2. we want an easy systemic way to keep the contract operating account "topped-up"

Zoe already has a model of accepting payments and escrowing the assets. That easily satisfies 1, and 2 is easily satisfied by sending another payment whenever it is needed. I think there's no need to reinvent a new model for a use case that is already covered.

Although I suppose we could make the keeper somewhat more sophisticated and give it a refund() -> Payment method.

Zoe conveniently has a refund model too :)

dckc commented 3 years ago

...

Meters, metercaps

We'll introduce a kernel table that maps meterID to a value (in computrons). The meterID is a kref, and clists will be augmented ...

"augmented" presumes the reader knows the status quo of clist design. I'm a little fuzzy on that. I suppose it's documented in https://github.com/Agoric/agoric-sdk/tree/master/packages/SwingSet/docs , but I'm not sure where to start. Does any of the files in that directory serve as a starting point? Are docs on clists hopelessly out of date? (#2452)

dckc commented 3 years ago

This design seems to provide for fees for executing installed contracts. It doesn't seem to address clients of these contracts; for example, users making swaps. Is that on purpose?

p.s. @dtribble confirmed that yes, this is only one part and another part is still in progress.

warner commented 3 years ago

@michaelfig and @mhofman had a neat idea to express the "switch to a different meter" operation, given a limitation of one meter per crank. We give vats some primitive that returns a Promise which will only be resolved by a new crank, and we arrange for that crank to use the new meter instead of the original one. The simplest form would look like:

async getRequest(request) {
  // now on sender's just-enough-to-decide meter
  const { who, nice } = examineRequest(request);
  if (!nice) {
    return; // don't waste our time
  }
  const newMeter = customerMeters.get(who);
  await chargeTo(newMeter);
  doWork(); // now on per-customer meter
}

where chargeTo(newMeter) is the swingset-provided primitive. It would create a new vpid (promise vref), tell the kernel about it (getting it into the kernel promise table, with the vat as the decider), subscribe to it (which is weird: if you're deciding a promise you don't usually subscribe to it, but the kernel should accept it because deciders can shift around anyway), do a syscall.resolve with some magic extra argument that includes the meter to use, and only then create and provide an actual Promise object to userspace.
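To make that sequence concrete, here is a rough sketch of what a liveslots-provided chargeTo could do; the extra meter argument to syscall.resolve and the helper names are all hypothetical, and none of this exists today:

// Hypothetical sketch of chargeTo(newMeter) inside liveslots. allocatePromiseVref,
// meterSlotFor, capdataUndefined, and registerPromise are made-up helpers, and the
// second argument to syscall.resolve does not exist in the real syscall API.
function chargeTo(newMeter) {
  const vpid = allocatePromiseVref();      // new promise vref; this vat is the decider
  syscall.subscribe(vpid);                 // unusual: subscribing to our own promise
  // resolve it immediately, asking the kernel to run the resulting dispatch.notify
  // crank against newMeter instead of the vat's current meter:
  syscall.resolve([[vpid, false, capdataUndefined]], { meter: meterSlotFor(newMeter) });
  // only now hand userspace a real Promise; it fires in the next, re-metered crank
  return registerPromise(vpid);
}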

That sequence is fussy enough that we might consider adding a new syscall just to establish a resolved promise with a different meter, all in one single event. Or we create a short-lived object, send a message to it (to ourselves) with a meter argument, and wait for the kernel to loop it back to us.

We discussed other ways to express the primitive. We could put a method on the object that represents the meter (so await newMeter.runOn()), or integrate it with E somehow (await E.chargeTo(newMeter)). We have the resolution slot to work with too: E.chargeTo(newMeter).then(xyz => doSomething()) and what should xyz be?

We've talked in the past about how whatever meter is active when a message is sent should be used by the recipient of that message. In this initial approach (as designed above), each vat is associated with a meter, not each message. It would be lovely if we could say:

But.. we have no way to tell when the .then is called (unless we perform even deeper surgery on HandledPromise). I think the best we can currently do is to sample the meter at the time the (remote) Promise is created, which is either a turn after liveslots creates one to represent promise IDs within inbound arguments, or a turn after E() creates the result Promise for some outbound message send (during the handler invocation). I'm not sure if that's sufficient.

michaelfig commented 3 years ago

We have the resolution slot to work with too: E.chargeTo(newMeter).then(xyz => doSomething()) and what should xyz be?

The suggestion I was making (I think @mhofman was suggesting similarly) is that a method like E.when() would return something like a ChargablePromise (a platform promise that also has a chargeTo method, to indicate the meter should be switched after the promise resolves and before calling its callbacks):

const value = await E.when(myPromise).chargeTo(newMeter);
// value is the resolution of myPromise
doSomethingWith(value);

E.chargeTo(m) is then just a shorthand for E.when(undefined).chargeTo(m).

The .then usages are like:

// Fire off some promises under separate meters.
E.chargeTo(meter1).then(_ => doSomethingUnderMeter1With(lexicalVariable));
E.when(myPromise).chargeTo(meter2).then(res => doSomethingUnderMeter2With(res));
// Continue synchronously under the original meter.
...

But.. we have no way to tell when the .then is called (unless we perform even deeper surgery on HandledPromise)

As part of the eventual send proposal, we will definitely need the ability to track calls to .then. That's been scheduled for some time, and a partial shim of it may be both necessary for this particular application and useful for the proposal.

erights commented 3 years ago

As part of the eventual send proposal, we will definitely need the ability to track calls to .then. That's been scheduled for some time, and a partial shim of it may be both necessary for this particular application and useful for the proposal.

I don't remember that. What were we thinking of proposing wrt .then?

michaelfig commented 3 years ago

I don't remember that. What were we thinking of proposing wrt .then?

IIRC, we needed delegated promises to be aware when they were subscribed to (maybe not precisely which .then there was).

erights commented 3 years ago

Is this the same issue as why we can't get the ordering correct without platform support? My memory of that issue is that it's because we can't tell when a platform promise is forwarded to another promise. If it's not that, then it still does not ring a bell. Curious!

michaelfig commented 3 years ago

Is this the same issue as why we can't get the ordering correct without platform support? My memory of that issue is that it's because we can't tell when a platform promise is forwarded to another promise.

That's probably what I was confusing needing .then hooks with.

warner commented 3 years ago

At one point I was interested in sensing .then so vats could avoid doing syscall.subscribe(). If a vat does E( E(x).foo() ).bar(), then it doesn't care about the resolution of foo(), it just wants to pipeline bar to it. That would remove a dispatch.notify delivery to this vat.

The syscall API still has room for this: liveslots automatically does subscribe on every exported promise, but the kernel is all set to do less work if liveslots stopped doing that.

warner commented 3 years ago

Today's metering meeting (recorded) examined the idea that each message has a Meter associated with it, and that delivery fees would be deducted from this meter. Instead of forwarding a number of tokens from input to output, the output messages would inherit the inbound delivery's meter. There would be limitations placed on the vat's ability to use this inheritance: vattp/comms/zoe would be allowed to inherit the meter, but not contract vats, so contract vats must pay for their own outbound messages. Message deliveries would deduct some amount from the message's Meter based on the size of the message, and then deduct from the vat's Meter based on computron usage, syscalls, and the space usage of the vat code.

Afterwards, @michaelfig and I sketched out an alternative approach:

The nice thing about forwarding tokens, rather than a Meter from which tokens could be deducted, is that it puts a tighter bound on how much the original sender (user) can be charged (the user always spends the same, predictable amount: a constant function of the current price and the size of the message). And if the user doesn't have enough to cover the fee, we find out about it much earlier (in cosmic-swingset). And the initial fee can be determined entirely by information available in the mempool, so block proposers could avoid including those messages in blocks in the first place, reducing the amount of wasted work and burned fee tokens.

The API that comes out of this is something like:

michaelfig commented 3 years ago

The main accomplishment I understood from this conversation with @warner is that we could have an initial plan for spam prevention to get to Mainnet phase 1, and refine the plan as we go further.

With a minimal amount of cosmic-swingset work (#3752), we can achieve the bulk of the client spam-prevention benefits by charging the same computed fee A + Bx + Cy, where x is the message size (a proxy for allocation and processing time) and y is the number of slots (a proxy for clist entries), and allowing A, B, and C to be tunable by cosmos-level governance.
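For concreteness, a sketch of that per-message fee computation (parameter values are placeholders to be set by governance):

// Hypothetical sketch of the per-message fee A + B*x + C*y, where x is the message
// size in bytes and y is the number of slots; A, B, C are governance parameters.
const feeParams = { A: 10n, B: 1n, C: 20n }; // placeholder values

function messageFee(msgSizeBytes, numSlots) {
  const { A, B, C } = feeParams;
  return A + B * BigInt(msgSizeBytes) + C * BigInt(numSlots);
}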

I think it's important to add that SwingSet doesn't need to be aware of or do anything with this fee until at least Mainnet phase 2 (non-Agoric contracts). For now, we could distribute the collected mailbox fees to stakers, as we already do with Cosmos-level, Zoe, Treasury, and AMM fees.

A lot of the other discussion that has happened so far is valuable in terms of suggestions to offer more granular support for balancing fairness between clients, service operators, and the stakers. Since in the mid-term, Agoric is the only service operator, that puts fewer constraints on proposing a relatively simple starting point that can evolve.