Agoric / agoric-sdk

monorepo for the Agoric Javascript smart contract platform
Apache License 2.0
328 stars 206 forks source link

perf investigation: KREAd, ERTP, NFT Purses, Zoe #8862

Open warner opened 9 months ago

warner commented 9 months ago

Summary: ERTP's large-cardinality NFT Amount handling is O(N) expensive, and KREAd triggers it

I spent some time investigating the mainnet slowdown on tuesday 23-jan-2024, specifically the KREAd MintCharacter wallet action in block 13,499,082 . The packages/SwingSet/misc-tools/monitor-slog-block-time.py, when fed the slogfiles of a follower (in this case my follower named agreeable), prints one line per block, with inter-blockTime and some swingset data:

  13499080    6( 8.9)   9.867 ->    0( 69.5)          0  0.002( 1.638)  +   0.004 =   0% ( 9.6) [ 75/  0/ 75] r0 (CFC45F)
  13499081    9( 9.1)   6.866 ->    0( 69.2)          0  0.001( 1.138)  +   0.003 =   0% ( 7.4) [ 97/  2/ 99] r0 (5F329F)
  13499082    6( 8.0)  11.994 ->  674(100.5)  194136238 15.471( 1.256)  +   0.003 =  58% ( 7.4) [ 99/  0/ 99] r0 (E29E61)
  13499083   11( 7.6)  30.321 ->    4( 88.3)  137700412 10.916( 1.681)  +   0.002 =  44% ( 8.8) [ 66/  0/ 66] r1 (E29E61)
-    block  blockTime    lag  -> cranks(avg) computrons  swingset(avg)  +  cosmos = proc% (avg) [sigs2/3/tot] rounds proposer
  13499084   29( 7.7)  21.861 ->   50( 90.8)  193691028 14.165( 2.389)  +   0.004 =  59% (11.8) [ 75/  0/ 75] r1 (23E1B0)
  13499085   21( 8.4)  29.981 ->  292(105.4)   13829768  0.986( 2.439)  +   0.004 =   6% (12.1) [ 75/  1/ 76] r0 (BB5E2C)
  13499086   29( 9.6)   4.392 ->    0(105.4)          0  0.001( 2.439)  +   0.002 =   0% (12.1) [ 78/  0/ 78] r1 (BB5E2C)
  13499087    3( 9.4)   7.504 ->    0(105.1)          0  0.001( 2.438)  +   0.002 =   0% (12.1) [ 77/  1/ 78] r0 (996453)

which indicates that we had elevated block times (11s/29s/21s/29s, instead of the usual 6s), four non-empty blocks in a row, and about 25 validators fell behind (they were unable to vote in time to be counted). It took about ten blocks for them to catch up.

Slogfile Investigation

Feeding the same slogfile into my classify-runs.py tool (which will eventually land in packages/SwingSet/misc-tools/) summarizes this action as:

{"classification": "kread-mint-character", "runids": ["b13499082-r5", "b13499083-r0", "b13499084-r0", "b13499085-r0"], "computrons": 493433524, "elapsed": 37.98}

This action required a total of 493,433,524 computrons, which exceeds our currently configured maximum of 65Mc/block, so it spilled over into multiple blocks. The action started as the 6th run (r5) of the first block, consumed the second and third entirely, and finished in the fourth. The four runs combined required 37.98 seconds of SwingSet computation time (on the follower from which I took this sample). The four blocks, including actions that shared the first and last blocks, looked like:

action runID computrons seconds
push-price b13499082-r1 15,282,850 1.0311628211056814
push-price b13499082-r2 15,266,286 1.2485291080083698
push-price b13499082-r3 15,254,469 1.2618149660993367
kread-mint-character b13499082-r5 148,328,225 11.928104420076124
kread-mint-character-continuation b13499083-r0 137,700,412 10.915490992018022
kread-mint-character-continuation b13499084-r0 193,691,028 14.165184138924815
kread-mint-character-continuation b13499085-r0 13,713,859 0.9705590480007231
vbank-balance-update b13499085-r2 115,909 0.01444947812706232

I used the same tool to scan for all kread-mint-character actions in the chain's history, and noticed that the number of computrons is growing nearly linearly with the number of characters minted:

kread-mint execution computrons by event

classify-runs.py also extracts the slogfile for the four combined runs. From this I found that the first block was ended by a 137Mc mintAndEscrow() delivery to v9-zoe (crankNum c44159622), causing the block to finish with a total of 148,328,225 computrons (i.e. it spend about 11Mc in smaller deliveries first, then the large mintAndEscrow() finished, then the runPolicy scheduler observed the computron limit was exceeded and ended the block, leaving the remaining work for the next one). On my follower, this delivery took 10.95 seconds.

The second block was ended by c44159626, which was another 137Mc mintAndEscrow(), taking 10.84 seconds. The third was ended by c44159676 which was a 153Mc exitSeat() to v9-zoe that took 12.03 seconds.

I fed these three deliveries into my find-slowness.py tool to identify gaps between slog events that exceed one second. This reveals bursts of computation that happen between the vat worker's receipt of the results of one syscall, and its emission of the next syscall. We can use this to deduce where acute CPU effort is being spent. It would not help us identify chronic CPU usage (small delays interspersed over a large number of syscalls), however none of these deliveries had a significant number of syscalls: they had 113/113/116 syscalls respectively.

The mintAndEscrow() deliveries both had gaps of 2.5s, 5.0s, and 3.2s , the first of which occurred immediately after a vatstoreGet() of a VOM state record (v9 o+d391/2) that was unusually large, about 200kB. This virtual/durable object turns out to be a Purse, of the KREAd CHARACTER Brand, whose Amount contained 380 separate records. So this purse contains 380 NFTs, and our code to manipulate these sorts of balances turns out to be expensive, and also O(N) in the number of records/characters contained therein. The act of minting a character needs to interact with this large Purse several times (described below), and the elevated computron and elapsed time consumed are a result of that expense. The surprisingly expensive operations include:

Building a Benchmark

To investigate further, I adapted the KREAd test-minting.js unit test into a benchmark tool. I removed the error-case-testing portions and stripped out everything but the minting of a new character. I also increased the number of BaseCharacters (since each mint operation consumes one), and added a configurable loop to mint multiple ones. I added a bunch of console.time()/timeEnd() instrumentation, and enabled V8's profiling during the final loop, so we could get a flamegraph of most expensive pass.

diff --git a/agoric/contract/test/bootstrap.js b/agoric/contract/test/bootstrap.js
index 6607e265..08db083a 100644
--- a/agoric/contract/test/bootstrap.js
+++ b/agoric/contract/test/bootstrap.js
@@ -76,6 +76,16 @@ export const bootstrapContext = async (conf = undefined) => {
     minUncommonRating: 20,
   };

+  if (conf?.COUNT) {
+    const { COUNT } = conf;
+    const firstChar = defaultCharacters[0];
+    for (let i=0; i<COUNT-1; i++) {
+      const newChar = [firstChar[0]+i+1, {...firstChar[1]}];
+      defaultCharacters.push(newChar);
+    }
+    assert.equal(defaultCharacters.length, COUNT);
+  }
+
   // Start governed contract instance
   const { governorFacets } = await setUpGovernedContract(
     zoe,

To get results that are closer to how XS/xsnap would work, I made some additional changes:

--- /home/node/trees/agoric-sdk/packages/internal/src/storage-test-utils.js 2024-01-22 18:49:18.302358483 +0000
+++ ../node_modules/@agoric/internal/src/storage-test-utils.js  2024-02-01 23:20:29.845248835 +0000
@@ -206,8 +216,8 @@
 harden(makeFakeStorageKit);
 /** @typedef {ReturnType< typeof makeFakeStorageKit>} FakeStorageKit */

-export const makeMockChainStorageRoot = () => {
-  const { rootNode, data } = makeFakeStorageKit('mockChainStorageRoot');
+export const makeMockChainStorageRoot = (options={}) => {
+  const { rootNode, data } = makeFakeStorageKit('mockChainStorageRoot', options);
   return Far('mockChainStorage', {
     ...bindAllMethods(rootNode),
     /**

so I could have the KREAd test bootstrapjs override the default of sequence: true:

diff --git a/agoric/contract/test/bootstrap.js b/agoric/contract/test/bootstrap.js
index 6607e265..08db083a 100644
--- a/agoric/contract/test/bootstrap.js
+++ b/agoric/contract/test/bootstrap.js
@@ -55,7 +55,7 @@ export const bootstrapContext = async (conf = undefined) => {
   const installation = await E(zoe).install(contractBundle);
   const privateArgs = {
     powers: {
-      storageNode: makeMockChainStorageRoot().makeChildNode('thisElectorate'),
+      storageNode: makeMockChainStorageRoot({ sequence: false }).makeChildNode('thisElectorate'),
       marshaller: makeFakeBoard().getReadonlyMarshaller(),
     },
     clock: timerService.getClock(),

because sequence: true turns the O(1) chain-storage set() calls into O(N) append() ones. The real chain only remembers the latest value for each chain-storage cell, whereas storage-test-util.js was designed to support unit test cases that benefit from seeing all historical items too.

Both the KREAd code and agoric-sdk utilities use non-immediately-awaited Promises, which makes CPU profiling more difficult: if A triggers an expensive B, but does not await it before continuing on to C, the flamegraph may make it look like C is the source of the expense, not A+B. So I inserted a bunch of await calls which aren't strictly necessary for correctness, but helped tremendously for performance analysis. I also inserted stall() calls (which is just a loop that does await null twenty times in a row) in places where I suspected remaining overlap (code which doesn't await at all, where a synchronous function fires off an async call and doesn't give the caller anything to synchronize upon).

Note that liveslots flushes the VOM state-data cache at the end of every crank, to reduce GC-influenced nondeterminism. This requires us to re-load and re-unserialize the state data of any virtual object the first time it is referenced in any crank. In contract, the fakeLiveslots environment never ends a crank, so it never flushes this cache, so it never needs to unserialize state objects. This means the V8 benchmark run will lack some expensive deserialization time which actually exists on the mainnet (liveslots/XS) slog traces.

Finally, the V8 run has an expensive initial call to getRandomBaseIndex, which is an artifact of how the fakeLiveslots environment is built. This call produces a list of all 800-ish BaseCharacter keys, in order, so it can select a pseudo-random one (although, really, the keys are integers, so it ought to be able to do this in O(1) without enumeration). The enumeration performs a series of vatstoreGetNextKey() calls. On the real chain, where swing-store is backed by SQLite, this uses a normal DB index and is fast. In the fake environment, the vatstore is backed by a plain Map (which, being JS, enumerates in insertion order), plus a side Array to track the sorted order. The act of re-sorting that side array (too much) cause a perf hit that does not represent what the real chain would do. So we ignore that initial burst of activity when looking at the data.

Benchmark Results

The resulting flamegraph, for a single mint operation, looks like this:

screenshot annotated9

Eventually, I was able to correlate the long CPU times in the (V8 benchmark) flamegraph with the long gaps in the (real-chain XS) slogfile record, and deduce the sequence of events that took place. As a rough measure,

The real-chain XS operations (with 380 characters in the pooled purse) took about 95x the time needed by the V8 benchmark (with 800 characters). Compensating for the 380-vs-800, that makes the chain about 200x slower than the benchmark. This is a combination of my benchmark machine being faster than my follower, the XS engine being slower than V8 (no JIT), syscall overhead, and work done by XS that was not done by the benchmark (like deserializing the large purse on every crank).

How KREAd Works

First, some background on how the KREAd contract operates (as reverse-engineered by me; I may have it entirely wrong). It manages "Characters" and "Items", which can be owned and traded by users. Characters can also be "equipped" with Items, basically causing the item to be owned-by / associated-with the character, rather than a user. There are a limited number of Characters that can be minted (about 800 I think), of which about 380 have been minted so far. Minting a character costs a fee, which is split into "royalty" and "platform fee", and deposited into two depositFacets. The newly-minted character is automatically equipped with three Items. The equipped Items of each Character are held in that character's "Inventory".

At startup, the KREAd contract asks ZCF to create a zcfMint for two denominations (KREAd CHARACTER Brand and KREAd ITEM Brand). This causes v9-zoe to creates an ERTP Issuer for each, with a paymentLedger (mapping empty Payment objects to their balance), a recoverySet for those payments, and a mintRecoveryPurse to hold the recovery set for the mint-generated Payment objects. We can use the syscalls that manipulate these tables as a way to trace the execution inside v9-zoe from just the slog events.

Internally, each character's inventory is managed by a separate inventorySeat: a regular Zoe/ZCF Seat. This seat is created for each character when the character is minted, held internally by the contract (in a table indexed by the character name), and never exited or deleted.

Equipping an Item causes the item to be allocated to the inventorySeat, just like how fees are allocated to their recipients. ZCF/Zoe know about all the current allocations, but holds the assets in question in a zoe-side escrow purse until the seat is exited. Zoe creates a single shared localPooledPurse for each denomination, and merges all compatible amounts into that same purse, just like a regular bank pools all fungible customer deposits into a single vault and just uses record-keeping to track who owns what.

Users are only allowed to equip items they own, to characters they own. To demonstrate this authority safely, and to work around some limitations of Zoe (no wash trades), KREAd uses an unusual pattern. When initially minting the character, it actually creates two character Amounts, which I'll call the A character and the B character (think of this like the main actor and their nearly-interchangeable stunt double, or the two buffers in a double-buffered graphics rendering pipeline). These Amounts differ in only a single field: A gets { keyID: 1 } while B gets { keyID: 2 }. The user gets back A: it is allocated to the seat that triggered the mint() handler, which will be exited by the contract, and then the user will call getPayouts() and will get a Payment for character A, which they'll deposit into their own Purse. But character B is allocated to the newly-created inventorySeat, along with the three newly-minted Items. It sits there for an indefinite time (until an equip or unequip action occurs).

The protocol for equipping an item to a character requires the user to determine which A/B character Amount the user holds, vs which one is in the inventorySeat. If the user holds A, then the offer they submit looks like:

The contract the tells ZCF to transfer the inventorySeat's charB back to the user, and to transfer charA and the item into the inventorySeat. If the user held B, then these are reversed: the user gives character B and the item, and wants character A.

So now that 380 Characters have been minted on the chain, there are 380 "A" amounts, and 380 "B" amounts. For each pair, one of the two is owned by some user purse (held only by vat-wallet), and the other is allocated to a unique inventorySeat (held in the character table), but owned by the localPooledPurse in v9-zoe (held on behalf of the contract, to escrow all assets currently allocated to all open seats).

Note: if Zoe allowed wash trades (which we considered, but decided against because it's more likely to be an error than a legitimate use), the protocol could instead be "give A+Item, want A", and KREAd wouldn't need this A-vs-B duplication. Or, if we had better notions of "use objects" (eg @erights' recent "owned amounts" work), then equip() would be a method on an object only available to the current owner of the character, and perhaps wouldn't need a Zoe offer to execute.

Performance Issue

Anyways, the performance issue arises from these B characters and where they are held. The collection of open inventorySeats hold on to 380 character records, all of which are held by a single zoe-side localPooledPurse.

Zoe's escrow service was designed to be short-lived (as were seats in general: it makes sense to hold a seat open until a trade/offer is fulfilled, but they aren't meant as a long-term storage tool). And it was optimized for a moderate amount of things being escrowed. Zoe merges all escrowed assets of the same Issuer in a single localPooledPurse. For fungible tokens, this is an obvious performance win: adding two IST or BLD amounts is trivial, and the result requires only a single variable to track.

However for non-fungible tokens, the ERTP Amounts are copyBags. These nominally behave like a Map whose keys are character records and whose values are integers (i.e. 2 apples and 3 oranges gets you { apple: 2, orange: 3 }. (If the amounts are unique, it could use a copySet, which is like a copyBag where the values are always 1, but it wouldn't help here). The AmountMath code provides the same operations for these Amounts as it does with plain numbers: you can add two amounts, subtract them, query isGTE, etc. But these operations are a lot more complex than adding/subtracting/comparing numbers: they must do Set math, where "add" is really a union, and "isGTE" is really a form of intersection. So they take significantly longer, and this expense grows with the number of items in the Amount.

So when Zoe is told to escrow a newly-minted A character and allocate it to the user's seat, it must add this value into the existing localPooledPurse for Characters, which already has a B character for every Character that's ever been minted (380 on mainnet). In my benchmark run, on V8, when there are already 800 characters in this purse, it takes 70ms to perform this add.

Much of this time is spent sorting the entries into a stable order, and asserting that they are all of the right type, to meet the requirements of a copyBag (which exists because a normal Map cannot be serialized into durable storage, or used as a key in a table). There are a tremendous number of passStyleOf() calls in this portion of the flamegraph.

In addition, the subsequent updatePurseBalance requires Zoe to change the state.currentBalance of the localPooledPurse virtual object, which requires serializing this many-valued Amount. In the benchmark run, this needs 35ms just to check that the value matches the declared stateShape for the property in question, and then another 10ms to perform the serialization (our userspace marshalling code is not nearly as fast as a native JSON implementation).

All of this work is what consumes 137Mc in the zoe mintAndEscrow() delivery, and ends the first block. In the V8 benchmark run, we can see the mintAndEscrow() function takes 118ms, out of which add takes 70ms and updatePurseBalance takes 45ms (leaving 3ms for other small stuff).

mintAndEscrow

In the mainnet slog record, we cannot directly tell which function is being called when, but from the syscalls we can deduce some dividing points and correlations: The total delivery time for the first mintAndEscrow (the last delivery of the first block) was 10.051s, which is broken down into three big parts:

It must do this all again for the B character, taking about the same amount of time, which ends the second block. At this point, both A and B characters are in the pooled purse.

The third block is ended by the zoe.exitSeat() method spending similar time withdrawing the A character from the pooled purse, so it can be transferred to a Payment for the user. This does a subtract instead of an add, preceded by an isGTE() check that is not necessary in the deposit case.

withdraw

The fourth block contains the remaining work, like depositing the royalty/platform fees, which are fast because IST/BLD/etc is entirely fungible. On chain, this took just under a second (b13499085-r0 took 13.7Mc and 970ms).

(The V8 benchmark flamegraph has an artifact, the long deposit section at the end, which is where the benchmark test is depositing the A character into the test user's purse. This is slow because the test only has a single user, so that Purse already contains 800 characters. On chain, most users have only one character each, so their purses are small, and when the wallet deposits the character in the user's private Purse, the time to add/subtract/etc is insignificant).

Directions To Pursue

This leads to a number of interesting questions, and ways we might improve the situation. First, the concerns:

Then, potential fixes we might pursue.

And ways in which this should inform other efforts:

cc @Chris-Hibbert @erights @dckc

dckc commented 9 months ago
  • ERTP: replace the Purse with an NFT-specialized one that uses a MapStore ...
    • requires some sort of upgrade-Purse-in-place remediation scheme to be useful for the existing KREAd contract

I think it's plausible that we can make new purses and transfer the assets, without any in-place-upgrade magic.

Purses tend to be closely held. I think the walletFactory is the only importer of any purses:

earlier discussion: https://github.com/Agoric/agoric-sdk/pull/8861#issuecomment-1925409909

warner commented 9 months ago

https://github.com/agoric-labs/KREAd/tree/warner/8862-benchmarking has my benchmarking code in the last commit. cc @erights

warner commented 9 months ago
  • ERTP: replace the Purse with an NFT-specialized one that uses a MapStore ...

    • requires some sort of upgrade-Purse-in-place remediation scheme to be useful for the existing KREAd contract

I think it's plausible that we can make new purses and transfer the assets, without any in-place-upgrade magic.

Purses tend to be closely held. I think the walletFactory is the only importer of any purses:

  • vbank purses are fungible and short-lived, created afresh for each offer.
  • invitation purses and KREAd purses: walletFactory could voluntarily make new ones and transfer the assets.

earlier discussion: #8861 (comment)

Yeah, I think the only Purse with a non-trivial size is Zoe's internal localPooledPurse, which holds all the Characters that are escrowed for open seats. Because this issuer is a zcfMint, this Purse lives in vat-zoe, not vat-kread, and is never exposed outside that vat (although it's probably held in zoe's baggage too, for benefit of upgrade).

If the ERTP change is to implement a new more-efficient Kind for new Purses (noticing the AmountMath is non-fungible and using a MapStore for state.currentBalance), then on each upgrade (and for each zcfMint issuer), Zoe could ask ERTP for a new empty Purse, get one of the latest Kind, and drain the old one into it. O(numIssuers) work on each Zoe upgrade.

A wallet user who acquires a few hundred characters (or Items; I haven't investigated them very much) could have the same problem.. I'm not sure where to stand to have the wallet decide that it should ask the zoe-based Issuer for a replacement Purse.

dckc commented 9 months ago

I'm not sure where to stand to have the wallet decide that it should ask the zoe-based Issuer for a replacement Purse.

in an upgrade of walletFactory, during the 1st crank, much like...

https://github.com/Agoric/agoric-sdk/blob/121b67747aaa842adff618c610dbb0835d88cc33/packages/smart-wallet/src/walletFactory.js#L266-L269

Chris-Hibbert commented 9 months ago

If there's a desire to, we could make the KREAd contract a lot more performant.

We need to make large NFT purses a lot more efficient. The case of the Zoe escrow purses is particularly concerning