Suspected liveslots sensitivity to engine allocations

mhofman commented 1 year ago

Describe the bug

While validating an upgrade to Moddable SDK 3.6.0 from the fork we've been running, I attempted to validate that an upgraded XS version could still execute transcripts of mainnet vats.

Liveslots is supposed to hide any gc effect from the user code, and hide organic gc from the kernel (as captured by transcripts), only revealing the result of a full gc in a deterministic manner when explicitly requested by bringOutYourDead. As such if the JavaScript behavior of the user program is the same, the transcript should be the same.

While we're aware of and are able to work around a gc sensitivity in virtual collections, something is causing other execution differences.

I've narrowed down the triggers of those differences to 5 commits in the Moddable SDK, which have all been confirmed to have an impact on allocations.

To Reproduce

Download and extract:
Checkout mhofman/update-xs-sdk-profiler-modulesource-fuzzilli38-reverts
- This is a branch of release-pismo with the Moddable SDK updated to 3.6.0 and select reverts on top to make all transcripts run as they initially did.
undo any of the reverts from the next paragraph in the moddable submodule to trigger the corresponding error. Please note that yarn build in packages/xsnap automatically restores the moddable submodule commit, unless it's added to the index. In packages/xsnap/moddable:
- git reset --hard mhofman/sync-sdk-keep-behavior
- git revert $commitsha
- (cd .. && git add moddable && yarn build)
Execute /path/to/agoric-sdk/packages/SwingSet/misc-tools/replay-transcript.js --no-forced-reload-from-snapshot transcript-v$i.sst

Observed behaviors

Undo of Revert "XS: fuzzilli 38"

anachrophobia strikes vat v24
expected: {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":180,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
during transcript num= 1547 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v25
expected: {"0":"vatstoreGet","1":"vom.o+10/3","length":2}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/409","length":2}
during transcript num= 3821 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v26
expected: {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
got     : {"0":"vatstoreSet","1":"vom.o+11/63","2":"{\"proposal\":{\"body\":\"{\\\"exit\\\":{\\\"onDemand\\\":null},\\\"give\\\":{\\\"In\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDT_axl brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"1000000\\\"}}},\\\"want\\\":{\\\"Out\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: IST brand\\\",\\\"index\\\":1},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"1000000\\\"}}}}\",\"slots\":[\"o-59\",\"o-60\"]}}","length":3}
during transcript num= 596 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v27
expected: {"0":"vatstoreSet","1":"vom.o+11/317","2":"{\"proposal\":{\"body\":\"{\\\"exit\\\":{\\\"onDemand\\\":null},\\\"give\\\":{\\\"In\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDT_grv brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"384758483\\\"}}},\\\"want\\\":{\\\"Out\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: IST brand\\\",\\\"index\\\":1},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"384758483\\\"}}}}\",\"slots\":[\"o-59\",\"o-60\"]}}","length":3}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/1","length":2}
during transcript num= 2968 for worker PID XXXX (start delivery 0)

Undo of Partial revert "XS: StaticModuleRecord needsImport, needsImportMeta"

anachrophobia strikes vat v25
expected: {"0":"vatstoreGet","1":"vom.o+10/3","length":2}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/409","length":2}
during transcript num= 3821 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v26
expected: {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
got     : {"0":"vatstoreSet","1":"vom.o+11/63","2":"{\"proposal\":{\"body\":\"{\\\"exit\\\":{\\\"onDemand\\\":null},\\\"give\\\":{\\\"In\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDT_axl brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"1000000\\\"}}},\\\"want\\\":{\\\"Out\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: IST brand\\\",\\\"index\\\":1},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"1000000\\\"}}}}\",\"slots\":[\"o-59\",\"o-60\"]}}","length":3}
during transcript num= 596 for worker PID XXXX (start delivery 0)

Undo of Revert "XS: compartment proposal update"

anachrophobia strikes vat v24
expected: {"0":"vatstoreGet","1":"vom.o+10/2","length":2}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
during transcript num= 7903 for worker PID XXXX (start delivery 0)

Undo of both ModuleSource / VirtualModuleRecord related reverts, aka checkout https://github.com/agoric-labs/moddable/commit/3a2c047f953b6b929295efa78cba8064d8cd5da3

anachrophobia strikes vat v26
expected: {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
got     : {"0":"vatstoreSet","1":"vom.o+11/63","2":"{\"proposal\":{\"body\":\"{\\\"exit\\\":{\\\"onDemand\\\":null},\\\"give\\\":{\\\"In\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDT_axl brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"1000000\\\"}}},\\\"want\\\":{\\\"Out\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: IST brand\\\",\\\"index\\\":1},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"1000000\\\"}}}}\",\"slots\":[\"o-59\",\"o-60\"]}}","length":3}
during transcript num= 596 for worker PID XXXX (start delivery 0)

Undo of Partial revert "XS: profiler Linux Windows"

anachrophobia strikes vat v10
expected: {"0":"vatstoreSet","1":"vom.o+16/74","2":"{}","length":3}
got     : {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDC_axl payment\",\"nextInstanceID\":77,\"unfaceted\":true}","length":3}
during transcript num= 229 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v11
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDC_grv payment\",\"nextInstanceID\":119,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+17/100","2":"{\"currentBalance\":{\"body\":\"{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDC_grv brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"0\\\"}}\",\"slots\":[\"o+15/1\"]},\"recoverySet\":{\"body\":\"{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: setStore\\\",\\\"index\\\":0}\",\"slots\":[\"o+8/115\"]}}","length":3}
during transcript num= 693 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v12
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDT_axl payment\",\"nextInstanceID\":18,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+16/16","2":"{}","length":3}
during transcript num= 322 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v13
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDT_grv payment\",\"nextInstanceID\":97,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+17/134","2":"{\"currentBalance\":{\"body\":\"{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDT_grv brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"0\\\"}}\",\"slots\":[\"o+15/1\"]},\"recoverySet\":{\"body\":\"{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: setStore\\\",\\\"index\\\":0}\",\"slots\":[\"o+8/149\"]}}","length":3}
during transcript num= 547 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v24
expected: {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
got     : {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":16,\"unfaceted\":true}","length":3}
during transcript num= 154 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v25
expected: {"0":"vatstoreGet","1":"vom.o+10/2","length":2}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
during transcript num= 67 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v26
expected: {"0":"vatstoreGet","1":"vom.o+10/2","length":2}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
during transcript num= 79 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v27
expected: {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":13,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
during transcript num= 127 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v42
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"DAI_axl payment\",\"nextInstanceID\":5,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+16/3","2":"{}","length":3}
during transcript num= 853 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v46
expected: {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
got     : {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":10,\"unfaceted\":true}","length":3}
during transcript num= 112 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v47
expected: {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":20,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
during transcript num= 194 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v6
expected: {"0":"vatstoreGet","1":"vom.rc.o+17/1257","length":2}
got     : {"0":"vatstoreGet","1":"vom.rc.o+12/2","length":2}
during transcript num= 29548 for worker PID XXXX (start delivery 0)

Undo of Revert "XS profile ID by function definition"

anachrophobia strikes vat v10
expected: {"0":"vatstoreSet","1":"vom.o+16/59","2":"{}","length":3}
got     : {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDC_axl payment\",\"nextInstanceID\":62,\"unfaceted\":true}","length":3}
during transcript num= 194 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v11
expected: {"0":"vatstoreSet","1":"vom.o+16/119","2":"{}","length":3}
got     : {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDC_grv payment\",\"nextInstanceID\":122,\"unfaceted\":true}","length":3}
during transcript num= 700 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v12
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDT_axl payment\",\"nextInstanceID\":18,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+16/16","2":"{}","length":3}
during transcript num= 322 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v13
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"USDT_grv payment\",\"nextInstanceID\":40,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+17/100","2":"{\"currentBalance\":{\"body\":\"{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDT_grv brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"0\\\"}}\",\"slots\":[\"o+15/1\"]},\"recoverySet\":{\"body\":\"{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: setStore\\\",\\\"index\\\":0}\",\"slots\":[\"o+8/115\"]}}","length":3}
during transcript num= 324 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v24
expected: {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
got     : {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":16,\"unfaceted\":true}","length":3}
during transcript num= 154 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v25
expected: {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
got     : {"0":"vatstoreSet","1":"vom.o+11/11","2":"{\"proposal\":{\"body\":\"{\\\"exit\\\":{\\\"onDemand\\\":null},\\\"give\\\":{\\\"In\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: USDC_grv brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"4000000\\\"}}},\\\"want\\\":{\\\"Out\\\":{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: IST brand\\\",\\\"index\\\":1},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"4000000\\\"}}}}\",\"slots\":[\"o-59\",\"o-60\"]}}","length":3}
during transcript num= 107 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v26
expected: {"0":"vatstoreGet","1":"vom.o+10/2","length":2}
got     : {"0":"vatstoreGet","1":"vc.7.|o+11/2","length":2}
during transcript num= 79 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v27
expected: {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":13,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
during transcript num= 127 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v42
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"DAI_axl payment\",\"nextInstanceID\":5,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+16/3","2":"{}","length":3}
during transcript num= 853 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v43
expected: {"0":"vatstoreSet","1":"vom.dkind.16","2":"{\"kindID\":\"16\",\"tag\":\"DAI_grv payment\",\"nextInstanceID\":26,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreSet","1":"vom.o+17/415","2":"{\"currentBalance\":{\"body\":\"{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: DAI_grv brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"0\\\"}}\",\"slots\":[\"o+15/1\"]},\"recoverySet\":{\"body\":\"{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: setStore\\\",\\\"index\\\":0}\",\"slots\":[\"o+8/430\"]}}","length":3}
during transcript num= 972 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v46
expected: {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
got     : {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":10,\"unfaceted\":true}","length":3}
during transcript num= 112 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v47
expected: {"0":"vatstoreSet","1":"vom.dkind.12","2":"{\"kindID\":\"12\",\"tag\":\"InvitationHandle\",\"nextInstanceID\":20,\"unfaceted\":true}","length":3}
got     : {"0":"vatstoreGet","1":"vom.o+19/1","length":2}
during transcript num= 194 for worker PID XXXX (start delivery 0)
--
anachrophobia strikes vat v6
expected: {"0":"vatstoreGet","1":"vom.rc.o+12/3","length":2}
got     : {"0":"vatstoreGet","1":"vom.rc.o+17/275","length":2}
during transcript num= 12258 for worker PID XXXX (start delivery 0)

(partial results)

warner commented 1 year ago

I looked into a few of these, here are some preliminary notes.

The first one ("XS: fuzzilli 38", vat v24, tnum=1547) is a makeWantMintedInvitation to o+19/1, in which the "expected" behavior has syscalls which allocate a new InvitationHandle (instance of durable Kind 12), but the "got" is a state fetch for o+19/1 (the target object). This indicates that liveslots had no Representative or state for the target of the message (I guess the GC variance caused the Representative to get collected early), and when makeWantMintedInvitation needs the state, it had to be paged in.

@FUDCo and I considered this during design/implementation, but it looks like we missed something. Our intention was that the Representative can be built without doing a vatstore read (the Kind ID is parsed out of the vref), so simply delivering a message to a vobj shouldn't trigger a syscall (or divergence if the Rep is/is-not currently present in the slotToVal table's WeakRef). The state object is built at the same time, but we don't need the actual data until (and unless) the method being invoked tries to do a property lookup, so simple behavior-less marker objects wouldn't need a syscall either. Only when the setter/getter fires, that's when we fetch from the vatstore. And we made it to perform that fetch unconditionally (even if the data is available in our LRU cache), specifically to make the syscalls a deterministic function of userspace behavior (knowing that the reads were redundant: the cache doesn't help).

Hm, I wonder if the write-back cache is involved: if userspace does:

1: read
2: write
.. time passes, GC might happen
3: read

then the 3-read must use the dirty data from the cache, for correctness, but we thought we were doing an unconditional read from the vatstore, for determinism. I don't remember considering the conflict between those two, which means I'm not confident that it does the right thing.

(@FUDCo and I have talked about removing that LRU cache entirely, or making it write-through, in #6693, partly because of how it confuses issues like this)

The last one ("XS profile ID by function definition", vat v6, tnum=12258) is a BringOutYourDead, which (in the transcripts provided) starts with a pair of vatstoreSet calls that flush the dirty contents of the virtual-object LRU cache.

Basically the vatstoreSet of a vom.o+NN/NN key is a LRU cache write, the vatstoreSet of a vom.dking.NN key is incrementing the nextInstanceID counter because a new virtual/durable object has just been allocated, and the vatstoreGet of a vom.rc.o+NN/NN key is checking the virtual reference counts for an object that has presumably just lost its RAM pillar and GC is deciding whether to delete it entirely.

I concur that we have more sensitivity to GC order than we first thought (i.e. #6360 and #6760 do not form an exhaustive list).

mhofman commented 1 year ago

Not sure if it can help, but I have generated new vat transcripts replaying chain activity using an updated version of XS (with different allocation behavior). Here is the diff vs the vat transcripts for the pismoC version of XS (as if that fixed version had been used from agd upgrade). I have the full transcripts if needed as well.

transcript-fixed-vs-updated-diff.tgz

Generated using

for i in {1..54}; do diff -U 5 <(zcat replay-fixed/transcript-v$i.sst.gz | jq -r .) <(cat replay-updated-xs-from-transcript/transcript-v$i.sst | head -n $(zcat replay-fixed/transcript-v$i.sst.gz | wc -l) | jq -r .) > transcript-fixed-vs-updated-v$i.diff ; done

warner commented 1 year ago

We'll use this ticket to track the sensitivies we discover, but we may or may not be able to fix all of them in time for the bulldozer release.

mhofman commented 1 year ago

Update: Moddable landed a patch for us which considers WeakRef strong during organic GC and only empties them during forced GC (which happens either during BYOD or snapshot making, which are both in consensus operations).

That effectively hides organic GC from liveslots, and thus from the kernel. That patch seem to have removed all the anachrophobia reported above. Testing methodology: new vat transcripts generated using chain-replay tool on pismoC + WeakRef GC patch cherry picked (see mhofman/6784-hide-organic-gc-pismo), replay transcripts on latest XS which includes the same WeakRef GC patch (see mhofman/6784-hide-organic-gc-updated-xs).

This should allow us to confidently update XS versions which contain allocation differences without triggering replay issues during node restart. Of course we should still fix the underlying issues in liveslots so we can disable this patch.

mhofman commented 1 year ago

~~One possible manifestation may be in https://github.com/Agoric/agoric-sdk/issues/7078, even though that issue seems fully non-deterministic, or at least a race between BOYD and vat termination~~

Edit: that issue was happening in a node based worker, explaining the full non-determinism.

warner commented 2 weeks ago

Not sure if it's relevant, but leaving a pointer just in case: I found that liveslots scanForDeadObjects() was not sorting a list of finalizer-gathered vrefs (the possiblyDeadSet) before processing it. That caused the order of refcount-checking syscall.vatstoreGet done during BOYD to depend on the order in which the engine decided to invoke the finalizer, which is something we had intended to hide. PR #9961 will fix the misbehavior by sorting the list of vrefs before doing the refcount checks.

Agoric / agoric-sdk