
state-sync import does not repopulate historical transcript-span metadata #8025

Closed warner closed 1 year ago

warner commented 1 year ago

@mhofman did an experiment today, creating a state-sync snapshot export from a mainnet1B chain node (call it node A), then using that snapshot to populate a new node B. He then compared the contents of a full swing-store export (not a state-sync snapshot export) of node A against a similar export of node B.

He discovered that node B's export was missing most of the historical transcript-span metadata records (the ones that contain hashes of transcript entries). We were intending to keep these around in all nodes, even if we delete the spans themselves (the transcript entries, with deliveries and syscalls). With knowledge of the hashes, we could safely repopulate the span contents from downloaded/untrusted sources.

The IAVL tree holds the original export data (the "shadow table"), which includes all these records, so it's not that the data is missing entirely. It's just that it is only in the IAVL tree, not also in the swingstore sqlite DB where it's supposed to live (and where it would be useful to the maybe-some-day repopulation process).
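
For concreteness, export-data records are simple key/value pairs whose key identifies the owning subsystem. The shapes below are illustrative (recalled from the swing-store export format, so treat the exact fields as an assumption):

// Illustrative export-data pairs; exact value fields may differ.
const exportDataPairs = [
  // kvStore shadow entry: key prefixed with 'kv.'
  ['kv.somekey', 'somevalue'],
  // transcript-span metadata: JSON that includes the span's hash
  ['transcript.v1.0', '{"vatID":"v1","startPos":0,"endPos":137,"hash":"..."}'],
  // heap-snapshot metadata
  ['snapshot.v1.137', '{"vatID":"v1","snapPos":137,"hash":"..."}'],
];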

We aren't sure where things went wrong. The swingstore importer has several different modes, and tests to exercise them all. But we haven't had much luck automating tests that involve creating a chain, and the validation of state-sync was mostly concerned with making sure the newly-created node was working properly (and these historical records aren't needed for normal operation, so most tests wouldn't notice them going missing). It's possible that the chain-side state-sync export process is somehow missing them, but @mhofman's investigation suggests that it's more likely that swingstore is failing to add the incoming records to the DB, or failing to commit them, or something similar.

So the tasks for this ticket are:

  • figure out where in the export/import pipeline the metadata records are being lost
  • fix the swingstore importer so that every incoming metadata record is stored and committed
  • repopulate the missing metadata on nodes that were launched from a state-sync snapshot

For the last task, we're thinking of adding a swingstore method with a name like fixMetadata or repopulateExportData. This method would either consume an iterator of export-data key/value pairs, or it would return an object that expects to be driven with a series of these pairs (plus a stop signal). The code would examine the pairs to ignore any that relate to the kvStore table: we do not want to touch those. But for pairs that relate to the snapshots or transcriptSpans tables, it should populate any missing entries. I think it should compare the incoming export-data pair against the sqlite row, bail if the data is already in sqlite but is somehow different (the tool should be strictly adding data, not modifying existing rows), ignore the pair if the data is already in sqlite and matches, and add the data to sqlite if it's missing entirely. Then the method should do a commit, or make it easy for the caller to do so.
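
A minimal sketch of the second variant (an object driven with pairs), assuming better-sqlite3 (which swingstore uses); the factory name and the table/column names are illustrative, not the real schema:

import Database from 'better-sqlite3';

// Sketch only: makeMetadataFixer, table, and column names are illustrative.
export function makeMetadataFixer(dbPath) {
  const db = new Database(dbPath);
  const getSpan = db.prepare(
    'SELECT metadata FROM transcriptSpans WHERE vatID=? AND startPos=?',
  );
  const addSpan = db.prepare(
    'INSERT INTO transcriptSpans (vatID, startPos, metadata) VALUES (?,?,?)',
  );
  return {
    consumePair(key, value) {
      const [tag, vatID, pos] = key.split('.');
      if (tag === 'kv') return; // never touch kvStore shadow entries
      if (tag === 'transcript') {
        const row = getSpan.get(vatID, Number(pos));
        if (row === undefined) {
          addSpan.run(vatID, Number(pos), value); // strictly additive
        } else if (row.metadata !== value) {
          // existing rows must match exactly: we add data, never modify it
          throw Error(`mismatched metadata for ${key}`);
        } // else: already present and matching, so ignore
      }
      // 'snapshot' pairs would get the analogous treatment
    },
    commit() {
      // better-sqlite3 autocommits individual statements; a real version
      // would wrap the whole run in db.transaction() and commit here
      db.close();
    },
  };
}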

The actual chain-side upgrade handler will then call this method, once, before the kernel controller is built (specifically before the swingstore is built). It will need to iterate the IAVL tree (the portion of the namespace dedicated to swingstore state-sync export data), feeding everything into the fixMetadata method, and ensuring it gets committed. Once complete, the swingstore should be intact again.
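
With the sketch above, the upgrade handler's side is little more than a loop (iavlExportDataPairs is a hypothetical iterable over the swingstore portion of the IAVL namespace):

const fixer = makeMetadataFixer(dbPath);
for (const [key, value] of iavlExportDataPairs) {
  fixer.consumePair(key, value);
}
fixer.commit(); // swingstore is intact again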

We probably don't want this fixMetadata process to emit new export-data to a callback (which might happen if we repopulated the records using a normal swingstore object), since the data will already be present in the IAVL side (the normal provider of that callback, and recipient of the updates).

warner commented 1 year ago

Also note that the state-sync import process works by creating a (large) temporary file and sending the filename from the golang side to the JS side, rather than sending all the actual export-data records and artifacts through that bridge. The repopulation process may be easier to implement if we re-use that same logic, at least on the golang side.

warner commented 1 year ago

On the export side, chain-main.js calls spawnSwingStoreExport without an exportMode, so the argv array it creates will lack an --export-mode=, so the same file (but in a child process) will get exportMode = undefined, so initiateSwingStoreExport will get the same, so makeExporter (which is really makeSwingStoreExporter) will default to exportMode='current', which includes all metadata (and merely omits the old artifacts).

So I believe the state-sync snapshots contain the expected metadata, but only current artifacts (no historical ones).

Later, the import side is called when chain-main.js calls performStateSyncImport without the includeHistorical option, so importDB (which is really importSwingStore) is called with includeHistorical: undefined, which importSwingStore defaults to false.

Here, the implementation of importSwingStore unfortunately depends too much on the includeHistorical mode. Each export-data (metadata) record is examined to see which subsystem owns it: kvStore records are stored immediately, but bundle/transcript/snapshot records are parsed and held in RAM for later:

https://github.com/Agoric/agoric-sdk/blob/6e5b422b80e47c4dac151404f43faea5ab41e9b0/packages/swing-store/src/swingStore.js#L1015-L1034

This artifactMetadata Map is never enumerated, and it is only consulted if (1) the importer knows about a vat (so it looks for the current transcript span, and possibly a heap snapshot), or (2) includeHistorical is true and the getArtifactNames() query yields a matching artifact name.

The metadata keys are given to the importer, but they're only stored in the DB if there is a matching artifact (which there isn't, because we did exportMode='current' instead of 'archival'), and if the importer is looking for them (which it isn't, because includeHistorical: false).
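
In rough paraphrase (not the verbatim code behind the link above), the problematic dispatch looks like this:

// Paraphrased: kvStore shadow entries are written immediately, while all
// other metadata is parked in RAM and only reaches the DB if something
// later asks for a matching artifact.
const artifactMetadata = new Map();
for await (const [key, value] of exporter.getExportData()) {
  const [tag] = key.split('.');
  if (tag === 'kv') {
    kvStore.set(key.substring('kv.'.length), value);
  } else {
    artifactMetadata.set(key, JSON.parse(value));
  }
}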

Fixing Import To Not Omit Metadata / Export-Data Keys

To fix this, we need to rewrite the importer to store every metadata key that it receives in the export-data, regardless of includeHistorical and/or the presence of a matching artifact. This will require some new code in snapStore.js, which only has a SQL query to insert a row with a full snapshot, which of course we do not have yet.

Then, after all the metadata is stored, the importer should walk the list of known vats, and import the current transcript span (and possibly a heap snapshot) for each, as it does now. These are mandatory artifacts, so this happens without calling getArtifactNames (and the import fails if any are missing).

After that, it should call getArtifactNames to find all the optional artifacts, skipping the ones it already imported, and validating the rest. This is the portion that includeHistorical: false could omit, but I'm not entirely convinced that this option is useful: I think the importer should take everything it's given, and leave any pruning up to the exporter.
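
Sketched end-to-end (the helper names here are hypothetical, just to show the ordering):

// 1: persist every metadata record unconditionally, as it arrives
for (const [key, metadata] of artifactMetadata.entries()) {
  storeMetadataRecord(key, metadata); // new inserts in snapStore/transcriptStore
}
// 2: mandatory artifacts: the current transcript span (and any current heap
// snapshot) for each known vat; the import fails if one is missing
for (const vatID of knownVats) {
  await importMandatoryArtifacts(vatID);
}
// 3: optional artifacts: whatever getArtifactNames() offers, minus the ones
// already imported, each validated against the stored metadata
for await (const name of exporter.getArtifactNames()) {
  if (!alreadyImported(name)) {
    await importAndValidateArtifact(name);
  }
}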

I have a new set of unit tests in mind: submit an explicit set of metadata and artifacts, then examine the swingstore afterwards to ensure that all the metadata was recorded, whether or not the optional artifacts were included.

Repopulating (Accidentally-Omitted) Metadata / Export-Data Keys

Once that is fixed, we need some new swingstore APIs to repair the mainnet chain nodes that were launched from a state-sync snapshot (and are now missing the historical metadata). The main one will be named repopulateExportData:

  • this will be a hostStorage method on a swingstore instance, to give the host control over the commit point
  • hostStorage.repopulateExportData(exporter) will take the same exporter as importSwingStore

Then a chain upgrade handler can walk the IAVL tree for previously-exported metadata keys, and submit all of them to repopulateExportData(). If that completes successfully, the swingstore will be complete once again.

At this point in the upgrade handler, our plan is to have the host app use makeSwingStoreExporter() to retrieve all of the metadata keys and move them to a new portion of the IAVL tree. However, that uses a separate DB connection, and thus depends upon the data being committed, and I know @mhofman was hoping to have just a single commit point for the whole chain upgrade handler. The repopulateExportData process is idempotent, however, so I think it'd be ok to hostStorage.commit() the swingstore at that point (in addition to a later commit when the first post-upgrade block is finished).

Repopulating (Previously-Pruned) Artifacts

While doing this work, I'll be refactoring the swingstore import code, so it's a good time to at least plan for a second API, which will be used in the future to repopulate pruned artifacts like old transcript spans. To this end, we sketched out repopulateArtifacts(). This would take a swingstore dbPath (like importSwingStore), and is meant to be run from a standalone process (drawing from a zipfile that contains the artifacts to be imported).

The idea is:

The API will look like:
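
Purely as an illustration (the parameter names are my assumptions, patterned on importSwingStore):

/**
 * Illustrative sketch only, not a settled signature.
 *
 * @param {string} dbPath path to an existing swing-store, as for importSwingStore
 * @param {object} artifactSource supplies getArtifactNames/getArtifact, e.g. backed by a zipfile
 */
async function repopulateArtifacts(dbPath, artifactSource) {
  // validate each offered artifact against the metadata already in the DB,
  // then insert the ones that pass
}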

mhofman commented 1 year ago

This will require some new code in snapStore.js, which only has a SQL query to insert a row with a full snapshot, which of course we do not have yet.

I'd like it if we could consider landing https://github.com/Agoric/agoric-sdk/pull/7635, since it has significant and partially overlapping changes in that area already.

  • this will be a hostStorage method on a swingstore instance, to give the host control over the commit point
  • hostStorage.repopulateExportData(exporter) will take the same exporter as importSwingStore

I am worried about tackling this on the regular hostStorage API. The method has to be async given the shape of the exporter, and really during this repopulation, nothing else should touch the swing-store. Why can't this be a top-level function that returns the swing-store after repopulation is complete, like importSwingStore?

The main one will be named repopulateExportData

We only care about the artifact metadata in this export data and shouldn't import the kvStore data part of it. How about repopulateArtifactMetadata? The exporter shouldn't need to understand the shape of keys in the export data.

I know @mhofman was hoping to have just a single commit point for the whole chain upgrade handler. The repopulateExportData process is idempotent, however, so I think it'd be ok to hostStorage.commit()

I was never planning on a single commit point. As you say, this is idempotent, so safe to commit early.

then we ship the chain software upgrade that includes the new xsnap and performs the replays at startup

  • (or use @mhofman's trick to pre-compute as much as possible)

So this brings an interesting use case: import artifacts and the related metadata, overriding current content. This makes me kinda want to go back to a single repopulateArtifacts with the following signature and semantics:

/**
 * @param {string} dbDir Path to swing-store DB
 * @param {SwingStoreExporter} exporter the exporter containing artifact information to use
 * @param {object} [options]
 * @param {boolean} [options.overwriteExisting=false] Whether to overwrite existing artifacts and their metadata
 * @param {'operational' | 'latest' | 'archival' | 'debug'} [options.toLevel='operational']
 */
function repopulateArtifacts(dbDir, exporter, options) {}

repopulateArtifacts would always call getExportData and always verify the toLevel consistency
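
Hypothetical usage of that single method, with the overwrite behavior left off:

// assumes a swing-store at dbDir and an exporter wired to the artifact source
await repopulateArtifacts(dbDir, exporter, { toLevel: 'archival' });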

I believe the above method would cover all 3 use cases we have:

  • repopulating the export-data metadata that the importer accidentally omitted (this ticket)
  • repopulating previously-pruned artifacts, like old transcript spans
  • importing precomputed artifacts and their metadata, overriding current content

For now we can leave the overwrite part out, but basically I am no longer convinced that the 2 kinds of artifact-import use cases we have today are sufficiently different to warrant 2 different methods.

warner commented 1 year ago

This will require some new code in snapStore.js, which only has a SQL query to insert a row with a full snapshot, which of course we do not have yet.

I'd like it if we could consider landing #7635, since it has significant and partially overlapping changes in that area already.

I'll read up on that.

  • this will be a hostStorage method on a swingstore instance, to give the host control over the commit point
  • hostStorage.repopulateExportData(exporter) will take the same exporter as importSwingStore

I am worried about tackling this on the regular hostStorage API. The method has to be async given the shape of the exporter, and really during this repopulation, nothing else should touch the swing-store. Why can't this be a top-level function that returns the swing-store after repopulation is complete, like importSwingStore?

I don't object to that plan, but I was thinking about how the cosmic-swingset upgrade path will be calling this. I was assuming that the upgrade handler code would have already opened the DB (holding a hostStorage), and needs to mutate it in-place, and then keep using that same handle, so something like:

import { openSwingStore } from '@agoric/swing-store';

const { hostStorage, kernelStorage } = openSwingStore(dbPath);
await doRepopulation(hostStorage, IAVLStuff);
await buildController(kernelStorage);

If, instead, the upgrade handler is likely to get control well before the swingstore is opened "for real", then I'd make it:

import { repopulateExportData, openSwingStore } from '@agoric/swing-store';

await doRepopulation(dbPath, repopulateExportData, IAVLStuff);

const { hostStorage, kernelStorage } = openSwingStore(dbPath);
await buildController(kernelStorage);

I don't think I'd have repopulateExportData return anything: it would do the final commit() internally, and its job is then done. Acquiring a hostStorage/kernelStorage pair is then up to the caller to do separately.

The main one will be named repopulateExportData

We only care about the artifact metadata in this export data and shouldn't import the kvStore data part of it. How about repopulateArtifactMetadata? The exporter shouldn't need to understand the shape of keys in the export data.

Yeah, I'm ok with that. Agreed that the exporter should not be parsing keys. So the change to the API description is that any new or different kvStore shadow entries cause an error+abort.

then we ship the chain software upgrade that includes the new xsnap and performs the replays at startup

  • (or use @mhofman's trick to pre-compute as much as possible)

So this brings an interesting use case: import artifacts and the related metadata, overriding current content. This makes me kinda want to go back to a single repopulateArtifacts with the following signature and semantics:

For now we can leave the overwrite part out, but basically I am no longer convinced that the 2 kinds of artifact-import use cases we have today are sufficiently different to warrant 2 different methods.

I'm gonna disagree. My biggest reason is that any metadata import is an act of trust/reliance/security. I really prefer having a strong separation between APIs which allow the data provider to compromise integrity, vs those which do not. Having an option (even with a safe default) to enable/disable this doesn't feel visible enough. And, having consensus-sensitive behavior driven by an option is how we got into this state.

My second reason is that precompute is far enough out that we shouldn't compromise the API to accommodate it yet (I'm not convinced that we understand the needs well enough to get the API right yet). If I squint, I can imagine that the precompute step will be run as a bunch of separate processes, one per vat, and they follow the deliveries made on the real kernel (maybe they snoop the real swingstore), and they accumulate state in a set of pre-compute DBs (one per vat), and then when we finally trigger the upload, there's a merge step. It has to delete the current-incarnation transcript spans and replace them from the precompute DB, and provide a heap snapshot, but what else might change? It's going to depend upon how much XS variation we're willing to tolerate. If GC is different then we might be talking about reconciling c-lists or vatstorage between the two sides, which means some per-vat kvstore keys are deleted or changed, which goes way beyond discrete artifacts.

So I want to stick with two APIs, one "trusting" (integrity relies on the data provider), one "safe" (data provider cannot violate integrity).

Also, looking at your proposed API, it makes me realize that we're abusing the behavior of exporter as a sort of additional options bag. If my API description for the "trusting" API states that it won't call exporter.getArtifactNames or exporter.getArtifact, then maybe it shouldn't actually take an Exporter: maybe it should only take a function with the same signature as exporter.getExportData. The simplest way to not violate a calling convention is to delete it.

Then the "safe" API would take { getArtifactNames, getArtifactData }, and not the full Exporter type.

The net result would be:

  • repopulateArtifactMetadata(dbDir, getExportData): the "trusting" API, where integrity relies on the data provider
  • repopulateArtifacts(dbDir, { getArtifactNames, getArtifact }): the "safe" API, where the data provider cannot violate integrity

both of which open their own DB connection, do their work, then either commit+fulfill or abort+reject.

And, repopulateArtifacts is lower-priority, and out-of-scope for this ticket. I'll keep it in mind when doing the refactoring, and maybe have a branch where it's implemented, but I don't want to hold up the primary feature on it (or on building tests for it, which is absolutely required before that part lands).

mhofman commented 1 year ago

I was assuming that the upgrade handler code would have already opened the DB (holding a hostStorage), and needs to mutate it in-place, and then keep using that same handle

I am actively refactoring the init logic to assert that init hasn't already happened in these cases. I will also parametrize init to indicate whether it's a genesis init or not.

I already added a check on the cosmic-swingset side in https://github.com/Agoric/agoric-sdk/pull/8043

I don't think I'd have repopulateExportData return anything: it would do the final commit() internally, and its job is then done. Acquiring a hostStorage/kernelStorage pair is then up to the caller to do separately.

I'd prefer to keep the "import" API consistent.

warner commented 1 year ago

I don't think I'd have repopulateExportData return anything: it would do the final commit() internally, and its job is then done. Acquiring a hostStorage/kernelStorage pair is then up to the caller to do separately.

I'd prefer to keep the "import" API consistent.

Would that mean repopulateExportData returns the same objects as openSwingStore? That seems odd to me: the repopulation code doesn't need to know what the user-facing kernelStorage/hostStorage API is. And isn't this going to be called in an upgrade handler which already has code to call openSwingStore() in the non-upgrade case? Is the upgrade handler going to provide hostStorage to the non-upgrade path somehow?

I'll start working on the new API, but I'll check in with you on this question before making the PR, to make sure it lines up with your code on the cosmic-swingset side.