Client-VM Connector API

Overview

Spacemesh maintains a strict abstraction boundary between node and VM. The node is responsible for networking, database, consensus, etc., but not for the contents, interpretation, or execution of transactions, nor for managing account state. The VM is responsible for the interpretation and execution of transactions and for managing account state.

This proposal specifies the API that a Spacemesh node should use to exchange data with the VM.

Goals and motivation

Maintaining a strict abstraction boundary between node and VM has multiple benefits:

it makes the VM modular, allowing any Spacemesh full node implementation to utilize any VM implementation
it allows the use of multiple VMs
it allows clean separation of concerns, as the content, interpretation, and execution of transactions, and management of account state, is handled completely inside the VM
it facilitates the creation of a separate validator protocol in future

The goal is to design the simplest possible API that facilitates this functionality. Note that, while the API is specified in golang in this proposal, it should be easy to translate into C types and functions for use over FFI and it should be easy to implement it in other languages.

High-level design

Under the account unification model (#49), the Spacemesh node is only responsible for low-level functionality such as networking, database, and consensus. The interpretation and execution of transactions and smart contracts, and management of account state, is handled entirely inside a modular VM. From the perspective of a node, a transaction is just an opaque byte array. The node therefore calls into the VM module to verify and execute transactions. When the node needs to be aware of some data contained in a transaction, such as the principal, nonce, or max gas (for purposes of selecting and ordering transactions), the VM layer unpacks these data and passes it back to the node.

Prior art

https://evmc.ethereum.org/

Proposed implementation

// Initializes the balance of the genesis accounts in the genesisConfig map,
// deploys the set of genesis templates (hardcoded in the VM).
// Note that genesis accounts do NOT need a spawned account at
// genesis since they can use self-spawn later.
// VM does not allow verification or processing of transactions until this has been
// run. It sets the current layer to zero (genesis layer).
func (vm *VM) ApplyGenesis(genesisConfig map[types.Address]uint64) error

// Finalizes the previous layer and initializes a new layer. Rewards are applied
// immediately at the start of the new layer.
// Returns an error if layerID != the VM's opinion of the next layer.
func (vm *VM) NewLayer(layerID types.LayerID, rewards map[types.Address]uint64) error

// Returns the state root of the specified layer, or an error if the layer is unknown (i.e., too new).
// This may be a finalized state root (for a previous layer) or an intermediate state root
// (for the current top layer).
func (vm *VM) GetLayerStateRoot(layerID types.LayerID) (types.Hash32, error)

// Used in the rare case where full self healing requires that we revert state (and re-apply
// from a previous layer).
// Returns an error if the rewind fails, or if layerID >= the current top layer.
func (vm *VM) Rewind(layerID types.LayerID) (types.Hash32, error)

// Does syntactic validation, then runs verify() inside the VM.
// Returns true if verify succeeded, false if it failed (i.e., if the transaction
// cannot pay gas to cover the basic cost of verification and the transaction
// is therefore ineffective).
// Also returns the transaction principal account.
// Return error if the transaction is syntactically invalid.
func (vm *VM) VerifyTransaction(txdata []byte) (bool, types.Address, error)

// Compares the ordering of two transactions from the same principal (based on nonce).
// Returns 1 if tx A should come first, -1 if tx B should come first, 0 if there is no ordering,
// and an error if either tx is syntactically invalid or if they do not have the same principal.
func (vm *VM) Compare(txdataA []byte, txdataB []byte) (int, error)

// Executes the transaction against the current state and returns a receipt.
// Called as part of optimistic transaction filtering, and when applying blocks from verified layers
// based on hare and/or tortoise output.
// Note that the transaction is an opaque byte array.
// Return error if the transaction is syntactically invalid.
func (vm *VM) ExecuteTransaction(txdata []byte) (types.Receipt, error)

// Some basic getters, used in the node API to read account data.
// No need for an error, these just return null if the account doesn't exist.
func (svm *SVM) GetBalance(addr types.Address) uint64
func (svm *SVM) GetNonce(addr types.Address) uint64
func (svm *SVM) GetTemplate(addr types.Address) types.Address

Note that the design explicitly excludes several things (some of which are implemented in the svm module in go-spacemesh today, and will need to be moved elsewhere):

ApplyLayer(): the node should call NewLayer() (with layer rewards), followed by ExecuteTransaction() for each transaction in sequence
GetLayerApplied(txid): the VM does not store this information (it does not store transactions, only resultant state). The node should store it.
GetTopLayer(): the node should always know the top layer in state, and synchronizes this with the VM every time it calls NewLayer()
Rollback(txdata): we never roll back a single transaction. The VM internally rolls back state after attempting to apply an ineffective or failed transaction.
ValidateNonceAndBalance(): currently used in state projection, but no longer needed in account unification. VerifyTransaction() should be used instead, as verification logic is entirely contained in the VM.

Implementation plan

With respect to the svm module:

Move all txpool/mempool-related code into the mempool module.
Remove HandleGossipTransaction/HandleSyncTransaction. These belong either in the mempool module or else in a new module that contains gossip handlers.
Add node state to track which layer txs were applied (this could just be a receipts database). Modify node API to read this state (rather than talking to VM).
Add node state to track current VM top layer, and to compare it every time a new layer is applied. Modify node API to read this state (rather than talking to VM).
Replace all calls to ValidateNonceAndBalance with VerifyTransaction.
Plug transaction selection logic into this API.
Finally, implement a new VM module that implements the above API in native go (and can be swapped for another FFI-based API implementation in future).

Questions

Does ExecuteTransaction need to call VerifyTransaction internally? It probably does, since it cannot execute a transaction without verifying it first.
Is there a distinction between reading the state root of a layer that's finalized and one that's not? E.g., state root of a very old layer, vs. a recent, old layer, vs. the current top layer. It's probably orthogonal to this API.
Should VerifyTransaction return only the principal, or the principal and the nonce? In other words, do we want a separate Compare method, or do we want the nonce comparison logic to live in the node?

Dependencies and interactions

Stakeholders and reviewers

@tal-m
@noamnelke
@neysofu
@lrettig

Testing and performance

tbd

This generally looks good, but I have a few comments:

// Finalizes the previous layer and initializes a new layer. Rewards are applied
// immediately at the start of the new layer.
// Returns an error if layerID != the VM's opinion of the next layer.
func (vm *VM) NewLayer(layerID types.LayerID, rewards map[types.Address]uint64) error

We can't apply rewards at the start of a layer because it's impossible to predict which transactions will end up being applied (and thus what the total fee will be). OTOH, if you meant that rewards are applied at the start of the next layer, that's delaying them unnecessarily. I think rewards should be applied at the end of each layer - after the other transactions.

I also think that this method should return the state root after applying the layer. This is in addition to GetLayerStateRoot which we may want to use for retroactively querying.

On terminology: I would call this CommitLayer. I feel that it better conveys, in an intuitive way for developers, what this does.

// Used in the rare case where full self healing requires that we revert state (and re-apply
// from a previous layer).
// Returns an error if the rewind fails, or if layerID >= the current top layer.
func (vm *VM) Rewind(layerID types.LayerID) (types.Hash32, error)

It should be possible to call this with the current top layer in order to roll back any uncommitted changes.

It should be mentioned that any uncommitted changes are always rolled back when this method is called, regardless of the layer provided (perhaps unless an error is returned, in which case we probably don't want anything to change).

// Compares the ordering of two transactions from the same principal (based on nonce).
// Returns 1 if tx A should come first, -1 if tx B should come first, 0 if there is no ordering,
// and an error if either tx is syntactically invalid or if they do not have the same principal.
func (vm *VM) Compare(txdataA []byte, txdataB []byte) (int, error)

Is this in order to abstract away the concept of a nonce from the node? I'm not sure it's enough. The mempool needs to be able to efficiently evict transactions with the same or lower nonce once a transaction is applied. If this method is all that's available then we'd need to compare with all transactions from the same principle that are currently in the mempool in order to know what to evict. It's possibly not the end of the world, but it would be more efficient if we had a linearized representation of the nonce that we can index based on and I think this is possible to achieve.

Alternatively, we should let the node handle the nonce without abstracting it away. This mostly has disadvantages if we don't plan to implement the generalized nonce scheme before genesis, but I think that we actually do. Curios to hear other opinions on this.

Additional note: the node will ultimately need to order transactions for the block, not only the mempool. It may prove to be confusing to have this ordering method and then apply additional ordering logic on top. Not sure about this, but I intuitively like having this logic entirely in the node.

// Executes the transaction against the current state and returns a receipt.
// Called as part of optimistic transaction filtering, and when applying blocks from verified layers
// based on hare and/or tortoise output.
// Note that the transaction is an opaque byte array.
// Return error if the transaction is syntactically invalid.
func (vm *VM) ExecuteTransaction(txdata []byte) (types.Receipt, error)

This should return an error if the transaction is contextually invalid, as well. Contextual invalidity is either a bad nonce value or insufficient balance to cover max gas. These checks should always be performed right before execution, regardless of any previous checks.

I think that we should list the possible errors that this method returns (ideally we'd list the errors that all methods return, but I think this one is particularly important).

ApplyLayer(): the node should call NewLayer() (with layer rewards), followed by ExecuteTransaction() for each transaction in sequence

I think this flow is broken. It means that the new transactions don't get applied until the next layer is ready to be applied. That's also why I proposed to rename NewLayer to CommitLayer - it makes the correct flow obvious: call ExecuteTransaction repeatedly for all transactions in a block, then calculate the rewards and call CommitLayer.

I think this also answers this question:

Is there a distinction between reading the state root of a layer that's finalized and one that's not? E.g., state root of a very old layer, vs. a recent, old layer, vs. the current top layer. It's probably orthogonal to this API.

Since executing a block means executing all transactions and immediately calling CommitLayer we should never need to (or be able to) ask for the layer hash of an uncommitted layer. It makes little sense, since this hash will not include the rewards and thus never be the final state of the layer.

Missing methods

I'm missing a method for getting a transaction's max balance transfer, which we need for the conservative balance calculation.

Given how we want to handle nonces, max gas and gas price - there should be a method(s) that returns those as well.

I think we should have a single method that returns nonce bytes, nonce mask, max gas, gas price and max balance transfer given a transaction. I think executing this method should be cheaper than executing VerifyTransaction and we'll probably want to use this information to decide if we even want to spend the resources to verify the transaction. E.g. if we have a higher gas paying competing candidate for the same nonce in the mempool - we'll just want to discard this, regardless of validity. Same goes for insufficient balance - it's an easy check.

spacemeshos / SMIPS