Transaction Selection - Githubissues

Overview

Transaction selection in a block chain protocol such as Ethereum is relatively straightforward and may be efficiently accomplished with a simple greedy algorithm: e.g., a miner should choose the subset of transactions from its mempool that maximizes the fees collected (i.e., the maximum number of non-conflicting, high fee-paying transactions that fit into a single block).

Transaction selection in a block mesh such as Spacemesh, by contrast, is not at all straightforward. This is due to the fact that a transaction in one block may duplicate or invalidate a transaction in another block in the same layer. Even with perfect control over the transactions in each candidate block, this is a knapsack problem and the optimal solution is NP-complete. Of course, we do not have perfect control over how miners manage their mempools and choose transactions for new blocks. We need a reasonable solution that is computationally feasible and works in the presence of Byzantine behavior among a minority of miners. We also need to consider the fact that different miners may choose to implement different mempool management policies and may therefore have different views on the set of unprocessed transactions at any given time.

Goals and motivation

maximize total miner profit from transaction fees (i.e., fees collected minus cost)
maximize the number of valid, unique transactions included in each layer or, equivalently, minimize the degree of duplication and the number of invalid transactions included in each layer
subject to these constraints, select the simplest possible algorithm that requires the least computational and bandwidth burden on the part of miners and other network participants, and that produces the smallest possible mesh

High-level design

Smeshers select transactions, create blocks, sign and publish "proposals" (messages that propose a list of transactions to be included in a layer).
When constructing a proposal, a smesher should start by selecting viable transactions from the mempool. This is done by iteratively picking the transaction with the highest gas price that hasn't been picked yet. When there's more than one transaction with the same gas price, the first one received locally is picked. The max_gas of all picked transactions is tallied and it must be kept below the block_gas_limit, a configurable value. If the "best" transaction doesn't fit, the next best is attempted until no additional transaction in the mempool can fit in the picked set. At this point, a random subset of proposal_size transactions out of the picked set is selected and put in the proposal.
Before the Hare certification round, Hare participants perform a unification stage locally to deterministically construct a block from the agreed upon proposals. They then certify the hash of the unified block. See https://github.com/spacemeshos/SMIPS/issues/68
Smeshers now vote for this unified block in their ballots rather than for a subset of blocks that were generated by individual smeshers.
The syncer only needs to fetch blocks. Proposals are ephemeral, like Hare messages: they are gossiped and can then be discarded after the Hare certification is published for a given round. They do not need to be stored in the mesh and do not need to be synced.
As long as our safety assumptions hold and all ballot publishers are honest, there should only be one block per layer. Tortoise should only generate votes for one block per layer, but note that it could generate votes against many bad blocks per layer.

Proposed implementation plan

Update mempool:
- Add an integer to each transaction indicating the order in which they were received.
- Support iterating through the mempool ordered by gas price and order received. This can be done by maintaining a relevant index.
Update block builder to implement above transaction selection algorithm, to construct proposal blocks. Note that proposal blocks do not require a base block.
Add an external method to Tortoise to construct a ballot (this can probably replace the existing base block method, as the ballot will vote for the base ballot). Ballots vote for/(against❓)/neutral on blocks for previous layers.
The node should probably construct both the proposal and the ballot at the same time and submit both as a single network message.
Make sure the node only inputs proposals to Hare.
Add UNIFY round to Hare, based on https://github.com/spacemeshos/SMIPS/issues/71.
If Hare fails for a layer, it means there is no block for that layer (this is unchanged).
If Hare succeeds, honest smeshers will only vote for the block it produces in their ballots.
Modify block builder/verifying tortoise base block selection to use the block as its input vector/local opinion. Note that, by having a single block per layer, this proposal reduces the dimensionality of the tortoise vote matrix (see https://github.com/spacemeshos/go-spacemesh/issues/2679) [@lrettig is that the case❓].
Syncer syncs blocks and ballots, but not proposals.

Incentives

As long as an honest majority of miners follows the proposed algorithm, we expect total throughput to be close to optimal. Dishonest miners can do no further harm under this proposal than under Spacemesh 0.1 (where they can already engage in behavior such as censoring transactions or submitting empty blocks). This proposal does not include any changes to rewards or other miner incentives. Further exploration of incentive design will take place as part of #38.

Questions/concerns

Can we eliminate the "explicit vote against" in ballots? A ballot can either vote for a block (in which case it votes against all other blocks in that layer) or abstain from the layer (again, all blocks). Is there a use-case where the votes can't be encoded as suggested here?
Doesn't this mean that Tortoise is now dependent upon Hare?

Yes, but this was already the case prior to this change. Currently, nodes abstain from voting on (recent) layers when Hare fails. See further discussion and ideas on how to mitigate this in the research forum thread.
Which blocks get stored in the mesh?

Voting blocks and superblocks (content blocks). Proposal blocks do not get stored in the mesh.
Do proposal blocks get synced? or only gossipped?

Gossiped, not synced. They do not live in the mesh.
How/when do we calculate rewards? Are they added explicitly as transactions, or are they implicit?
- yes, because proposal blocks are not stored on the mesh
- implicit is a problem in many cases: hare participants are not stored on mesh, (future) validators are not stored, etc., so we need to be explicit.
- another advantage to doing it explicitly: if you later change reward mechanism, you can drop the old code
So why not make proposal blocks very large? e.g., just have each miner include the top N tx in its mempool, then let the unify round pick the best transactions for the unified superblock

We need to keep blocks small to limit required bandwidth, and to make sure that as many nodes as possible can participate in consensus.
Isn't this akin to putting the mempool itself in consensus?

Yes, sort of, but only really in the case where there are less than LAYER_TXCOUNT transactions in the mempool. We're not attempting to keep large mempools in consensus.

Dependencies and interactions

See also: #23, #37, the original proposal
Mempool management: this is related to mempool admittance and eviction (garbage collection) criteria. See https://github.com/spacemeshos/pm/issues/91.
Tortoise data structures: see https://github.com/spacemeshos/go-spacemesh/issues/2679

Stakeholders and reviewers

@noamnelke (overall architecture, transaction syntax, dev task breakdown)
@YaronWittenstein (for SVM integration; does, e.g., calculating tx weight require calling into SVM?)
@lrettig (Tortoise, Hare, research)

Testing and performance

Unit tests will be updated for each modified component. This proposal does not involve writing any new components.
Unit tests should be added to test the data structure that stores transactions in the mempool sorted by weight/density.
Tortoise unit tests should be updated to account for the case where multiple good, and/or bad, superblocks are generated for a given layer (i.e., if assumptions are violated), to ensure that it can recover.
Benchmarks should also be written that test the transaction selection mechanism with mempools of different sizes, and with different mixes of transaction types.
App and/or system tests should be updated, or added, to ensure that miners with identical mempools do not duplicate proposal block content and are able to achieve high throughput: e.g., by ensuring that throughput per layer achieves a certain minimum weight.

spacemeshos / SMIPS

Transaction Selection #50