parallelchain-io / pchain-mempool

Rust implementation of a ParallelChain protocol Mempool.
Apache License 2.0
0 stars 0 forks source link

v0.1 design #1

Closed GoodKlaus closed 5 months ago

GoodKlaus commented 1 year ago

Mempool Design Report

1. Requirements

Note

  1. Don't care about transactions that are already executed by other nodes, just pop.
  2. Don't care about transactions that are from "future" (discontinuous nonce), just pop.

2. Mempool

pub(crate) struct Mempool<N: NonceProvider> {
    txns_from_accs: HashMap<PublicAddress, BtreeMap<Nonce, Transaction>>,
    ranked_accs_by_priority_fee: PriorityQueue<PublicAddress, PriorityFee>,
    committed_nonce_provider: N,
    num_of_txns: u64,
    size_in_bytes: u64,
    max_size: u64
}

2.1 new() method

pub(crate) fn new(committed_nonce_provider: N, max_size: u64) -> Mempool<N>

Algorithm Pseudocode

function new(committed_nonce_provider, max_size):
    return Mempool instance with all fields initialized

2.2 insert() method

pub(crate) fn insert(&mut self, txn: Transaction) -> Result<(), UnacceptableNonceError>

Algorithm Pseudocode

function insert(self, txn): 
    // Preliminary checkings
    txn_bytes = byte length of the serialized transaction
    if self.size_in_bytes + txn_bytes > max_size:
        return Full Mempool Exception
    if txn.signer in self.txns_from_accs && self.txns_from_accs[txn.signer].contains_key(txn.nonce) && self.txns_from_accs[txn.signer].get(txn.nonce).tx_hash == txn.hash:
        return Duplication Exception
    committed_nonce = self.committed_nonce_provider.get(txn.signer)
    if txn.nonce < committed_nonce:
        return Nonce Too Low Exception

    if txn.signer exists in self.txns_from_accs:
        signer_txns = self.txns_from_accs[txn.signer]
        signer_txns.insert(txn.nonce, txn)
        self.ranked_accs_by_priority_fee.change_priority(signer, signer_txns.first_key_value().1.priority_fee)
    else:
        self.txns_from_accs.insert(txn.signer, new BTreeMap of txn)
        self.ranked_accs_by_priority_fee.push(txn.signer, txn.priority_fee)

    self.size.num_of_txns += 1
    self.size.size_in_bytes += length of serialized txn

Complexity Alalysis

Algorithm Pseudocode

function pop(self, this_base_fee):
    if self.size.num_of_txns == 0:
        return None

    popped_items = []
    popped_signer = None
    while let Some((signer, priority_fee)) = self.ranked_accs_by_priority_fee.pop():
        if self.txns_from_accs[signer].front().base_fee >= this_base_fee:
            popped_signer = signer
            signer_txns = self.txns_from_accs[signer]
            popped_transaction = signer_txns.pop_first().1
            self.size.num_of_txns -= 1
            self.size.size_in_bytes -= length of serialized popped_transaction

            if signer_txns.is_empty():
                self.txns_from_accs.remove(signer)
            else:
                self.ranked_accs_by_priority_fee.push(signer, signer_txns.first_key_value().1.priority_fee)
            break
        else:
            popped_items.push((signer, priority_fee))
    push items in popped_items back to self.ranked_accounts_fee

    return popped_transaction

Complexity Analysis

3. NonceStore

pub(crate) trait NonceProvider: Send + Clone + 'static {
    fn get(&self, from_addr: PublicAddress) -> Nonce; 
}
#[derive(Clone)]
pub(crate) struct NonceStore(pub BlockTreeDB);

3.1 get() method

fn get(&self, from_addr: PublicAddress) -> Nonce

Algorithm Pseudocode

function get(self, from_addr):
    retrieve highest committed block hash from the BlockTree snapshot
    retrieve state hash by using the highest committed block hash
    open committed world state with state hash
    return the nonce of “from_addr” from committed world state

4. Removing Stuck Transactions

pub(crate) fn remove_txns(&mut self, threshold_size: u64) -> Vec<Transaction>

4.1 Process

4.2 Algorithm Pseudocode

function remove_txns(self, target_size):
    dropped_txns = []
    while self.size.num_of_bytes > target_size:
        selected_signer = a random signer from self.ranked_accs_by_priority_fee
        dropped_txn = self.txns_from_accs[selected_signer].pop_back()
        if self.txns_from_accs[selected_signer] is not empty:
            update self.ranked_accs_by_priority_fee with new entry transaction
        else:
            remove selected_signer from self.txns_from_accs and self.ranked_accs_by_priority_fee
        dropped_txns.push(dropped_txn)
        update self.num_of_txns and self.size_in_bytes
    return dropped_txns

5. Mempool Read Access Handling

5.1 Description

5.2 Handling

Note accuracy vs. promptness

  • Take snapshot of mempool's num_of_txns and size_in_bytes periodically e.g. every 5 seconds.
  • Future fullnode endpoints can read the snapshot directly.
lyulka commented 1 year ago

Comments in design meeting 2

  1. The new design will still drop concurrently submitted transactions if the transactions arrive at the mempool at a non-ascending order. This is already significantly better than the current design, but if GossipSub's ordering guarantees are weak, may still cause a large number of concurrently submitted transactions to be dropped. Further research needs to be done into GossipSub to determine its ordering guarantees.
  2. The new design drops popped transactions whose nonce is smaller than the speculative nonce. This may be too eager, since transactions with nonces between the committed nonce and the speculative nonce can become part of other branches.
  3. Most of the complexity in the new design is related to "filtering" transactions. Every filtering mechanism needs to have a purpose. We identified two possible purposes: "security", and "convenience". We agreed that "security" is hard to achieve in mempool and adds a lot of complexity, and that "security" is probably best achieved through non-mempool mechanisms. In particular, I suggested considering removing the "upper bound" from the insert transaction nonce check.
  4. The new design's memory exhaustion prevention is too complex implementation and interface-wise and has loopholes. As an alternative, I suggested removing random transactions on insert if the mempool is too full to accept the new transaction.
  5. Consider making Filter a generic type parameter or an associated type. (e.g., Mempool<F>). This allows us to change the filtering functionality of the mempool in the future without changing much of the type's basic insert and pop logic.
  6. We agreed to keep mechanisms intended to enable concurrent access to the mempool outside the pchain-mempool crate.
  7. Using a RwLock over mempool and frequently read locking to read its size may crowd out threads that need a write lock to use the mempool (e.g., the insertion thread and the executor).
lyulka commented 1 year ago

Comments in design meeting 3

  1. Representing “filtering” logic into a Filter trait and having Mempool impl Filter has minimal utility and just adds complexity.
  2. Putting transactions back into the channel between RPC and Mempool if the Mempool is full defeats the purpose of having a maximum size in the first place: to limit memory pressure.
  3. Putting popped Transactions that are not included in a block because they are below their signer’s speculative nonce although above its committed nonce back into the same channel is inelegant, and not worth the benefit because cases where this leads to the transaction being lost is rare.
  4. To reduce read contention, snaphost Mempool’s size statuses periodically and let RPCs read this snapshot instead of having to get a read lock on Mempool every time.
  5. Be thoughtful in how you select random signers to drop their highest-nonced transaction from the Mempool.
  6. Field, argument, and function names should be longer and more descriptive.