kaspanet / rusty-kaspa

Kaspa full-node and related libraries in the Rust programming language. This is a Beta version at the final testing phases.

Address Tracker for subscriptions to changed UTXOs notifications #427

Closed tiram88 closed 1 month ago

tiram88 commented 2 months ago

Subscriptions to UtxosChanged notifications are getting a deep revamp, optimizing memory consumption (significantly) and execution speed (moderately).

Address Tracker

The subscription sub-system uses a new ubiquitous SubscriptionContext, essentially containing an address Tracker whose role is to track the addresses involved in UtxosChangedSubscriptions, indexing them and counting references. All subscriptions now use address indexes instead of ScriptPublicKeys, reducing memory consumption significantly.

The Tracker features an optional maximum address count. When provided, it caps the number of addresses the node will track, bounding the memory the subscription sub-system can consume.

Kaspad Argument

The maximum number of tracked addresses can be defined in the kaspad command with the argument --max-tracked-addresses, whose default value is 1,835,007.

The component used internally (an IndexMap) allocates maps with a capacity computed with the following formula:

((max_addresses + 1) * 8 / 7).next_power_of_two() * 7 / 8

The value passed by argument is always expanded to the actual usable allocated capacity. For a target of 1M addresses, the expanded capacity is 1,835,008. The tracker reserves the last entry for address recycling, so the actual maximum is 1,835,007.
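The expansion described above can be reproduced with a short sketch (the function name is illustrative; the arithmetic is the formula quoted above):

```rust
/// Usable capacity actually allocated for a requested maximum address
/// count, following the IndexMap capacity formula quoted above.
fn expanded_capacity(max_addresses: usize) -> usize {
    ((max_addresses + 1) * 8 / 7).next_power_of_two() * 7 / 8
}

fn main() {
    // A target of 1M addresses expands to 1,835,008 usable slots.
    let capacity = expanded_capacity(1_000_000);
    println!("{}", capacity); // 1835008
    // One slot is reserved for address recycling, leaving 1,835,007.
    println!("{}", capacity - 1); // 1835007
}
```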

Subscription propagation and memory optimizations

Every Notifier is provided a new UtxosChangedMutationPolicy that defines how incoming UtxosChangedScope mutations are propagated to its parents. The policy supports two modes: a fine-grained AddressSet mode and an all-or-nothing Wildcard mode.
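As an illustration only (the enum shape and the `propagate` function are assumptions, not the actual API), the difference between the two modes can be sketched as:

```rust
/// Hypothetical sketch of the two propagation modes (names assumed).
#[derive(Clone, Copy, Debug)]
enum UtxosChangedMutationPolicy {
    /// Fine-grained: propagate the precise set of mutated addresses.
    AddressSet,
    /// All-or-nothing: collapse every mutation to a wildcard subscription.
    Wildcard,
}

/// What a parent learns when a child registers `addresses` (sketch):
/// `Some(set)` for the fine-grained mode, `None` meaning "everything"
/// for the wildcard mode.
fn propagate(policy: UtxosChangedMutationPolicy, addresses: Vec<u32>) -> Option<Vec<u32>> {
    match policy {
        UtxosChangedMutationPolicy::AddressSet => Some(addresses),
        UtxosChangedMutationPolicy::Wildcard => None,
    }
}

fn main() {
    assert_eq!(propagate(UtxosChangedMutationPolicy::AddressSet, vec![1, 2, 3]), Some(vec![1, 2, 3]));
    assert_eq!(propagate(UtxosChangedMutationPolicy::Wildcard, vec![1, 2, 3]), None);
}
```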

The notification system is configured to propagate address sets upstream from all RPC servers up to the Index Processor. This appears to offer the best balance of memory consumption and processing effort between upstream subscriptions and downstream notifications.

The processing of an incoming UtxosChangedScope mutation, with its embedded address vector, is refactored so that the same vector is retained all the way up from the RPC server to the index processor, helping reduce memory fragmentation.

A Notifier and its Broadcasters now share the same subscriptions instances, significantly reducing the allocated memory for large address sets.

UtxosChangedSubscription

The struct is refactored so that its data field (containing the state and the address index set) gets interior mutability, while its listener ID (used to identify the instance, notably for equality) remains immutable.

A subscription has 3 states: None, Selected and All. The last 2 indicate an active subscription. Selected carries an address set.
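A minimal sketch of the three states (the enum shape is an assumption; the real struct keeps this data behind interior mutability, as described above):

```rust
use std::collections::HashSet;

/// Sketch of the three subscription states (shape assumed for illustration).
#[derive(Debug)]
enum SubscriptionState {
    /// Not subscribed.
    None,
    /// Subscribed to a specific set of address indexes.
    Selected(HashSet<u32>),
    /// Subscribed to all addresses.
    All,
}

impl SubscriptionState {
    /// The last two states indicate an active subscription.
    fn active(&self) -> bool {
        !matches!(self, SubscriptionState::None)
    }
}

fn main() {
    assert!(!SubscriptionState::None.active());
    assert!(SubscriptionState::Selected(HashSet::from([7, 42])).active());
    assert!(SubscriptionState::All.active());
}
```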

In a Broadcaster, the subscriptions in the Selected state are used as such in the broadcasting Plan, meaning that 2 listeners in this state are always considered distinct. The rationale for not comparing their respective address sets for possible equality is that such equality has a probability close to zero, while the computing cost of the comparison is high.

In a Broadcaster, all subscriptions in the All state are grouped under a unique ubiquitous instance provided by the SubscriptionContext, allowing the state of a subscription to be mutated without affecting the broadcasting Plan.

Benchmarking

New framework

A new test framework allows the quick and idiomatic setup of an integration test featuring, for instance, a node, a miner and any additional tasks the test needs, such as submitting large sets of transactions (mempool benchmarking) or running cycles of subscriptions/unsubscriptions for many clients at once.

As an example, the mempool benchmark setup gets refactored as:

```rust
let client_manager = Arc::new(ClientManager::new(args));
let mut tasks = TasksRunner::new(Some(DaemonTask::build(client_manager.clone())))
    .launch()
    .await
    .task(
        MinerGroupTask::build(
            network,
            client_manager.clone(),
            SUBMIT_BLOCK_CLIENTS,
            params.bps(),
            BLOCK_COUNT,
            Stopper::Signal,
        )
        .await,
    )
    .task(
        TxSenderGroupTask::build(
            client_manager.clone(),
            SUBMIT_TX_CLIENTS,
            false,
            txs,
            TPS_PRESSURE,
            MEMPOOL_TARGET,
            Stopper::Signal,
        )
        .await,
    );
tasks.run().await;
tasks.join().await;
```

Memory benchmarking of UTXOs changed subscriptions

A set of benchmarks is added, dedicated to measuring memory consumption in various UTXOs changed subscription setups. In all cases, 500 clients connect to a mining node processing ~100 TPS. 495 clients are considered "wallets", each subscribing to 800 unique addresses. The 5 remaining clients are "monitoring services" that subscribe to address sets ranging from 200K to 1M addresses in 200K increments, mostly overlapping the wallet addresses.

Clients run subscription cycles of a predefined duration. During 2/3 of the cycle the client is subscribed to its addresses; during the remaining 1/3 it is fully unsubscribed.

The benchmark runs until 1.5M transactions are mined. The cycle duration appears to have a moderate impact on the overall benchmark duration. Sending and processing the 500 subscriptions all in a row slows down the node TPS a bit for ~30 seconds, probably because the network bandwidth and the gRPC server get briefly saturated.

The benchmark is structured as a client process spawning a child server process. The parent process runs all client activities: mining, submitting transactions, and the 500 gRPC clients subscribing to UtxosChanged notifications cyclically. The server process runs a node. This architecture has the desired property that the client and the server each run in isolation, in particular each managing its own memory.

Memory consumption of the server process is sampled every 5 seconds throughout the benchmark run. An average is then computed over the data, excluding the first 60 minutes, which are considered warmup.
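The averaging step can be sketched as follows (the 5-second interval and 60-minute warmup come from the text above; the function name is illustrative):

```rust
/// Average over memory samples taken every `interval_secs`,
/// skipping the first `warmup_secs` worth of samples.
/// Returns None if no samples remain after the warmup.
fn average_after_warmup(samples_gb: &[f64], interval_secs: u64, warmup_secs: u64) -> Option<f64> {
    let skip = (warmup_secs / interval_secs) as usize;
    let rest = samples_gb.get(skip..)?;
    if rest.is_empty() {
        return None;
    }
    Some(rest.iter().sum::<f64>() / rest.len() as f64)
}

fn main() {
    // Toy data: 3 warmup samples, then steady-state at 4.0 GB,
    // with a 5 s interval and a 15 s "warmup" for illustration.
    let samples = [2.0, 3.0, 3.5, 4.0, 4.0, 4.0];
    assert_eq!(average_after_warmup(&samples, 5, 15), Some(4.0));
}
```

In the actual benchmark this would run with `warmup_secs = 3600` over the full 5-second sample series.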

The various benchmark setups are:

| # | Subscription cycle | Average memory consumption (GB) | Delta vs A (GB) |
|---|---|---|---|
| A | No subscriptions at all (aka the control sample) | 2.7 | |
| B | Single cycle of infinite duration (no unsubscribing) | 3.1 | 0.5 |
| C | 2-hour cycles | 3.8 | 1.1 |
| D | 30-minute cycles | 4.0 | 1.3 |
| E | 3-minute cycles | 5.4 | 2.8 |