Subscriptions to UtxosChanged notifications get a deep revamp, optimizing memory consumption (highly) and execution speed (moderately).
Address Tracker
The subscription sub-system uses a new ubiquitous SubscriptionContext, essentially containing an address Tracker whose role is to track the addresses involved in UtxosChangedSubscriptions, indexing them and counting references. All subscriptions now use address indexes instead of ScriptPublicKeys, reducing memory consumption significantly.
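As a minimal sketch of this idea (invented names; a String stands in for ScriptPublicKey, so this is not the actual rusty-kaspa code), the Tracker can be pictured as an index map with reference counts and index recycling:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the Tracker: each tracked script public key gets
// a stable compact index plus a reference count of how many subscriptions
// currently use it. Released indexes are recycled for later addresses.
#[derive(Default)]
struct Tracker {
    index_of: HashMap<String, u32>, // script public key -> compact index
    counts: Vec<u32>,               // index -> reference count
    free: Vec<u32>,                 // recycled indexes
}

impl Tracker {
    /// Registers an address and returns its compact index.
    fn track(&mut self, spk: &str) -> u32 {
        if let Some(idx) = self.index_of.get(spk).copied() {
            self.counts[idx as usize] += 1;
            return idx;
        }
        let idx = self.free.pop().unwrap_or_else(|| {
            self.counts.push(0);
            (self.counts.len() - 1) as u32
        });
        self.counts[idx as usize] = 1;
        self.index_of.insert(spk.to_string(), idx);
        idx
    }

    /// Releases one reference; a fully released index is recycled.
    fn untrack(&mut self, spk: &str) {
        if let Some(idx) = self.index_of.get(spk).copied() {
            self.counts[idx as usize] -= 1;
            if self.counts[idx as usize] == 0 {
                self.index_of.remove(spk);
                self.free.push(idx);
            }
        }
    }
}

fn main() {
    let mut tracker = Tracker::default();
    let a = tracker.track("spk-a");
    let b = tracker.track("spk-b");
    assert_ne!(a, b);
    assert_eq!(tracker.track("spk-a"), a); // same address, same index
    tracker.untrack("spk-b");              // fully released...
    assert_eq!(tracker.track("spk-c"), b); // ...so its index is recycled
    println!("ok");
}
```

Subscriptions then store sets of these small u32 indexes rather than full ScriptPublicKeys, which is where the memory savings come from.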
The Tracker features an optional maximum address count. When provided, it bounds the number of tracked addresses and therefore the memory the tracker can consume.
Kaspad Argument
The maximum number of tracked addresses can be defined in the kaspad command with the argument --max-tracked-addresses, whose default value is set to 1,835,007. The component used internally (an IndexMap) allocates its maps with a capacity expanded from the requested count: the value passed by argument always gets expanded to the actual usable allocated capacity. For a target of 1M addresses, the expanded capacity is 1,835,008. The tracker reserves the last entry for address recycling, so the actual maximum is 1,835,007.
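The expansion can be reproduced from the numbers above, assuming the hashbrown-style table underlying IndexMap (power-of-two bucket count, 7/8 maximum load factor). This is a reconstruction from those quoted values, not the actual rusty-kaspa code:

```rust
// Sketch of the capacity expansion: the requested count is inflated to the
// next power-of-two bucket count able to hold it at a 7/8 load factor, and
// the usable capacity is 7/8 of those buckets (hashbrown-style assumption).
fn expanded_capacity(target: usize) -> usize {
    let buckets = (target * 8 / 7).next_power_of_two();
    buckets / 8 * 7
}

fn main() {
    let expanded = expanded_capacity(1_000_000);
    assert_eq!(expanded, 1_835_008); // matches the value quoted above
    // One entry is reserved for address recycling, hence the 1,835,007 maximum.
    assert_eq!(expanded - 1, 1_835_007);
    println!("expanded capacity for a 1M target: {expanded}");
}
```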
Subscription propagation and memory optimizations
Every Notifier is provided a new UtxosChangedMutationPolicy that defines how incoming UtxosChangedScope mutations are propagated to its parents. The policy supports two modes: fine-grained AddressSet or all-or-nothing Wildcard.
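The two modes can be sketched as follows (types invented for the example, not the actual rusty-kaspa API):

```rust
// AddressSet forwards the precise mutated address indexes upstream,
// while Wildcard collapses any mutation into an all-or-nothing scope.
#[derive(Debug, PartialEq)]
enum UtxosChangedMutationPolicy {
    AddressSet,
    Wildcard,
}

#[derive(Debug, PartialEq)]
enum UpstreamMutation {
    Addresses(Vec<u32>), // compact address indexes
    All,
}

fn propagate(policy: &UtxosChangedMutationPolicy, addresses: Vec<u32>) -> UpstreamMutation {
    match policy {
        UtxosChangedMutationPolicy::AddressSet => UpstreamMutation::Addresses(addresses),
        UtxosChangedMutationPolicy::Wildcard => UpstreamMutation::All,
    }
}

fn main() {
    assert_eq!(
        propagate(&UtxosChangedMutationPolicy::AddressSet, vec![1, 2, 3]),
        UpstreamMutation::Addresses(vec![1, 2, 3])
    );
    assert_eq!(
        propagate(&UtxosChangedMutationPolicy::Wildcard, vec![1, 2, 3]),
        UpstreamMutation::All
    );
}
```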
The notification system is configured to propagate address sets upstream from all RPC servers up to the Index Processor. This appears to be the optimal balance of memory consumption and processing effort between upstream subscriptions and downstream notifications.
The processing of an incoming UtxosChangedScope mutation with its embedded address vector is refactored in a way that retains the same vector all the way up from RPC server to the index processor, helping reduce memory fragmentation.
A Notifier and its Broadcasters now share the same subscription instances, significantly reducing the memory allocated for large address sets.
UtxosChangedSubscription
The struct is refactored so that its data field (containing the state and the address index set) gets interior mutability, while its listener ID (used to identify the instance, notably for equality) stays immutable.
A subscription has 3 states: None, Selected and All. The last two indicate an active subscription; Selected carries an address set.
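A hypothetical sketch of the three states, with the address set reduced to a vector of indexes for the example:

```rust
#[derive(Debug, PartialEq)]
enum UtxosChangedState {
    None,               // not subscribed
    Selected(Vec<u32>), // subscribed to a specific set of address indexes
    All,                // subscribed to every address
}

impl UtxosChangedState {
    // Selected and All both denote an active subscription.
    fn active(&self) -> bool {
        !matches!(self, UtxosChangedState::None)
    }
}

fn main() {
    assert!(!UtxosChangedState::None.active());
    assert!(UtxosChangedState::Selected(vec![7, 42]).active());
    assert!(UtxosChangedState::All.active());
}
```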
In a Broadcaster, the subscriptions in Selected state are used as such in the broadcasting Plan, meaning that two listeners in this state are always considered distinct. The rationale for not comparing their respective address sets for possible equality is that equality has a probability close to zero while the computational cost of the comparison is high.
In a Broadcaster, all subscriptions in the All state are grouped under a unique ubiquitous instance provided by the SubscriptionContext, making it possible to mutate the state of a subscription without affecting the broadcasting Plan.
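The grouping can be sketched as follows (illustrative only: a String stands in for the shared subscription instance, and subscription identity is modeled by the Arc pointer):

```rust
use std::collections::HashMap;
use std::sync::Arc;

// All listeners in the All state reference the single context-provided
// instance, so keying the plan by subscription identity collapses them
// into one broadcasting entry.
fn group_all_listeners(shared: &Arc<String>, listeners: &[u64]) -> HashMap<usize, Vec<u64>> {
    let mut plan: HashMap<usize, Vec<u64>> = HashMap::new();
    for &id in listeners {
        let key = Arc::as_ptr(shared) as usize; // same instance -> same key
        plan.entry(key).or_default().push(id);
    }
    plan
}

fn main() {
    let shared_all = Arc::new(String::from("ALL"));
    let plan = group_all_listeners(&shared_all, &[10, 11, 12]);
    assert_eq!(plan.len(), 1); // three listeners, one broadcasting entry
    assert_eq!(plan.values().next().unwrap(), &vec![10, 11, 12]);
}
```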
Benchmarking
New framework
A new test framework allows the quick and idiomatic setup of an integration test featuring, e.g., a node, a miner and any additional tasks the test needs, such as submitting large sets of transactions (mempool benchmarking) or running cycles of subscriptions/unsubscriptions for many clients at once.
As an example, the mempool benchmark setup was refactored to use this framework.
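A hypothetical sketch of what such a task-based setup could look like (all names invented for illustration; the actual framework API may differ):

```rust
// A test composes small tasks (node, miner, extra workloads) and runs them.
trait Task {
    fn run(&self) -> String;
}

struct NodeTask;
struct MinerTask;
struct TxSubmitterTask {
    tx_count: usize,
}

impl Task for NodeTask {
    fn run(&self) -> String {
        "node started".to_string()
    }
}

impl Task for MinerTask {
    fn run(&self) -> String {
        "miner started".to_string()
    }
}

impl Task for TxSubmitterTask {
    fn run(&self) -> String {
        format!("submitting {} transactions", self.tx_count)
    }
}

#[derive(Default)]
struct Runner {
    tasks: Vec<Box<dyn Task>>,
}

impl Runner {
    fn with(mut self, task: Box<dyn Task>) -> Self {
        self.tasks.push(task);
        self
    }
    fn run_all(&self) -> Vec<String> {
        self.tasks.iter().map(|t| t.run()).collect()
    }
}

fn main() {
    // A mempool-style benchmark: a node, a miner and a transaction submitter.
    let runner = Runner::default()
        .with(Box::new(NodeTask))
        .with(Box::new(MinerTask))
        .with(Box::new(TxSubmitterTask { tx_count: 1_500_000 }));
    assert_eq!(runner.run_all().len(), 3);
}
```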
Memory benchmarking of UTXOs changed subscriptions
A set of benchmarks is added, dedicated to measuring memory consumption in various UTXOs changed subscription setups. In all cases, 500 clients connect to a mining node processing ~100 TPS. 495 clients are considered "wallets" and each subscribes to 800 unique addresses. The 5 remaining clients are "monitoring services" that subscribe to address sets ranging from 200K to 1M addresses in 200K increments, mainly overlapping with the wallet addresses.
Clients run subscription cycles of a predefined duration. During the first 2/3 of the cycle, the client is subscribed to its addresses; during the remaining 1/3, it is fully unsubscribed.
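The phase split reduces to simple modular arithmetic (durations in seconds; the cycle length below is illustrative, not a benchmark parameter):

```rust
// Subscribed during the first 2/3 of each cycle, unsubscribed during the
// remaining 1/3.
fn is_subscribed(elapsed_secs: u64, cycle_secs: u64) -> bool {
    (elapsed_secs % cycle_secs) < cycle_secs * 2 / 3
}

fn main() {
    let cycle = 90; // a 90 s cycle: subscribed for 60 s, unsubscribed for 30 s
    assert!(is_subscribed(0, cycle));
    assert!(is_subscribed(59, cycle));
    assert!(!is_subscribed(60, cycle));
    assert!(!is_subscribed(89, cycle));
    assert!(is_subscribed(90, cycle)); // next cycle begins
}
```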
The benchmark runs until 1.5M transactions are mined. The cycle duration appears to have a moderate impact on the overall benchmark duration. Sending and processing the 500 subscriptions all in a row slows down the node TPS a bit for ~30 seconds, probably because the network bandwidth and the gRPC server get briefly saturated.
The benchmark is structured as a client process spawning a child server process. The parent process runs all client activities: mining, submitting transactions, and the 500 gRPC clients cyclically subscribing to UtxosChanged notifications. The server process runs a node. This architecture has the desired property that the client and the server each run in isolation, in particular each managing its own memory.
Memory consumption of the server process is tracked every 5 seconds throughout the benchmark run. An average value is then computed over the data, excluding the first 60 minutes, which are considered warmup.
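The reported metric can be sketched as follows (hypothetical helper; the small numbers in the example only exercise the function and are not benchmark data):

```rust
// Average of the memory samples after dropping the warmup window.
// Samples are taken at a fixed interval, so the warmup maps to a prefix.
fn warm_average(samples_gb: &[f64], sample_interval_secs: usize, warmup_secs: usize) -> f64 {
    let skip = (warmup_secs / sample_interval_secs).min(samples_gb.len());
    let warm = &samples_gb[skip..];
    warm.iter().sum::<f64>() / warm.len() as f64
}

fn main() {
    // Three warmup samples at 2.0 GB followed by a steady 3.0 GB.
    let samples = [2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0];
    let avg = warm_average(&samples, 10, 30);
    assert!((avg - 3.0).abs() < 1e-9);
}
```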
The various benchmark setups are:

| # | Subscriptions cycle | Average memory consumption (GB) | Delta vs A (GB) |
|---|---------------------|---------------------------------|-----------------|
| A | No subscriptions at all (aka control sample) | 2.7 | – |
| B | Single cycle of infinite duration (no unsubscribing) | | |