decred / dcrd

Decred daemon in Go (golang).
https://decred.org
ISC License

Add GetMultipleCFiltersV2 P2P message #3206

Closed matheusd closed 6 months ago

matheusd commented 1 year ago

Summary

The GetMultipleCFiltersV2 P2P message would be used by SPV clients to request multiple CFilters from a full node. Instead of issuing individual GetCFilterV2 messages (one for each block), each GetMultipleCFiltersV2 would (implicitly or explicitly) list a block range, making the corresponding MultipleCFiltersV2 message return a batch of CFilters.

The main goal for this new interaction is to reduce the time needed to sync SPV clients to the main chain. A secondary goal is to reduce load on both clients and full nodes caused by the excessive amount of messages currently needed to fetch all CFilters.

This issue is being opened to track discussion of this new P2P message prior to opening a full PR.

Suggested Message Names

GetMultipleCFiltersV2 (request) and MultipleCFiltersV2 (reply), as used throughout this issue.

Rationale

Related to https://github.com/decred/dcrwallet/issues/2289.

Some informal profiling shows that the biggest bottleneck for the initial sync in SPV-mode dcrwallet is the fetching of cfilters. Due to the current limitation of one CFilter per message, the wallet needs to issue 2000 GetCFilterV2 messages per batch of headers, before proceeding to the next one.

This causes a bottleneck due to the need to encode and send each message, then wait for and decode each reply. On dcrd's side, this also causes unnecessary load from the corresponding per-message decode, lookup, and encode work.

Considering that during the initial sync the SPV client requests cfilters in ascending chain order, it would be better to send a single message that fetches a batch of CFilters in sequence, instead of fetching them one at a time.

Using a rough implementation of this message (along with the corresponding dcrwallet changes) reduces the time for initial sync from over 4 minutes to less than 2 minutes (using a single remote dcrd as source of chain data for the wallet), while also reducing object allocation count (and thus memory and cpu load) on the dcrd node.

Range Specification

The range of the batch to fetch from the node on a single GetMultipleCFiltersV2 should be specified with a starting and ending block hash (inclusive). The blocks inside this window would be returned in the reply.

It is an error to specify a starting block hash that is not an ancestor of the ending block hash.
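As a sketch of this request and its range check (the type and field names here are assumptions, not dcrd's actual wire API, and a fixed-size array stands in for chainhash.Hash):

```go
package main

import (
	"errors"
	"fmt"
)

// Hash stands in for chainhash.Hash to keep the sketch self-contained.
type Hash [32]byte

// MsgGetMultipleCFiltersV2 sketches the request: an inclusive block
// range identified by a starting and ending block hash.
type MsgGetMultipleCFiltersV2 struct {
	StartHash Hash // first block in the range (inclusive)
	EndHash   Hash // last block in the range (inclusive)
}

// validateRange rejects a request whose start block is not an ancestor
// of the end block. isAncestor is a stand-in for a chain lookup.
func validateRange(msg *MsgGetMultipleCFiltersV2, isAncestor func(a, b Hash) bool) error {
	if !isAncestor(msg.StartHash, msg.EndHash) {
		return errors.New("start hash is not an ancestor of end hash")
	}
	return nil
}

func main() {
	// Toy chain: hash {i} is an ancestor of hash {j} when i <= j.
	chain := []Hash{{1}, {2}, {3}}
	idx := func(h Hash) int {
		for i, c := range chain {
			if c == h {
				return i
			}
		}
		return -1
	}
	isAncestor := func(a, b Hash) bool { return idx(a) >= 0 && idx(a) <= idx(b) }

	msg := &MsgGetMultipleCFiltersV2{StartHash: chain[0], EndHash: chain[2]}
	fmt.Println(validateRange(msg, isAncestor) == nil) // valid range

	bad := &MsgGetMultipleCFiltersV2{StartHash: chain[2], EndHash: chain[0]}
	fmt.Println(validateRange(bad, isAncestor) != nil) // rejected
}
```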

Max reply size

The main design question for this P2P message is the max number of requested and returned cfilters in a batch.

This should be defined based on the max possible size of an individual cfilter, the hard size limit of P2P messages (32 MiB), the size of the batch of headers read on each getheaders request (2000), the overhead needed for CFilter proof metadata (1066 bytes per filter), and an affordance to avoid excessive load on nodes.

There are different ways to calculate the max possible size of individual cfilters, based on the assumptions made of worst case scenario for blocks. The following subsections detail the rationale for each method of calculating this figure.

Absolute Worst Size

With a max block size of 1310720 bytes (1.25MiB), the script size that maximizes both N (the number of entries in a GCS-based Cfilter) and the size of final filter is 3[1]. With this known script size, up to 436906 different scripts may be added to the filter.

Given both a script size and number of scripts N, and analyzing the GCS algorithm, to maximize the filter size we must ensure every script added results in a unique internal filter value and creates the largest possible quotient between consecutive values. Assuming we could do so[2], the largest quotient possible (after the modulo reduction) for 3-byte scripts would be ((N*M-1)-(N-1)) >> 19 = 654107[3].

Summing everything up, the maximum theoretical filter size is ceil((N*19 + N + 654106)/8) + 5[4], which is 1174034 (about 1.12 MiB). We can double check this amount by simulating the process for constructing the worst case filter[5].
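The arithmetic above can be checked directly, using the constants from the text:

```go
package main

import "fmt"

// worstCaseFilterSize applies the formula from the text: 19 remainder
// bits plus one unary terminator bit per entry, quotientBits one-bits
// in total, rounded up to bytes, plus 5 bytes for the varint storing N.
func worstCaseFilterSize(n, quotientBits int) int {
	bits := n*19 + n + quotientBits
	return (bits+7)/8 + 5 // ceil(bits/8) + varint
}

func main() {
	fmt.Println(worstCaseFilterSize(436906, 654106)) // prints 1174034
}
```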

While it's not actually possible to create such a large filter for a block due to consensus restrictions on block and transaction layouts, this serves as the absolute upper bound for determining the max number of filters to include in a MultipleCFiltersV2 reply. For a 32 MiB max message size, this would be 28 filters.

Non-Standard Max Filter

Given the consensus restrictions on max tx size (1 MB), we need to use 2 transactions to fill the max block size of 1.25 MiB. Discounting the size used by the required coinbase, treasurybase and votes transactions and assuming the minimum amount of everything else for the transactions, we can create up to 93467[6] outputs with a 3-byte PkScript (i.e. 2 byte OP_RETURNs, which are included in CFilters).

Plugging this amount of scripts into the prior calculation for max filter size, we get a max size of 251164 bytes (about 245KiB) or up to 133 filters per 32 MiB P2P message.

Note that this would be the max theoretical size for a block that uses non-standard transactions, and would therefore require miner cooperation to be created.

Standard OP_RETURN based filter

Standard (i.e. relayable) transactions have a limit of 4 null data outputs per regular transaction. This implies creating a large number of transactions to fill the CFilter with 3-byte scripts.

Splitting a block into transactions with 4 outputs of 3 bytes each, we can have up to 34264 scripts. Again plugging this into the prior calculation, we get a max size of 92078 bytes (~90 KiB) with up to 360 filters per P2P message.

Standard P2SH Filter

The smallest standard, non-null-data script type is the P2SH script, which uses 23 bytes. By creating two transactions that maximize the number of P2SH scripts in a block, we can create up to 38486 scripts.

This gives a maximum filter size of 103423 bytes (~101 KiB) or up to 321 filters per P2P message.

Note that for the current mainnet max block size (393216), this gives a max output count of 11504 outputs, a max cfilter size of 30918 bytes and up to 1049 filters per message.

Current Largest Filter

As of this writing, the largest on-chain CFilter is 11936 bytes and is found in block 806513. Assuming this size, it would be possible to send up to 2580 filters in a single message.

Note that this can't actually be used as a limit, as larger filters are possible. This is included only as a reference to show how far away the current filters are from the theoretical maximum.
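Each of the filters-per-message figures above follows from dividing the 32 MiB message limit by the filter size plus the 1066-byte proof overhead. A small sketch (the exact accounting in a final implementation may differ):

```go
package main

import "fmt"

const (
	maxMsgSize    = 32 * 1024 * 1024 // P2P message size limit
	proofOverhead = 1066             // CFilter proof metadata per filter
)

// filtersPerMsg returns how many filters of the given size fit in one
// MultipleCFiltersV2 message, accounting for per-filter proof overhead.
func filtersPerMsg(filterSize int) int {
	return maxMsgSize / (filterSize + proofOverhead)
}

func main() {
	fmt.Println(filtersPerMsg(1174034)) // absolute worst case: 28
	fmt.Println(filtersPerMsg(251164))  // non-standard max: 133
	fmt.Println(filtersPerMsg(92078))   // standard OP_RETURN: 360
	fmt.Println(filtersPerMsg(103423))  // standard P2SH: 321
	fmt.Println(filtersPerMsg(11936))   // current largest filter: 2580
}
```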

Choice of Max Size

The following table summarizes some choices to impose as max number of cfilters per message, along with the size of the P2P message under various scenarios:

| Number | Max Possible Size | Max Non-Std Size | Max P2SH Size | Max Current Size |
|-------:|------------------:|-----------------:|--------------:|-----------------:|
| 25     | 28 MiB            | 6 MiB            | 2.5 MiB       | 317 KiB          |
| 50     | -                 | 12 MiB           | 5 MiB         | 634 KiB          |
| 100    | -                 | 24 MiB           | 10 MiB        | 1.2 MiB          |
| 200    | -                 | -                | 20 MiB        | 2.5 MiB          |
| 400    | -                 | -                | -             | 5 MiB            |
| 1000   | -                 | -                | -             | 12.4 MiB         |
| 2000   | -                 | -                | -             | 24.8 MiB         |

Note: A - indicates a message that would overflow the maximum P2P message size.

Given the above summary, a choice of max cfilter count of either 50 or 100 seems appropriate. While these two choices make a message filled with copies of the largest theoretical CFilter unrelayable in the current P2P network, this seems of little practical relevance. The choice of a max of 100 cfilters per message means that each batch of 2000 headers requires 20 such messages to be fulfilled.

If we choose to be more aggressive and disregard the risk of a malicious miner mining multiple consecutive blocks with an intention to exceed the max message size, then a value of 200 may be used as well.

A max number of cfilters per message over 400, while not a problem on mainnet under the current usage patterns and limits, could become a problem in the future if the block size is increased, and could already be a problem on testnet.

Finally, a max cfilter count over 1000, while not exceeding any limits under current mainnet usage patterns, may become a problem if blocks start getting very full.

Rough WIP Implementation

A rough, first-pass implementation may be found here: https://github.com/decred/dcrd/compare/master...matheusd:dcrd:test-multi-cf?expand=1. Note that this implementation does not use hash-based ranges for the cfilters, but rather works against the main chain directly.

The main workhorse is the LocateCFiltersV2 function added to the blockchain package, responsible for fetching a batch of cfilters from the db and preparing the response message. A few tricks are used to reduce the total number of allocated objects.
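The batching approach described above can be sketched as follows. This is a minimal outline, not the actual LocateCFiltersV2 code; the function shape, the fetch callback, and the Hash type are assumptions:

```go
package main

import "fmt"

// Hash stands in for chainhash.Hash to keep the sketch self-contained.
type Hash [32]byte

// locateCFiltersV2 sketches the server-side batching: walk the
// requested hashes in order, cap the batch at maxCount, and
// preallocate the result slice once to keep allocation counts low.
func locateCFiltersV2(hashes []Hash, maxCount int, fetch func(Hash) ([]byte, error)) ([][]byte, error) {
	if len(hashes) > maxCount {
		hashes = hashes[:maxCount]
	}
	filters := make([][]byte, 0, len(hashes)) // single up-front allocation
	for _, h := range hashes {
		f, err := fetch(h)
		if err != nil {
			return nil, err
		}
		filters = append(filters, f)
	}
	return filters, nil
}

func main() {
	// Fake fetch that returns a tiny per-block payload.
	fetch := func(h Hash) ([]byte, error) { return h[:4], nil }
	fs, _ := locateCFiltersV2(make([]Hash, 250), 100, fetch)
	fmt.Println(len(fs)) // capped at the per-message maximum: 100
}
```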

Footnotes

[1] This can be determined by noting that a script size of 1 or 2 bytes does not provide enough entropy (only up to 2^16, or 65536, different values in the 2-byte case) to generate enough unique GCS entries after the siphash and modulo reduction stages.

[2] In practice this is a probabilistic process due to the siphash stage, and the fact that the set of all 3-byte scripts cannot possibly generate every value between (0,N*M-1).

[3] One way of visualizing this is that, given N, the way to maximize the quotient is to select N-2 scripts that result in the smallest values after the modulo reduction (thus "bunching up" all of these N-2 scripts with the smallest possible difference), then picking the script that results in the largest value after modulo reduction (which, if we could generate values at will, would be the value NM-1). The quotient between N-2 and NM-1 is the largest possible given the previously determined parameters and results in the largest amount of bits written using the Golomb/Rice coding.

[4] 19 bits per entry, plus 1 bit per entry (the zero bit in unary coding), plus the maximum amount of one bits to represent the quotient, plus 5 bytes for the varint to store N.

[5] Code to show how to build the largest possible cfilter: https://github.com/decred/dcrd/commit/c463dccb383e5ca9f2c71a648c3799dd67df382e

[6] The absolute minimum transaction size (excluding outputs) is 97 bytes. Then, each 3-byte PkScript output requires 14 bytes total (11 for the fixed size output data and 3 for the script itself). Around 1977 bytes are necessary per block for coinbase, treasurybase and vote transactions.
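The footnote's arithmetic can be checked directly; under the same assumptions (97-byte minimum tx, 11 fixed bytes per output, 1977 bytes reserved for coinbase, treasurybase and votes) it also reproduces the 38486 P2SH script count from the earlier section:

```go
package main

import "fmt"

// maxOutputs computes how many scriptLen-byte outputs fit in a block:
// blockSize minus the ~1977 bytes reserved for coinbase, treasurybase
// and vote transactions, minus 97 bytes of minimum non-output data per
// transaction, divided by the per-output cost (11 fixed bytes plus the
// script itself).
func maxOutputs(blockSize, numTxs, scriptLen int) int {
	budget := blockSize - 1977 - numTxs*97
	return budget / (11 + scriptLen)
}

func main() {
	fmt.Println(maxOutputs(1310720, 2, 3))  // 3-byte OP_RETURN scripts: 93467
	fmt.Println(maxOutputs(1310720, 2, 23)) // 23-byte P2SH scripts: 38486
}
```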

davecgh commented 1 year ago

For anyone reading later, we had a discussion about this in the development channel on matrix since github was down at the time. The summary is that there doesn't seem to be anything in here that would be a show stopper and it looks great overall.

The one thing we'll want to make sure to do is add an init assertion on the block size that explains this message MUST be changed in tandem with any block size changes. The motivation is that the size limitations are based on the current block size, and that would be something very easy to forget (or, for someone new, to discover at all) when changing block sizes in the future. An init assertion will make the code refuse to run if the block size is ever changed, to address that.
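The suggested init assertion might look like the following sketch. The identifiers and the hard-coded 1310720 are assumptions drawn from this discussion; in dcrd the active value would come from the chain parameters:

```go
package main

import "fmt"

const (
	derivedFromBlockSize = 1310720 // block size the batch limit was derived from
	maxCFiltersPerBatch  = 100     // chosen per-message cfilter limit
)

// assertBlockSizeUnchanged refuses to run if the active block size no
// longer matches the one the batch limit was derived from, forcing
// anyone changing the block size to revisit maxCFiltersPerBatch.
func assertBlockSizeUnchanged(activeBlockSize int) {
	if activeBlockSize != derivedFromBlockSize {
		panic("maxCFiltersPerBatch must be re-derived when changing the block size")
	}
}

func init() {
	assertBlockSizeUnchanged(1310720) // would use chaincfg's value in dcrd
}

func main() {
	fmt.Println(maxCFiltersPerBatch) // prints 100
}
```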