clemahieu commented 2 years ago

Overview:

This issue describes a rework of the synchronization process inside the nano node. In contrast to previous synchronization methods which followed a top-down approach, this new mode is a bottom-up approach which has significant advantages over previous versions.

The current implementation and working draft PR is located here: https://github.com/nanocurrency/nano-node/pull/3808

The priority of design goals for this rewrite are: reliability, simplicity, reduced resource usage, speed.

Problems with the existing design:

The existing bootstrap mechanism serves blocks in height-descending order which matches the way data is stored in the ledger but is the reverse order in which the blocks need to be added.

Reversing the order of blocks in responsibility of the client and when tracing dependencies between accounts, reordering transactions consumes a significant amount of disk space, disk IO both of which contribute to a long run time. This can be observed when initially bootstrapping a node which can take days or more to complete.

The original reason for serving blocks in a top-down fashion is that the bootstrap server can trace the sequence of blocks to serve by following the previous link in each block and thus, eliminating the need to externally store forward links for each block. However, for a long time the ledger has already tracked each block’s forward link in the 'successor' in the block sideband information so there is no longer an actual storage savings achieved from this original design.

Major changes:

This new bootstrap mode requires server-side support for a new flag on the bulk_pull message named “ascending”. If this flag is set, the bootstrap server will serve blocks in ascending-height order rather than descending-height order as it has historically.

Retrieving blocks in ascending order removes the requirement of having a large client buffer available for accepting blocks in reverse order of which they were originally added.

The new code is both easier to understand and significantly less complex.

At its core, the ascending bootstrap method is a random process that requests a small number of successor blocks for a randomly chosen account from the ledger.

The number of blocks requested is small, which means TCP connections don’t need to remain open or occupied for long periods of time for long account chains. Being less reliant on a single connection for account chains improves redundancy and reduces resources consumed from a particular bootstrap server.

Design:

The set of accounts to select from is the union of the ledger’s account set and destination accounts within pending entries. Since the account selection process is random, multiple bootstrap processes can run in parallel with no coordination. Additionally, since the account selection process is random, it is more resistant to any ledger geometries that would be degenerate cases within the existing bootstrap methods.

An account number is selected at random and a small number of blocks are requested in ascending order using bulk_pull. Random selection was used instead of table iteration for a couple reasons. Random selection doesn’t need coordination between multiple parallel bootstrap processes which makes it easier to start additional instances in parallel. When iterating, we need to keep track of the position in the tables which increases complexity.

As the number of accounts grows this process in its basic form slows down, however, two existing optimizations keep it performannt.

Request backoff:

When a node responds to a bulk_pull request indicating there are no more blocks or the blocks given don't connect to the ledger, it is indeterminate if this is would also be the case with an identical request to other nodes.

At any given time we can’t rule out that an account has generated a new block that we need to request but the probability of there being additional blocks for an account decreases every time there is an empty response.

Account progress forwarding:

The account selection process starts out random, however, we can make use of information coming out of the block processor to hint us on accounts that have recently had blocks inserted. Rather than wait for these accounts to be randomly selected, we can queue and make further requests immediately.

Source-blocked accounts optimization:

When an account tries to insert a block that fails because the source block is unsatisfied, there is no reason to try this account again until the source block is inserted to the ledger. The bootstrap process will track which accounts are source-blocked and not request additional blocks for the account until the dependency has been satisfied.

Remaining work:

Node integration:

The current implementation is not fully integrated with the node. For convinience of debugging it's implemented as a type-level replacement for the existing legacy bootstrapper and doesn't have any stop conditions or duty cycle programmed in.

Two options exist for lifetime of this new bootstrapping method. One option is to follow the existing pattern of bootstrapping attempts having a discrete lifetime with a start and stop point. Another option is for this bootstrapping method to run continiously with its duty cycle adjusted according to bootstrapping needs.

Peer version detection:

The new ascending flag necessitates a new protocol version to indicate support for this flag. Currently the node does not detect whether the counterparty supports the ascending flag and might receive blocks in legacy descending order.

Multithreading:

The current implementation runs in a single thread. From the standpoint of random account selection, multithreading will be fairly easy to add. There are two optimization that require consideration when adding support for multiple threads. The recent progress account forwarding optimization should only forward these accounts to a single thread otherwise they'll be requesting duplicate data. The source-blocked account optimization should block an account for all threads, not just the thread that discovered the blocked account.

Persistent backoff weights:

Currently the backoff weights are stored in a memory map. With ~30M accounts currently in the ledger and 32 (account bytes) + 1 (backoff count) requires ~1GB of memory. In addition to being stored on disk because of its size, we don't want to loose the backoff weights between runs as it will waste bandwidth attempting accounts that had a high backoff weight in previous runs.

Testing:

Current testing has been performed between local nodes which operates in a quiet enviroment suitable for performance profiling without the noise of an internet-connected network. The bootstrap ascending flag was added to the latest dev build for the beta network so it should be possible to start testing on beta as nodes start to support that flag.

qwahzi commented 2 years ago

Thanks for the write-up, great stuff!

Is the intent for this to completely replace top-down (height-descending) bootstrapping, or to be used in parallel with (maybe as the default) bottom-up (height ascending) bootstrapping, depending on the situation?

For light nodes that don't plan to keep the entire ledger, it seems like top-down bootstrapping straight into a pruned state would drastically reduce network requests and bootstrap time, at the cost of slightly more required trust than a bottom-up full ledger bootstrap