Increasing client functionality while synching full node

Disclaimer

What follows is not a proposal but rather just me wondering about what seems to be a natural idea for improving usability while waiting for a full node to synch. I don't understand the security implications of enabling these changes and I am half expecting there to be some clear problems about it, so I am posting this here is the hopes of understanding the possible drawbacks.

TL;DR: Include key image set (KIS) commitment in block headers to allow for transaction validation while full node is still synching, and optionally add also a commitment to the one-time address set (OTAS) to headers so that new nodes can get their wallet balances fast. That is, as a regular validator validates the chain, he should include or expect to see in each block header a commitment for the state at that height. In something like Bitcoin that would roughly correspond to the UTXO, and in Monero that would be the KIS and OTAS.

Brief overview of the issue

From what I understand, once a new node joins the network it will roughly ask its peers for the current height and accumulated difficulty, and following that it will receive batches of 10 thousand block header hashes, and then batches of 20 blocks at a time that match those hashes. It will verify the contents of those blocks for proper signatures, possible double spends etc. As it goes it builds the key image set (KIS) that it uses to prevent double spending of funds going forward.

This process can take up to a few days, which is a very long time in any case, but it can be especially frustrating for new users.

Redundant verification and immutability

The verification of the blocks contents is an expensive process that is repeated by every full node, and the further deep a block is buried in the blockchain, the harder it is for an attacker to change its contents.

What I am wondering is: can we rely on this immutability through PoW to start operating as a validating node after verifying the PoW, but before completely validating all the blocks contents?

The goal is for a fresh new node to know, with high probability: (i) that it is in possession of the right block-headers chain; and (ii) its own wallet balance.

Intuitively, if we know that we are looking at the longest PoW chain, then we may be content to verify that a properly signed transaction references only coins created in blocks that are buried at least N blocks deep, and that there is no double spend happening.

So, if a ring contains a coin C1 that was created by a transaction T1 that is buried at depth N1 >= N, then we are satisfied; if it also contains a coin C2 created in a transaction T2 buried at depth N2 < N, then repeat this process until verifying each coin's genealogy up to depth N. If all the coins involved can have their genealogy established in this way, and there is no double spend detected, then the transaction is deemed valid and is relayed to other peers.

Here imagine that N is big enough that there has been enough time for all the full nodes on the network to have validated the entire contents of the blockchain, its PoW etc, and resolved any crisis situation from an attack or big reorg.

How to check for double spend attempts?

In the process described above, the new node would have to somehow know what the KIS was at present height. One way to do that would be to have other nodes provide it to him. But how does he know that the reported KIS is the right one?

If he could check the KIS commitment (say its hash) somewhere authoritative, then it would just be a matter of checking that for the reported KIS he is getting from his peers. So a natural way to solve this would be to include, in each block header a field where the block producer would record the KIS commitment at that height. And of course requiring by consensus rule that the validators check that that is correct.

So if the fresh new node wants to start the "fast synch" from depth N, it would (momentarily) trust immutability of the chain and not check the coins genealogy beyond depth N (at first), and also, it would ask for the KIS at depth N. Then he would ask for all the blocks from depth N up to current height (and validate its contents) to get the new key images.

It will still work as a full node

The process described is only meant to give the new node some functionality while it is getting fully synched. That is, while it is doing its validation up to depth N, it will also be querying for the remaining blocks like it would normally, it is just that in the normal setup it would be unusable during the synching stage.

Note that it doesn't matter the order in which it validates the blockchain, although starting from most recent blocks makes more sense since that is where most of the ring members are sampled from, and presumably most users will be looking for coins recently received more often than scanning for really old coins.

Getting the balance

During synching, the new node can already start looking for coins belonging to its wallet, and being in possession of the KIS from the beginning at least guarantees to the user that whatever coins the wallet finds, its provisional balance can only increase, since the wallet can know immediately if a newly found coin has already been spent.

In fact here, if the goal is to make it possible for the user to find its balance right away, then another piece of information that could be added to the block header is the commitment to the one-time address set (OTAS), that is the list of all public keys, and transaction or output key, (P,R) associated with each coin in existence up to that block's height.

If the new node can query for the OTAS at depth N, then it can know even before start validating the chain where its coins are, and whether they are spent (from KIS). It may not be good practice from a privacy point of view to ask directly for the specific blocks containing those coins only, but at least the node knows what information it is most interested in and may get those coins blocks sooner.

Some notes:

(i) I use the term commitment, as opposed to hashing, because I am not sure that simply hashing a big piece of data at every block is necessarily the best way to do it. There may be other ways to structure this commitment that may be more efficient, given that the KIS and OTAS may just increase a tiny bit from one block to another.

(ii) If this idea happens to be sound, a similar approach could work in Bitcoin by committing the UTXO in each block instead. The question of how to commit the UTXO may be more interesting since the UTXO constantly has elements added as well as removed from it.

(iii) There is still the matter of simply validating the PoW on the purported longest chain. In my experience (~80 hps), it might take more than 6 hours just to completely validate the hashes. I am not sure what can be done about this, short of checkpointing (which I believe is already done) and using a higher hash rate PoW algorithm going forward.

monero-project / research-lab

Increasing client functionality while synching full node #54