
Using NeoFS to store blocks and snapshots #3463

Open roman-khimov opened 3 months ago

roman-khimov commented 3 months ago

Background

Currently the C# node can synchronize only via P2P, going through the whole chain from genesis to the current block, and it stores all of the state locally. NeoGo can be a bit fancier: it can strip old transactions and MPTs if configured appropriately (still limited by #2026, but anyway), see RemoveUntraceableBlocks and GarbageCollectionPeriod in https://github.com/nspcc-dev/neo-go/blob/8ea0bc6e58dc7809d4784c6975839ab57b98c26e/docs/node-configuration.md. It can also synchronize state via P2P (as explained in #2373, see the P2PStateExchangeExtensions option), a feature that heavily depends on #1526, which is also a NeoGo extension (the StateRootInHeader option).

Now we know that #1526 will eventually be solved by adding state roots to the new block header version. This will make it possible to implement #2373, and other state-stripping options can follow, making the node capable of dropping a lot of data. Imagine it's all done and we have a much better node.

Still, new nodes will synchronize via P2P, and this means that they'll need all headers to be available via P2P, which in turn means other nodes can't drop them. Each header is ~700 bytes, and 15s blocks give ~2.1M headers per year, so that's about 1.5 GB of data yearly even at the current pace. Reducing the block time multiplies these numbers easily, and blocks never stop being added.

Then these headers are processed and we move on to MPT fetching from #2373. It works well, but it's somewhat inefficient in terms of exchanged data: the MPT is bigger than the state itself, and transferring it takes a lot of messages, so it costs time and bandwidth.

After that, blocks can be fetched and processed in the regular way. Some number of them will be required to catch up with other nodes. This adds some load to P2P, but it's tolerable; at least that's what happens now anyway.

Remember also that once #1526 is solved we'll have a problem with the StateValidator role: it will no longer be useful, since state signed by consensus nodes is more reliable than state signed by some other set of nodes. This makes the role somewhat deprecated.

Blocks stored in NeoFS

It's rather obvious that blocks can be put into NeoFS just like any other data, and the concept has been mentioned numerous times by different people in many discussions. What I want to change with this proposal is to make it practical and to show what we can achieve by going this route.

We've done some experiments in https://github.com/nspcc-dev/neo-go/issues/3496 and I can already say that the scheme of node synchronization with blocks fetched from NeoFS is absolutely viable. There will be some minor tweaking of the exact storage scheme, but the essence of it is that we store two objects per block: the block itself and its header. Given proper metadata we can avoid using any additional indexes (thanks to https://github.com/nspcc-dev/neofs-api/pull/285, implemented some time ago), but if index objects turn out to be more efficient we can still add them. The basic fetch loop for a node with current height N looks like this:
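Here is a minimal Go sketch of such a loop; searchBlockOID, getObject and addBlock are hypothetical stand-ins for a NeoFS SEARCH by a block-index attribute, a NeoFS GET and the node's regular block-processing entry point (the real SDK calls and attribute names may differ):

```go
package neofssync

// syncFromNeoFS pulls blocks one by one starting at height+1 until the
// container has nothing newer. All three helpers are hypothetical
// placeholders for the actual NeoFS SDK and node APIs.
func syncFromNeoFS(height uint32,
	searchBlockOID func(index uint32) (oid string, ok bool), // NeoFS SEARCH by BlockIndex-like attribute
	getObject func(oid string) ([]byte, error), // NeoFS GET
	addBlock func(raw []byte) error, // decode, verify, persist as usual
) (uint32, error) {
	for {
		oid, ok := searchBlockOID(height + 1) // metadata-based lookup, no extra index objects needed
		if !ok {
			return height, nil // nothing newer in the container, we're in sync
		}
		raw, err := getObject(oid)
		if err != nil {
			return height, err
		}
		if err := addBlock(raw); err != nil {
			return height, err
		}
		height++
	}
}
```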

A separate header object allows the header chain to be processed in case we'd like to synchronize state via other means (like #2373). But while fetching is rather easy, there is a big question of how objects end up in this container and why we should trust it as a source of blocks.

On the one hand, blocks are self-descriptive and can be checked easily (yeah, blockchain), and fetching them this way can't be worse than P2P: P2P means talking to random nodes to get blocks, and those nodes can lie about header hashes and block contents. Still, this mostly works because any lie can be discovered quickly. At the same time, we can't allow any random node to PUT blocks into a container that others will use as a trusted source for synchronization.

StateValidator role

If we have some task to be performed for the network regularly, we can have a role for it with a set of known keys behind that role. We could add a new role, but given that the StateValidator role won't be needed for its original purpose soon, and that the thing we're going to do is semantically related to state synchronization (see below), this role can just be reused.

So we can have a NeoFS container with a policy storing N (to be discussed) replicas on StateValidator nodes (this can be guaranteed thanks to https://github.com/nspcc-dev/neofs-node/blob/d668214a3653b17d528fd72899bbd69708712104/docs/verified-node-domains.md) and M (also to be discussed) elsewhere. This way we know that SV nodes will always have our data and it won't disappear, but other nodes can participate as well (notice that storage is paid for, so both are incentivized to do it).
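For illustration only, such a placement policy might look like the sketch below, with N=3 replicas on SV nodes and M=2 anywhere else; the VerifiedDomain filter attribute is a placeholder for whatever verified-node-domains actually provides, so check the exact syntax against NeoFS documentation:

```go
package policy

// blockContainerPolicy is a hedged sketch of the container placement
// policy: 3 replicas pinned to SV nodes, 2 stored on any other nodes.
// The filter attribute name is hypothetical; real node eligibility
// would be anchored via verified node domains.
const blockContainerPolicy = `
REP 3 IN SV
REP 2 IN OTHERS
CBF 1
SELECT 3 FROM SVNodes AS SV
SELECT 2 FROM * AS OTHERS
FILTER VerifiedDomain EQ "sv.neofs" AS SVNodes
`
```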

Then there is the question of object PUTs. Ideally we'd like PUTs to be restricted to a BFT multisignature of SV nodes, so that a single node couldn't PUT an object into this container. Unfortunately, NeoFS doesn't allow for that right now, accounts are single-key only. This will eventually be fixed, but that's somewhere in 2025. What we can have now is allowing any single SV node to PUT. This won't cause any duplicates, since PUTting the same object with the same meta leads to the same OID, but it leaves two forms of mischief available to SVs: pushing garbage objects that are not real blocks into the container, and pushing invalid blocks.

The first one won't affect synchronization, but it will affect payments: SVs could try to earn more than they should by pushing garbage into this container. The risk is rather low and can be controlled by monitoring.

The second one can affect synchronization, which is somewhat worse, but still not the end of the world: nodes will refuse invalid blocks, and if proper ones are available they will eventually get them. This can also be easily monitored, and a misbehaving SV node can be detected and punished. So to me it's acceptable not to have a multisig PUT for an initial implementation of the scheme. An upgrade is always possible.

This already solves some problems: for example, we no longer have to store block headers on most nodes, since they can always be fetched from NeoFS. And this scheme can be implemented right away. But it can also be extended.

Snapshots

Remember #1286; in many ways it's an alternative to #2373. #2373 is focused on P2P synchronization and MPT traversal for a known state root (hi, #1526); we know it works, but it's not very fast or efficient. But if we're to have a container controlled by SVs and storing blocks, the same container can be reused to store state dumps made by SVs. SVs are not CNs: they have some spare cycles, and they're motivated to create these dumps (more data to store, more fees collected for storage).

We've done some experiments around snapshotting in https://github.com/nspcc-dev/neo-go/issues/3519 and we know it takes minutes to create them, so something like hourly or daily snapshots is absolutely feasible. They won't require any additional signatures, since calculating the MPT over them yields the same state root hash that we're to have in headers after #1526. We can require exchanging checksums between SVs before PUTting such an object to make sure they all agree on the contents, but that's optional.
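On the consumer side, checking a downloaded snapshot could then be as simple as the sketch below: rebuild the MPT from the dump and compare its root with the one committed in the header. computeMPTRoot is a hypothetical helper, and the header state root is assumed to be available per #1526:

```go
package snapshot

import (
	"errors"

	"github.com/nspcc-dev/neo-go/pkg/util"
)

// verifySnapshot recomputes the MPT root over a downloaded key-value
// state dump and compares it with the state root taken from a
// (post-#1526) header at the snapshot height. computeMPTRoot is a
// hypothetical stand-in for rebuilding the trie from the dump.
func verifySnapshot(dump [][2][]byte,
	computeMPTRoot func(kv [][2][]byte) util.Uint256,
	headerStateRoot util.Uint256,
) error {
	if computeMPTRoot(dump) != headerStateRoot {
		return errors.New("snapshot does not match header state root")
	}
	return nil // no extra signatures needed, the header vouches for it
}
```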

With that in place we'd get the ability to synchronize nodes much faster (even compared to #2373): fetch and process the header chain, grab the latest snapshot from the container, check it against the state root in the corresponding header, and then fetch and process only the blocks created after the snapshot.

It should be noted that this scheme (or any other state synchronization scheme, like #2373) can be greatly improved with lower MaxTraceableBlocks settings. Not storing/processing 2M blocks/headers (like we require now for mainnet/testnet) would be hugely beneficial.

The only time-consuming thing left for nodes is header processing. In some ways that's inevitable: we're doing blockchain here, and if we want to trust it we need to process all of it. Still, in some cases there is potential for improvement, and there are even cases where implementing it would be required.

Trusted checkpoints

All of N3 chain trust builds upon a BFT number of signatures on headers, with signers referenced by the NextConsensus field. We know the standby validators from the config and trust them by definition; then we check that the next block is signed by the NextConsensus of the current one, which specifies a new NextConsensus, and so on. Verifying a 5-out-of-7 ECDSA multisig takes some time, so we can't go much faster than ~2000 headers per second, which is almost an hour and a half for 10M blocks. That's just checking headers, not doing anything else. It's kind of OK for our current networks that grow by 2M blocks yearly, but it scales badly with faster blocks, and it all rests on the standby keys.

Just think about that last point: the whole network is built upon trust in the standby keys. If they're ever compromised, it's trivial and pretty fast to create an alternative chain of any length with any transactions. Not likely to happen, and not likely to go undetected (everyone knows the canonical chain, right?), but still a possibility, which raises the question of how to choose the proper chain. I doubt there is any solution to that other than declaring some network to be the canonical one, which means saying "block XYZ at height N is OK". Likely that'd be a part of the configuration, a trusted checkpoint of some kind, and the same mechanism allows skipping header verification for all previous blocks.
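As an illustration, such a checkpoint could be a small configuration structure like the hypothetical sketch below (these names don't exist in either node today):

```go
package checkpoint

import "github.com/nspcc-dev/neo-go/pkg/util"

// TrustedCheckpoint pins one known-good header as part of the node
// configuration. Headers at or below it only need to hash-chain into
// the pinned header instead of going through multisig verification.
// Purely illustrative, not an existing NeoGo or C# node option.
type TrustedCheckpoint struct {
	Index uint32       // header height declared canonical
	Hash  util.Uint256 // hash of the header at that height
}

// SkipWitnessCheck reports whether full witness (multisig) verification
// can be skipped for a header at the given height.
func (c TrustedCheckpoint) SkipWitnessCheck(height uint32) bool {
	return height <= c.Index
}
```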

This is not strictly NeoFS-related, and I think this part will be scrutinized the most, but this mechanism (if it's considered acceptable by users) can make synchronization even faster when combined with all of the previous parts.

The end goal is to be able to join the network quickly and to keep a minimal amount of data stored locally. Long-term this is important: chains can only grow, but the tail is only interesting for audit or archival purposes. Most nodes should operate with recent state only, and luckily NeoFS can help here by storing the tail for us in a reliable, distributed manner.


igormcoelho commented 3 months ago

I think it's a reasonable and nice alternative to P2P syncing. Perhaps larger chunks of blocks will sync faster, but it's just a guess... the ability to fetch individual blocks from an alternative storage network can be quite useful.

roman-khimov commented 3 months ago

I have a few more wild ideas regarding what can be stored there, but I'm holding them back for now; the things laid out here are the most important, and they're needed for NeoFS itself (remember, there is a bit of NeoFS in Neo and there is a bit of Neo in NeoFS).