neo-project / neo

NEO Smart Economy
MIT License
3.47k stars 1.02k forks source link

How to sign the stateroot? #1526

Open Tommo-L opened 4 years ago

Tommo-L commented 4 years ago

Q1: Should stateroot be part of core or plugin?

As we discussed at last night meeting, most people agreed that It should be part of the core to ensure the data consistency of the all cn nodes.

Q2: Which method should we use to sign state root?

This is still under discussion, there are some options:

What do you think?

igormcoelho commented 3 years ago

It seems to me that community on general already prefers the most widely adopted solution, which is putting hash on the header. We need to move on, and I propose to rethink this in the future, as soon as we have a first working solution with verifiable state, into some other (and better) approach with more flexible states.

So, I agree with moving on with hash in headers approach,regarding height H-1.

igormcoelho commented 3 years ago

So, if we are talking about verifiable states, let's please secure all important states (otherwise approach must be useless):

Honestly, it's much more important to secure Notification states than Storage states. Notifications are the real "interface" between blockchain world and real world. Storage has random prefixes, and are non-standard, no exchange would trust that. Notifications are the "real thing", so it's likely that almost no one will check storage states on practice (developers could always publish and compare ad-hoc states, as we've been doing for a while to ensure that code is correct).

And please, let's discuss and adopt neo-project/proposals#97 on wallets, for both Neo2 and Neo3, so that regular users will never have issues with funds, regardless of MPT or not. And even if exchanges don't use that, for performance issues, it doesn't matter! Exchange sends me funds without verification, and in next moment, I send the funds again TO MYSELF, including verification. So users can always ensure that their funds exist and are safe, on Neo2 and Neo3, regardless of any MPT (wallet guys, please help me out).

Tommo-L commented 3 years ago

Long discussion, at least one step forward.

We'll be doing some experiments with the impact of putting StateRoot in block header on consensus nodes, particularly in terms of performance and storage.

If it doesn't have much impact, we can put StateRoot in the Header.

Another problem is that wherever StateRoot is located, we face the problem of readjusting contract storage fee. https://github.com/neo-project/neo/pull/1441

Tommo-L commented 3 years ago

Storage Fee

At present, it is difficult to cause the storage attack, the attack cost is relatively high.

StateRoot had a small impact on TPS

Use regular nep5 transfer for testing node performance.

Two nodes running on the same computer, one for sending transaction, and another for consensus. Experimental environment:

Parameter Value
CPU 6CPU i7
Memory 8G
OS Window10
Version Neo3-neox-header

image

image

From the experimental results, we can see that StateRoot had a relatively small impact on performance.

StateRoot had a significant impact on storage

Parameter Value
CPU 4 CPU i5
Memory 16G
OS Mac
Version Neo3-neox-header

Case (1)

Each transaction writes 60KB to the storage area, the size of transaction is 200 bytes, and a block is about 100Kb, and a total of 30M of data is written to a block. Actually, each block only takes about 2Mb. The first write consumes 8972.37 GAS, and the subsequent repeat write consumes 2243.93 GAS.

block height 31195 31262 31355 31390 31501 31568 31635 31767
Full state node (Mb) 1694 1809 1949 2038 2303 2444 2601 2869
Only keep the latest state node (Mb) 161 166 197 218 187 216 222 232

image

According to the principle of MPT and the above observation data, it can be found that the storage growth of full state nodes is approximately proportional to the tx volume. The simple formula is as follows:

image

The compression ratio is about 11:1

Case (2)

Using the full data of Neo2.x, the storage test for each type of node as following:

Node type 6 300 000(block height)
Neo2.x without stateroot 19 Gb
Neo2.x only keep the latest state node 21G~30 Gb
Neo2.x full state node 107 Gb

Summary:

  1. StateRoot had a large impact on full-state nodes, and the growth was approximately proportional to the tx volume
  2. Only keep the latest state node, which will be less affected
  3. StateRoot's performance impact is not significant(under the known data volume of neo2.x)
shargon commented 3 years ago

I think that we should only keep the latest state node, what do you think?

Tommo-L commented 3 years ago

For cn and normal nodes can do that. But we still need some full state nodes for generating the proof path.

Tommo-L commented 3 years ago

Can we start putting the state root in the header? Or any other idea?

shargon commented 3 years ago

Header it's better for me. I can start with it if you want @Tommo-L

roman-khimov commented 3 years ago

Can we start putting the state root in the header?

We sure can. Neo3-neox-header can be a starter branch for this, although from what I see #1996 probably needs to go in first to simplify reviewing neox-header changes.

Tommo-L commented 3 years ago

@erikzhang dalao, can we start putting state root in header now? What do you think?

erikzhang commented 3 years ago

I still think that state root should not be put in the header. Because this prevents us from repairing the state without regenerating the blocks. In addition, if we use consensus nodes with different implementations, block generation may stop frequently in the future.

Tommo-L commented 3 years ago

if we use consensus nodes with different implementations, block generation may stop frequently in the future.

That's a real problem, and we're facing it now. Maybe when it's stable, we can rethink about whether to put it back in the header.

Anyway, we should do it as soon as possible.

roman-khimov commented 3 years ago

I think we're one step back now as both points had been discussed previously.

igormcoelho commented 3 years ago

@Tommo-L It would be nice to have some observing list of "strange states" out there, including poor implementation and bad versioning. Specially for tokens, some random Solid-state calls stored on chain may prevent these nodes from syncing.

Tommo-L commented 3 years ago

Some other options we can consider in the future:

roman-khimov commented 3 years ago

Some other options we can consider in the future:

These are different approaches to how to calculate state, and I'm open to any better suggestions wrt this topic. But this is a separate topic from where to store state data and it's the most important one. I think the advantages of header-included state root are well covered here, so let's concentrate on possible negative effects. Performance and size concerns are now ruled out by https://github.com/neo-project/neo/issues/1526#issuecomment-723948719 (and https://github.com/neo-project/neo/issues/1526#issuecomment-712997942), so we have just two problems left:

Repairing the state First of all, this claim is based on assumption of possibility to replace state root chain (if it is separate from the block chain) and thereby "repair" it. But in fact doing so creates more problems than it solves:

It essentially is a fork, although in a state chain and forks are better be avoided (at least in case of Neo). And it also is changing the game rules while playing which is not a good practice. Instead, if we're in the situation where we need to repair the state we need to... repair it actually. Not replace it, not rerun old transactions under new rules, not fork the state (or block) chain, but repair the state we've got. And this can be done following the scheme from https://github.com/neo-project/neo/issues/1526#issuecomment-643286102 with explicit committee-signed corrective transactions that will fix the state. It of course also needs execution environment versioning, but that's not a big problem either.

Potential consensus failures It is true that consensus can fail if CNs are not to agree on the current state. But sorry, it is a feature.

First of all, can state difference cause consensus to fail now? Yes it can. There is a potential for this even now, without any state roots. If CNs are to disagree on GAS contract state they can fail primary node proposal validation and depending on mempools this situation can happen again and again after view changes. But good luck debugging that without clear state data.

Second, how likely in practice are we going to see complete consensus failure (network stop)? Not really likely. For it to happen we need at least f + 1 nodes having different state from the majority of CNs. If we're talking about single implementation running on all nodes that means f + 1 nodes failing in a strange way (being alive, but with incorrect state) simultaneously, not really likely. If we're talking about different implementations that means that we need at least f + 1 non-C# nodes on the network and I just don't see it at the moment. If that ever to be the case I think we'll have mature enough implementations by that time to make it less likely. What's more likely is for one node (irrespective of its implementation) to go wild which is quickly detected and fixed. That's what we actually want for network's reliability, quickly detect misbehaving CNs and fix them.

Third, even if we're to imagine CN state split the question is --- is it better to continue running the network with unknown state or fix the problem and continue running properly synchronized network? This is exactly the case where stopping the network is appropriate, because doing so can prevent bigger problems (like the ones requiring state repair). Remember #1989, it's not a critical bug (luckily), but it is a perfect demonstration of the problem that can be detected and prevented by (different implementations of) CNs exchanging their state during consensus process. Continuing running the network unsychronized can potentially lead even to accepting wrong blocks. And that is a catastrophic scenario for the network, especially the one declaring one block finality. This must be prevented even if it costs network outage.

So I'm absolutely sure that exchanging state data during consensus process (and including it in the header) actually improves the network reliability. We're likely to see some CNs go wild occasionally, but this will be a signal, either CN needs to be fixed, or there is an implementation problem, or there is some protocol problem that resulted in this, but without this signal we're just blind to these problems. We're not likely to see f + 1 CNs having different state, but if we're to see that --- there should be some reason and we better deal with that reason, again, not crossing fingers and hoping that this is fine.

On the importance of this issue for Neo 3 IMO this issue is absolutely critical to solve before Neo 3 launch exactly because it is a cornerstone, it's easy to have it since block 0 and it's not that simple to add it at any other block after the genesis. It directly affects header format and rolling out this change would require changing any client/node implementation with the need to keep backwards compatibility at the same time (we're not starting a new chain because of this). It can be done, but it's better be avoided, that's why I think leaving this question open for post-3.0 is not an option. There are questions that can be left post-3.0 easily, but not this one.

roman-khimov commented 4 months ago

Regarding this and #2905/#2974. The way to do this is via block version one with an additional field that works the same way NeoGo StateRootInHeader option does. This means calculating and storing MPT root for the block. But if we want 3.7 to be the last resynchronized version then we have a problem since MPT is only calculated by the StateService plugin which 99.9% of people don't use. Even if we're to integrate it into the core in 3.8 this would still mean resynchronization at least to get the state. So either we have 3.8 with resync or we should integrate MPT into 3.7 to be calculated already (even though it's not a part of the block yet).

shargon commented 4 months ago

Regarding this and #2905/#2974. The way to do this is via block version one with an additional field that works the same way NeoGo StateRootInHeader option does. This means calculating and storing MPT root for the block. But if we want 3.7 to be the last resynchronized version then we have a problem since MPT is only calculated by the StateService plugin which 99.9% of people don't use. Even if we're to integrate it into the core in 3.8 this would still mean resynchronization at least to get the state. So either we have 3.8 with resync or we should integrate MPT into 3.7 to be calculated already (even though it's not a part of the block yet).

We will need to resync in 3.8 or we will delay 3.7 forever...

roman-khimov commented 1 week ago

An non-resynchronizing update can be performed with a trick: