Validators Cluster with High Availability

jolestar commented 1 year ago

Is the goal "High Availability" or "Decentralization"? @templexxx

templexxx commented 1 year ago

Is the goal "High Availability" or "Decentralization"? @templexxx

@jolestar I'm trying to satisfy both.

High Availability:

A coordinator cluster, protected by Paxos, builds the transaction pool and manages the workers (validators).
Each validator is also a sequencer: it gets transactions from the coordinator and runs in parallel with the other validators.

Decentralization:

security: I think it is enough to provide a design that is as safe as a public blockchain. As we all know, a blockchain is not 100% secure (e.g. the transaction pool may drop transactions, and a reckless attacker could destroy the system regardless of the financial losses).
permissionless: a node pledges to become a validator, and makes a second-round pledge to become a coordinator. Only validators that have been approved by the DAO can become coordinators through the second-round pledge, so the community can monitor the coordinators. Except for the nodes of the Rooch DAO (kept to ensure availability), other nodes join through a rotation system.
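
To make the two-round pledge concrete, here is a minimal sketch in Python. It is only an illustration of the eligibility rule described above: the `Node` fields and the two threshold parameters are hypothetical names, and the actual pledge amounts and DAO approval process are not specified in this thread.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical record of what a node has pledged so far."""
    name: str
    first_pledge: int = 0       # first-round pledge (to become a validator)
    dao_approved: bool = False  # set once the DAO has approved this validator
    second_pledge: int = 0      # second-round pledge (to become a coordinator)


def is_validator(node: Node, min_first_pledge: int) -> bool:
    # Round one: anyone who pledges enough can be a validator (permissionless).
    return node.first_pledge >= min_first_pledge


def can_be_coordinator(node: Node, min_first_pledge: int, min_second_pledge: int) -> bool:
    # Round two: only DAO-approved validators may pledge again to become coordinators.
    return (
        is_validator(node, min_first_pledge)
        and node.dao_approved
        and node.second_pledge >= min_second_pledge
    )
```

The point of the split is that the first round stays open to everyone, while the second round is gated by DAO approval so the community keeps oversight of the coordinators.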

This is just a brief summary. Many details are not shown here.

Issues that still need to be closed:

Model of the Out-of-Order Dispatcher: it will be something like Block-STM, but working across a cluster. I need to test it to make sure it delivers the expected throughput.
How to use DA: It seems that most DA solutions cannot satisfy our performance and technology needs. I want to use a smart contract that provides two-sided authentication, so that neither the coordinator nor a sequencer can cheat or act maliciously. It can check whether the coordinator's dispatching meets the expected weight settings, and it can prevent unauthorized block packing by one sequencer that would cause block-numbering exceptions for the other sequencers. Furthermore, I would not wait for DA confirmation: several blocks will be in progress at the same time. We can check the final state and roll up without stalling. DA is just one stage in the pipeline. The coordinator maintains the state of the transactions being packed, which means it can provide timely protection.
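
The two checks mentioned above (dispatch versus weight settings, and unauthorized block packing) could look roughly like the following Python sketch. It is not the contract logic itself: the function names, the `tolerance` parameter, and the data shapes are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple


def dispatch_matches_weights(
    dispatched: Dict[str, int],
    weights: Dict[str, float],
    tolerance: float = 0.05,
) -> bool:
    """Check that each validator's share of dispatched transactions is within
    `tolerance` of its share of the configured weights (hypothetical rule)."""
    total_tx = sum(dispatched.values())
    total_weight = sum(weights.values())
    if total_tx == 0 or total_weight <= 0:
        # Nothing dispatched is trivially fine; dispatching with no weights is not.
        return total_tx == 0
    for validator, weight in weights.items():
        expected_share = weight / total_weight
        actual_share = dispatched.get(validator, 0) / total_tx
        if abs(actual_share - expected_share) > tolerance:
            return False
    # Transactions sent to a validator with no configured weight are a violation.
    return all(v in weights for v, n in dispatched.items() if n > 0)


def unauthorized_blocks(
    assignments: Dict[int, str],
    packed: List[Tuple[int, str]],
) -> List[Tuple[int, str]]:
    """Return (block_number, sequencer) pairs packed by a sequencer that was
    not assigned that block number."""
    return [(number, seq) for number, seq in packed if assignments.get(number) != seq]
```

For example, with equal weights for two validators, a 90/10 split of transactions fails the first check, and a sequencer that packs a block number assigned to someone else shows up in the second.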

p.s.

For sequencer switching controlled by a smart contract on Layer1:

Poor availability: single-node availability is about 99.9%, which means roughly 500 min/year of downtime.
No smooth sequencer switch is possible: we cannot implement a trustless and permissionless heartbeat mechanism, or the system is so slow (as slow as Layer1) that it does nothing to help monitor validators' health.
Transactions can be dropped without penalty: just as could happen with a so-called centralized coordinator.
Lower incentives: higher availability and performance attract more users, and with higher TVL validators are more motivated to do better. In a horizontally scaled design, every validator has a chance to execute transactions at the same time, and earns more if it has better hardware/software than the others. This is a positive feedback loop.
Conclusion: lower performance, equal security (or even less, because the incentive is not as strong as in the cluster version)
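
The availability figure above can be checked with a little arithmetic. The Python sketch below computes yearly downtime from an availability value, and also the availability of a majority-quorum cluster (as a Paxos group needs). The assumption that nodes fail independently is mine, not something established in this thread, and real clusters will do worse than this idealized number.

```python
from math import comb

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def quorum_availability(node_availability: float, nodes: int) -> float:
    """Probability that a majority of `nodes` is up at a given moment,
    assuming each node fails independently (an idealization)."""
    quorum = nodes // 2 + 1
    p = node_availability
    return sum(
        comb(nodes, k) * p**k * (1.0 - p) ** (nodes - k)
        for k in range(quorum, nodes + 1)
    )


if __name__ == "__main__":
    print(downtime_minutes_per_year(0.999))
    print(downtime_minutes_per_year(quorum_availability(0.999, 3)))
```

A single node at 99.9% comes out at about 525.6 min/year, consistent with the ~500 min/year quoted above. Under the independence assumption, a three-node majority quorum of such nodes drops to roughly 1.6 min/year.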

lshoo commented 1 year ago

I think decentralized systems are generally highly available, and the security of a decentralized system corresponds to consistency in the CAP theorem.

templexxx commented 1 year ago

I think decentralized systems are generally highly available, and the security of a decentralized system corresponds to consistency in the CAP theorem.

@lshoo It's a CP system. It takes effort to achieve higher availability. And to be permissionless and trustless, we need a BFT mechanism, whether it is derived a priori or from experience.

lshoo commented 1 year ago

It's a CP system. It takes effort to achieve higher availability. And to be permissionless and trustless, we need a BFT mechanism, whether it is derived a priori or from experience.

Yes. Both centralized and decentralized systems rely on consensus algorithms, whether Raft or BFT, to agree on a write (minting a block) or to confirm that a write is valid. Decentralized systems additionally need anti-cheating measures or some form of proof to prevent malicious behavior.

lshoo commented 1 year ago

Blockchain is a decentralized version of the CQRS architecture: the decentralized execution layer is equivalent to the service cluster in CQRS that executes transactions (write commands). From the perspective of a modular blockchain, does data availability belong to the data layer?

jolestar commented 1 year ago

We could design a protocol for sequencer switching in which the two epochs overlap in time, so the sequencers can do some confirmation and information exchange, making the switch smoother.

But this protocol requires the two sequencers to cooperate.

templexxx commented 1 year ago

We could design a protocol for sequencer switching in which the two epochs overlap in time, so the sequencers can do some confirmation and information exchange, making the switch smoother.

But this protocol requires the two sequencers to cooperate.

Two nodes cannot achieve BFT. The ex-sequencer could cheat the next one, for example by claiming it has not executed a transaction that it has actually already executed. No one can prove that the ex-sequencer is malicious, because we cannot trust anyone's evidence. We cannot assume that no one would do such damage maliciously, especially since the cost of doing it is zero. So if we go this way, the idea of mutual cooperation must be abandoned. The only thing we can do is switch directly through the smart contract.

Besides that, I've posted my thoughts on this design:
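
For reference, the claim about two nodes follows from the classical BFT bound that tolerating f Byzantine nodes needs at least 3f + 1 nodes in total. A small Python sketch of that arithmetic (the function names are just illustrative):

```python
def max_byzantine_faults(nodes: int) -> int:
    """Largest f such that nodes >= 3f + 1 (classical BFT bound)."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return (nodes - 1) // 3


def min_nodes_for_faults(faults: int) -> int:
    """Smallest cluster size that tolerates `faults` Byzantine nodes."""
    if faults < 0:
        raise ValueError("faults must be non-negative")
    return 3 * faults + 1


if __name__ == "__main__":
    for n in (2, 3, 4, 7):
        print(n, max_byzantine_faults(n))
```

With two nodes the tolerated number of Byzantine faults is zero, so an outgoing and an incoming sequencer alone cannot protect each other from a dishonest peer; at least four nodes are needed to tolerate one.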

For sequencer switching controlled by a smart contract on Layer1:

Poor availability: single-node availability is about 99.9%, which means roughly 500 min/year of downtime.
No smooth sequencer switch is possible: we cannot implement a trustless and permissionless heartbeat mechanism, or the system is so slow (as slow as Layer1) that it does nothing to help monitor validators' health.
Transactions can be dropped without penalty: just as could happen with a so-called centralized coordinator.
Lower incentives: higher availability and performance attract more users, and with higher TVL validators are more motivated to do better. In a horizontally scaled design, every validator has a chance to execute transactions at the same time, and earns more if it has better hardware/software than the others. This is a positive feedback loop.

Conclusion: lower performance, equal security (or even less, because the incentive is not as strong as in the cluster version)

Or we could run a BFT system among the sequencers. But that would be another Layer1.

templexxx commented 1 year ago

These original and immature discussions above will go into the archives.

The questions inside it will be split into different issues.

rooch-network / rooch-network.github.io

Validators Cluster with High Availability #42